5. Choosing a Search Service

(This was originally a chapter in the book Efficient information searching on the web.)

Searching on the Web?

Why information searching on the Web is popular:

  • You do it yourself. You don’t need to be entirely sure of what you’re looking for, in contrast to when you have to ask someone who you don’t know very well (e.g. a librarian).
  • Easier to get going – you enter something and you almost always get a result.
  • The sensation of uneasiness that characterizes the first phases of the information seeking process is rapidly reduced, since you have, in any case, found something.
  • You get a feeling of knowing what you’re doing. Your self-confidence increases as you have found certain things and thus don’t need any help.
  • You get the impression that you’re dealing with a single source, since you can find practically anything with the help of a search engine. Most of the time you don’t need to understand how different databases function.
  • It’s fast and easily accessible (round the clock, all year round), including from mobile phones.

But there are also many emotional disadvantages: insecurity and hesitation, confusion, frustration and time pressure. Searching on the Net often takes more time than you had imagined, and there are many dead ends. Wilfing is a new term for when you have gone astray and start thinking “what was I looking for?”.

Another problem is “information overload”: you become flooded with information and can’t handle it all. When a search results in a million hits, you may give up and look for an encyclopaedia instead. And how do you know when you’re done with the search? When you have found enough, or when you have found everything?

Much information is still not available on the Net and must be sought in physical books and journals. Is the Net a possible information source for you?

Many different ways of searching lead to information on the Web, but some are more efficient than others. In the same way, different search paths lead to different kinds of information, and different search methods lead to better or worse hits. Really inefficient searches produce a lot of noise, i.e. lots of irrelevant information, or, in the worst case, no productive result at all.

Directory or search engine?

Directories and search engines are nearly each other’s opposites. The seven points below illustrate important aspects of the search services.

Aspect                | Directory                          | Search engine
Data collection       | manual                             | automatic
Contents              | qualitative                        | quantitative
Search method         | in a hierarchic subject structure  | using search words
Result                | all links in the category          | estimated relevance
Size                  | naturally small                    | unlimited
Size today (largest)  | around 5 million (Open Directory)  | more than 20 billion
Comparison with book  | table of contents                  | register (index)

The data collection in a search engine takes place automatically by means of spiders, but in the directory an editor places the link. At the same time, the editor makes a quality selection, something which doesn’t exist in the search engine.

The links in the directory are placed in a subject structure and found together with the other links in a given category. In the search engine you can’t navigate a subject structure but have to look for relevant pages using search words which the search engine then ranks.
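The difference between the two search methods can be sketched in a few lines of Python. This is a toy illustration only, not how any real service is built: the category names, URLs and page texts are invented, and the ranking (a plain count of matching search words) is an assumed stand-in for the far more elaborate relevance estimates that real search engines use.

```python
from collections import defaultdict

# Directory: an editor has placed each link in a subject hierarchy (hypothetical data).
directory = {
    "Science": {
        "Space": ["http://example.org/esa", "http://example.org/astronauts"],
    },
    "History": {
        "Second World War": ["http://example.org/ww2-overview"],
    },
}

def browse(tree, path):
    """Follow a category path and return every link stored in that category."""
    node = tree
    for category in path:
        node = node[category]
    return node

# Search engine: pages are indexed automatically, word by word (hypothetical data).
pages = {
    "http://example.org/esa": "esa astronaut christer fuglesang space",
    "http://example.org/ww2-overview": "second world war overview",
    "http://example.org/transit": "transit traffic through sweden during the war",
}

def build_index(pages):
    """Map each word to the set of pages that contain it (an inverted index)."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.split():
            index[word].add(url)
    return index

def search(index, query):
    """Rank pages by how many of the search words they contain."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url in index.get(word, ()):
            scores[url] += 1
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))

index = build_index(pages)
print(browse(directory, ["Science", "Space"]))   # all links in the category
print(search(index, "transit traffic war"))      # pages ranked by matching words
```

Browsing returns everything the editor placed in the category, whether or not you knew the right words; searching returns only the pages that contain your words, in ranked order.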

The directory doesn’t have a maximum size, but it is entirely dependent upon the editors having the time to add links while also regularly checking the links already placed. In the same manner, the search engine doesn’t have any upper limit for its size, but there are limitations here as well. Managing a large index requires more computer power, computers cost money, and in the end it’s all about finances. The seeker should also get the hit list quickly once the search is done. An index that is too large can make the wait too long, and the seeker will then switch to another search service.

Should you use a search engine or a directory service? The answer is often both, but below you will find some guiding principles.

When should I use a search engine?

  • When you have a small or obscure subject or concept that you need to search for.
  • When you’re looking for up-to-date information in a subject that you’re familiar with.
  • When you’re looking for a specific Web site.
  • When you want to search the entire text of millions of Web pages.
  • When you want to get a large number of documents in your subject.
  • When you’re searching for a special type of document or a certain file type.
  • When you want to search a certain Web site.
  • When you want to use a newer search technique such as clustering of results or link analysis.

When should I use a directory?

  • When you have a broad subject or concept that you’re searching for.
  • When you want a list of Web sites that are directly relevant to your subject. Sometimes the Web sites are also described or commented on.
  • When you’re at the beginning of an investigation of a subject. Use a directory to find search words and phrases that can be used in other search services.
  • When you don’t know exactly what you’re looking for but would recognize it when you see it.
  • When you want to look around in a vetted environment.
  • When you’d rather get a few Web sites than lots of separate Web pages.
  • When you want to avoid documents with little content, which often show up in search engines.
  • When you’re searching for titles, links, comments or keywords to relevant material instead of the full-text material.

Below are some examples of subjects and the suitable type of search service in which to start the search. The subjects are explained after the table.

Directory            | Search engine
Swedish literature   | Astrid Lindgren
Second World War     | the transit traffic through Sweden
Space journeys       | Christer Fuglesang

Astrid Lindgren is the author of the books about Pippi Longstocking, Karlsson on the Roof, etc. The transit traffic through Sweden: during the Second World War, German soldiers were transported on Swedish railways to Norway, which was occupied by Germany, in spite of Swedish neutrality. Christer Fuglesang was the first Swedish person in space. He is a researcher and an ESA astronaut.

In short: General subjects in a directory and specific subjects in a search engine.

Assessing search services

Search services are extremely changeable. The technology is improved or changed, functions are added or disappear, and services are bought by competitors and turned into shells for other search services (as Altavista has become for Yahoo!). Practically all search services suffer from poor documentation. Many features are not mentioned in the help texts, and the texts are very general, particularly when it comes to ranking and sponsored links.

What should you look for when you assess a search service?

Test the search possibilities and the capacity of the service that you are about to use. Read the help or tips page and skim the FAQ (if there is one).

Search engines

  • How many Web pages does the index contain?
  • Are the bought/sponsored hits placed in the hit list or outside of it?
  • Does the index contain many links that don’t work (dead links)?
  • What does the hit list look like?
  • Are the hits relevant?
  • How are the results presented?
  • Are there sponsored links? How are they presented/marked out?
  • What is the coverage? How large is the index?
  • Where does the index come from?

Directories

  • How many links does the directory contain (large or small)?
  • Who is behind the directory (if it’s not financed by advertising)?
  • Does the directory include the search subject of interest to you?
  • Is the directory maintained (dead links or old links)?

Meta search services

Which search services does the meta search service use for its searching?

  • Search engines
  • Directories
  • Advertising services
  • Other (e.g. Wikipedia)

What does the hit list look like?

  • Are the hits relevant?
  • How are the results presented?
  • Are there sponsored links? How are they presented/marked out?
  • Which search services do the hits come from?
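The last question, where each hit comes from, can be illustrated with a small Python sketch of how a meta search service might merge hit lists. It is an assumption-laden toy, not a description of any real meta search service: the service names and URLs are invented, and the ordering rule (hits reported by more services first) is just one plausible choice.

```python
def merge_hits(hit_lists):
    """Merge hit lists from several services, remembering where each hit came from.

    hit_lists maps a (hypothetical) service name to its ranked list of URLs.
    """
    merged = {}
    for service, urls in hit_lists.items():
        for rank, url in enumerate(urls, start=1):
            entry = merged.setdefault(url, {"sources": [], "best_rank": rank})
            entry["sources"].append(service)
            entry["best_rank"] = min(entry["best_rank"], rank)
    # Hits reported by more services come first; ties go to the best rank, then the URL.
    return sorted(
        merged.items(),
        key=lambda item: (-len(item[1]["sources"]), item[1]["best_rank"], item[0]),
    )

hit_lists = {
    "engine_a": ["http://example.org/1", "http://example.org/2"],
    "engine_b": ["http://example.org/2", "http://example.org/3"],
}
for url, info in merge_hits(hit_lists):
    print(url, info["sources"])
```

A hit list that shows the sources next to each hit, as this sketch does, makes it easier to judge whether the meta search service adds anything beyond the services you already use.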

What search possibilities exist?

  • Which Boolean expressions does the service support? (AND, OR, NOT, +, -)
  • Can you truncate?
  • Is phrase searching supported?
  • Is field searching supported?
  • Can the results be limited to a domain or a Web site?
  • Can the results presentation be varied in any way?
  • Is it possible to limit or revise the search?
  • Is there any help with formulating the search expression (advanced searching in, e.g., Google is really a type of help, making the search syntax accessible)?
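To make the search possibilities above concrete, here is a minimal Python sketch of what Boolean searching, truncation and phrase searching mean when applied to a handful of texts. The function name `matches`, its parameters and the sample documents are all invented for illustration, and the use of `*` as the truncation character is an assumption; real services differ in which of these features they support and in how the operators are written.

```python
import re

def matches(text, all_of=(), any_of=(), none_of=(), phrase=None):
    """Check one document against a simple Boolean query.

    all_of  -> every term must occur (AND, or "+term")
    any_of  -> at least one term must occur (OR)
    none_of -> no term may occur (NOT, or "-term")
    phrase  -> the words must occur next to each other, in this order
    A term ending in "*" is truncated: "librar*" matches "library" and "librarians".
    """
    words = re.findall(r"\w+", text.lower())

    def has(term):
        term = term.lower()
        if term.endswith("*"):
            return any(w.startswith(term[:-1]) for w in words)
        return term in words

    if not all(has(t) for t in all_of):
        return False
    if any_of and not any(has(t) for t in any_of):
        return False
    if any(has(t) for t in none_of):
        return False
    if phrase is not None:
        target = phrase.lower().split()
        n = len(target)
        if not any(words[i:i + n] == target for i in range(len(words) - n + 1)):
            return False
    return True

docs = [
    "The invisible web is not indexed by search engines",
    "Librarians teach web searching",
    "A spider web in the garden",
]
print([d for d in docs if matches(d, all_of=["web"], none_of=["spider"])])
print([d for d in docs if matches(d, all_of=["librar*"])])
print([d for d in docs if matches(d, phrase="invisible web")])
```

Running the same three test queries against a service you are assessing is a quick way to find out which of these features it actually supports.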

Comparing searches in different search engines

Thumbshots Ranking (http://ranking.thumbshots.com) is a service which compares the first hundred hits for different searches in some search engines. The following search engines can be compared in this service:

  • Alltheweb (the Yahoo! index)
  • Altavista (the Yahoo! index)
  • Google
  • MSN (Bing)
  • Yahoo!

It’s also possible to compare different search strings in the same search engine. The example below shows two searches in Google on the closely related terms invisible web and deep web.

Fig. Comparison between “invisible web” and “deep web” in Google

The overlap between the searches in the example is only 17 per cent, and the differences between the positions in the hit lists (hits 1-60) are shown by the lines. See also comparisons in the chapter about search engines.
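A comparison of this kind can be approximated with a short Python sketch. How Thumbshots Ranking itself calculates the overlap is not stated here, so the definition below (the share of hits in the first list that also appear in the second, assuming each URL occurs only once per list) is an assumption, and the URLs are placeholders.

```python
def overlap_percent(hits_a, hits_b):
    """Share of the hits in list A that also appear in list B, in per cent."""
    if not hits_a:
        return 0.0
    shared = set(hits_a) & set(hits_b)
    return 100.0 * len(shared) / len(hits_a)

def rank_shifts(hits_a, hits_b):
    """For every shared hit, return (url, position in A, position in B), counted from 1."""
    position_b = {url: i for i, url in enumerate(hits_b, start=1)}
    return [
        (url, i, position_b[url])
        for i, url in enumerate(hits_a, start=1)
        if url in position_b
    ]

# Placeholder hit lists for two searches (hypothetical URLs).
hits_a = ["http://example.org/u1", "http://example.org/u2",
          "http://example.org/u3", "http://example.org/u4"]
hits_b = ["http://example.org/u3", "http://example.org/u5",
          "http://example.org/u1", "http://example.org/u6"]

print(overlap_percent(hits_a, hits_b))
for url, pos_a, pos_b in rank_shifts(hits_a, hits_b):
    print(url, pos_a, "->", pos_b)
```

The triples returned by `rank_shifts` correspond to the lines in the figure: each one connects a hit's position in the first list with its position in the second.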

More things to consider when choosing a search service

Search limitations – Be clear on any possible limitations regarding time, language and geography. It matters for the choice of search service.

Specific query – If you have a specific, delimited query, use a search engine.

Circling around – A Web directory gives you an overview of the subject before you narrow it down.
