(This is originally a chapter from the book Efficient information searching on the web.)
The eight steps of the search process
The researcher Gary Marchionini has produced a model of the information-seeking process for computer-based searches. The model consists of eight phases and the transitions between them are divided into three different types depending upon how common they are.
- Recognize and accept an information problem
- Define and understand the problem
- Choose a search system
- Formulate a query
- Execute search
- Examine results
- Extract information
- Reflect/iterate/stop
The arrows in the model illustrate how complex the search process is. We constantly move from one phase to another when we seek information.
Fig. The search process for electronic information seeking according to Gary Marchionini.
1. Recognize and accept an information problem
The process starts with the recognition and acceptance of an information need. In many cases we don’t accept the information need, i.e. we don’t bother to search for information on the subject because we don’t find it worth the trouble.
2. Define and understand the problem
Narrowing down a subject and formulating it as a problem (a sentence with a question mark at the end) is an important part of the search process that is easy to neglect. Once you have the problem formulation you also have a couple of search words to start with. The formulation saves a lot of time, and you can easily return to it during the search to keep the search on the right track.
Besides formulating the search subject as a problem you should also try to answer the following questions as early as possible in the process:
- Which are the central aspects? Identify important concepts or keywords.
- What is the information meant for? What is the level of ambition?
- Must the information be up to date?
- How extensive does the information collection need to be?
- What previous knowledge do you have? Do you have to spend time reading up on the subject?
- Which languages and geographical areas are of interest?
- Is scientific material required?
- How much time can the search take?
Be clear on any possible limitations in time, languages and geography. These limitations are also of importance to your choice of search service.
3. Choose a search system
Is the Internet the right place for searching, considering your subject? Is the sought information perhaps local or old? In that case the information is far from always on the Net. Is the information completely new? If so, it might not yet be published, or it may be hard to find in the traditional search services.
Would it be faster to find the information in another way? Pure facts can be easy to find in ordinary encyclopaedias. A search on the Net often costs more time and energy than a visit to the nearest library would have. Learn what you may expect to find on the Net and what normally won’t show up.
The type of search that you are about to perform matters to your choice of search service. For an exhaustive search, where “all” information in the subject is the goal, many search services must be used. No individual search engine or directory covers more than a part of the Web, and all of them hold unique resources.
If you want to get an overview of the central resources in a subject you should use the large Web directories. These “virtual collections” provide a very good starting point when you begin looking for information in a broad subject.
If you have a specific, delimited query you should use the search engines. For the monitoring of new resources in a subject you can use different news services or alerts in a search engine.
Before you choose your search service you should do some thinking about the subject and what you’re trying to find. Never rely on only one service or Web site.
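The point that every service holds unique resources can be illustrated with a minimal Python sketch. The service names and URLs below are made up for illustration, and the trailing-slash normalization is a crude assumption; the sketch only shows how merging hit lists from several services reveals which hits come from a single service.

```python
from collections import defaultdict


def merge_hits(hits_by_service):
    """Map each URL to the sorted list of services that returned it."""
    found_in = defaultdict(set)
    for service, urls in hits_by_service.items():
        for url in urls:
            # Crude normalization: treat ".../page" and ".../page/" as the same hit.
            found_in[url.rstrip("/")].add(service)
    return {url: sorted(services) for url, services in found_in.items()}


def unique_hits(merged):
    """Return the URLs that were returned by exactly one service."""
    return sorted(url for url, services in merged.items() if len(services) == 1)


# Hypothetical hit lists, for illustration only.
hits_by_service = {
    "engine_a": ["http://example.org/middle-ages", "http://example.org/castles/"],
    "engine_b": ["http://example.org/castles", "http://example.org/plague"],
    "directory_c": ["http://example.org/middle-ages", "http://example.org/guilds"],
}

merged = merge_hits(hits_by_service)
only_one = unique_hits(merged)
print(only_one)
```

In this made-up data two of the four distinct pages would have been missed by anyone who relied on a single service.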
4. Formulate a query
The chance of getting good search results increases the more you prepare your search query. Searching with bare words, without any conditions on the search, often produces hit lists in which the relevant hits are mixed with so much garbage that you give up after going through only a fraction of them.
Formulate the search query as clearly as possible. Sort it into interesting and uninteresting aspects.
Analyse the search query. Pick out central keywords and phrases. Look up synonyms, alternative spellings, abbreviations and acronyms which you can use in the search.
Use articles, books and other resources that you know about in order to find new search words.
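Combining synonyms, alternative spellings and abbreviations quickly produces several query variants. The following Python sketch shows one way to enumerate them; the function name expand_query and the example terms are assumptions made for illustration, not part of any search service.

```python
from itertools import product


def expand_query(concepts):
    """Return one query string per combination of alternatives.

    `concepts` is a list of lists; each inner list holds interchangeable
    terms (synonyms, spellings, abbreviations) for one concept.
    """
    queries = []
    for combo in product(*concepts):
        # Quote multi-word terms so that they are searched as phrases.
        terms = [f'"{term}"' if " " in term else term for term in combo]
        queries.append(" ".join(terms))
    return queries


# Made-up example: two spellings of one word, an abbreviation and its long form.
concepts = [
    ["encyclopaedia", "encyclopedia"],
    ["www", "world wide web"],
]

queries = expand_query(concepts)
for query in queries:
    print(query)
```

Each printed line is a query that can be tried in turn; the number of variants grows multiplicatively, so it pays to keep each list of alternatives short.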
5. Execute search
The phase can be described as “the interaction with one or several information systems”. In practice this means that you enter the search query in Google and click the search button, call a professor known to be knowledgeable in the subject, or look in the books in a library.
6. Examine results
Is the query answered or the goal fulfilled? If not, what remains and how will you get there? Perhaps the search needs to be modified or the goal or query re-evaluated. This may lead to a change of search strategy:
- Broaden the search?
- Narrow the search?
- Change search service?
- Look for another type of material?
- Go off-line? Go to the library’s physical collections or contact a specialist in the subject?
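What broadening and narrowing do to a hit list can be shown on a toy collection. The Python sketch below assumes three made-up pages and a simple whole-word match; real search engines rank and match in far more elaborate ways.

```python
def matches(text, required, excluded=()):
    """True if the text contains every required word and no excluded word."""
    words = set(text.lower().split())
    has_all = all(word in words for word in required)
    has_excluded = any(word in words for word in excluded)
    return has_all and not has_excluded


def search(pages, required, excluded=()):
    """Return the names of the pages that match, in collection order."""
    return [name for name, text in pages.items() if matches(text, required, excluded)]


# Made-up pages, for illustration only.
pages = {
    "page1": "medieval castles in sweden",
    "page2": "medieval trade and guilds",
    "page3": "modern castles and hotels",
}

broad = search(pages, ["medieval"])              # one required word: more hits
narrow = search(pages, ["medieval", "castles"])  # added word: fewer hits
filtered = search(pages, ["castles"], excluded=["modern"])
print(broad, narrow, filtered)
```

Adding a required or excluded word can only shrink the hit list, and dropping one can only grow it, which is why these are the two basic adjustments after examining the results.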
7. Extract information
Extracting information is the goal of the search process. Collecting on the Web is often done “on the go”; you pick up some here and some there. In database searches the collecting is done much more systematically, perhaps with the help of your library.
8. Reflect/iterate/stop
Is the search done and the information need met? What lessons can be learned before the next search? Sometimes you have to start from the beginning, i.e. redefine and reformulate the problem.
Modes of action, strategies and tactics
On the Net and in books and articles on information seeking you’ll find many terms for different ways of executing a search. The most common term is strategies (variants are search strategies or information seeking strategies), but tactics is also used at times. It may be good to bear this in mind if you want to read more about the subject.
The information need and the search query
An information need can take many forms. Perhaps you need to get an overview of a certain subject or perhaps you only have to check a fact. Having to update yourself in a subject can also create an information need, just as a need to specialize can.
Whether you are preparing a major purchase, e.g. of a car, or have an assignment for school, you often need more information and are forced to search for it. Each information need requires a certain type of information, and different kinds of information require different procedures.
The standard procedure for clarifying an information need is to try to write it down as a question. When the question is formulated it’s normally possible to refine and distil its components. Well-formulated questions can be answered more often than poorly formulated ones.
By reading up on a subject, e.g. in an encyclopaedia, you can often find central aspects, useful terms and words to search on.
Search queries that correspond to information needs can be divided in different ways. A basic division is between self-limiting and open search queries. Self-limiting queries have a specific answer, and in most cases you’ll know during the information seeking process whether you have found the answer or not. Open queries, however, can’t be said to have an answer; they are answered with sufficient information.
Five query types, in order of decreasing scope:
- What is there to be found in a specific subject/what types of resources exist? Example: Different types of resources about alcohol. Resources that deal with alcohol: kinds, sales, drinks, harmfulness…
- Queries on general information/broad subjects. Example: Resources that deal with the Middle Ages.
- Delimit/expand the query. The transition between general and specific queries, i.e. queries that deal with the delimitation or expansion of a subject.
- Queries on specific information in well-defined areas. Example: Information about the National Gallery in Prague.
- Pure fact queries with a given answer. Example: What is the name of the mayor of New York City?
Catch the query
By putting your question into words, by formulating it, you’re forced to think about what you want. If you spend time identifying key phrases and visualizing the ideal answer you will have a much greater chance of recognizing that answer once you find it on the Web.
The question can perhaps not always be “caught”, but it can often be narrowed down by considering the following before the search: Who? What? When? Where? Why? How?
Who’s important in your subject? There may be a known expert or organization that you should contact. Do you need to find someone with certain experience or knowledge? Think in several steps: Who knows the person who might be able to help you? Good places to start searching for organizations are directories or link collections. Experts can often be found through media resources like newspaper archives.
What kind of information do you need? Information comes in many shapes: statistics, first-hand sources from an event, background information, specific facts or scientific articles. What would be the best source of information? What is the information meant for? Different kinds of information are needed for different purposes, like an analysis or a report, an overview of a subject or a confirmation of some factual information. By defining the genre of the information sought after it will be easier to choose search service and to pick the right search words.
When did the researched event take place? If you’re not dealing with a contemporary event you have to find sources that go back in time sufficiently. Databases hidden in the Deep Web or old web pages in the Old Web can be useful resources (Chapter 11).
Where did the researched event take place or from where did the researched person come? There is often more information locally about events or persons. Consider whether a geographical delimitation might be of interest.
Where may the question have been asked earlier? There are few subjects that have never been discussed earlier; there may be relevant information in newspapers, TV broadcasts, court documents, Web discussions and so on. Make use of what others have done earlier; don’t do all the research work from scratch if you can avoid it. Where are you likely to find the largest, most suitable collection of information? Large quantities of information can be found, e.g., in university libraries, in subject databases, and with the daily papers and other media companies.
Why do you need to do the research? The reason may be anything from finding someone to interview or confirming certain facts to getting the hang of a big subject or completing a school assignment. Why? is often closely related to What?, since the purpose often defines what information you need to search for.
How much information do you need? The need may stretch from a separate fact to everything that a subject comprises. The information need may determine what kind of search service to use, e.g. a search engine or some kind of a directory.
How will you use the information? Different requirements are placed on information that is to be printed (in a newspaper or book) than on information for a wedding speech. And how up-to-date does the information need to be? The degree of newness determines the way of searching. Topical information may be hard to find in the search engines. Old information can be hard to find in electronic form, but check the Old Web.
Formulate the search query
To formulate the search query in a search engine you can follow the steps below. The point is to define exactly what you’re looking for, the ideal Web page, and then design the search based on this ideal. The search query should reach the Web pages relevant to your subject while excluding pages that you don’t need.
1. Imagine the ideal Web page with all the information that you need.
Example: You’re looking for information about the Swedish author Jan Arnald who is one of the editors of the Swedish literature journal Aiolos. The ideal Web page would contain Jan Arnald’s entire biography, preferably with photographs.
2. Think of the words that would be on your ideal Web page.
Example: The ideal Web page would contain both the word Arnald and the word Aiolos. With Boolean logic the search query will be:
aiolos AND arnald
But since AND is preselected in the large search engines it is enough to enter:
aiolos arnald
3. Think of an exact phrase that exists on your ideal Web page, two or more words that follow upon each other.
Example: Instead of just searching on the surname you can search on the whole name as a phrase:
aiolos "jan arnald"
4. Think of the words that you don’t want to form part of the search, words that lead to pages which are not of any use to you.
Example: Jan Arnald has also written detective stories under the pseudonym Arne Dahl, and as he is better known as a detective-story writer many of the pages will probably be about the detective stories and not about the literature journal Aiolos that you are interested in. Therefore we exclude Arne Dahl from the search by means of NOT in Boolean logic. In the search engines NOT is represented by a minus sign (-) before the word or phrase that is to be excluded:
aiolos "jan arnald" -"arne dahl"
5. Run the query.
Example: Enter the search query in the search engine and execute the search.
aiolos "jan arnald" -"arne dahl"
6. Improve the search.
In a typical search on the Net in a search engine you have to try over and over again. Your searches improve constantly as you learn more about the subject. You discover new, more efficient search words and think of more aspects to remove from the search.
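The steps above can be sketched as a small Python helper that assembles the query string from its three ingredients. The function build_query is a hypothetical illustration, not part of any search engine; it only applies the conventions described in the steps: plain words are implicitly combined with AND, phrases go in double quotes, and a minus sign excludes a word or phrase.

```python
def build_query(words=(), phrases=(), exclude=()):
    """Compose a search-engine query string.

    words   -- single words that must appear (implicit AND)
    phrases -- exact phrases, wrapped in double quotes
    exclude -- words or phrases to exclude, prefixed with a minus sign
    """
    def quote(term):
        # Only multi-word terms need quotes to be treated as a phrase.
        return f'"{term}"' if " " in term else term

    parts = list(words)
    parts += [f'"{phrase}"' for phrase in phrases]
    parts += ["-" + quote(term) for term in exclude]
    return " ".join(parts)


query = build_query(
    words=["aiolos"],
    phrases=["jan arnald"],
    exclude=["arne dahl"],
)
print(query)  # aiolos "jan arnald" -"arne dahl"
```

Keeping the three ingredients apart makes step 6 easier: improving the search means adding or removing one item from one of the lists and building the query again.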
Different types of search queries
There is not always a clear-cut information need behind a search query; queries can instead be divided into three categories according to their intention:
Information – the intention is to find information.
Navigation – the intention is to find a specific Web site for searching/surfing.
Transaction – the intention is to perform an activity on a Web site, e.g. shopping online or looking something up in a library catalogue.
Clear and unclear queries – Use directories for large queries, or when the query is unclear. And use search engines when you have a clear idea of what you’re looking for and when you have good search words, ready for use.
The formulation of the search query – Formulate the search query as clearly as possible. Sort it into interesting and uninteresting aspects.
Analyze the search query – Pick out central keywords and phrases. Find synonyms, alternative spellings, abbreviations and acronyms that you can use in the search.
Re-evaluate the goal of your search – One search query is not enough even for very experienced Web searchers. One or two modifications are often required. And good searchers often re-evaluate the goal of the search after the first search.
Alternative way – On the Web there are often several ways to reach the imagined destination. If it’s impossible to get ahead on the planned route you have to find another way to reach the goal.
Track Web pages – Use search engines to get back to pages visited earlier. Fragments of the page that you remember provide search words, which often makes it easy to track the page down.
Execute the search using several services in parallel – By using several search services at the same time you can easily carry what you’ve found in one service over to the next. Use the retrieved information in the next service in order to find more relevant information.
Cut and paste – “Cut and paste” makes information seeking with computers more efficient than searches in old card catalogues and large bibliographies (books about books).
The Net dealers – Use the online stores, like amazon.com. They contain large quantities of information. They also often contain tips about similar products or services and comments from other users.
Find new search words – Use articles, books and other resources that you know of to find new search words.
Self-publishing – The Net is a medium for self-publishing. Anybody can publish practically anything. Everything found on the Net needs to be examined and analyzed before it can be used.
Not just Google – Don’t use only Google. Google is fantastic, but there are lots of other useful services. Through other services you can find information which won’t show up in Google’s hit lists.
Different search routes – The three most important routes to information on the Web are the search engines, the Web directories and awareness about the Deep/Invisible Web. They are useful for different types of search queries, so make sure that you understand the differences.
 Marchionini, Gary, Information Seeking in Electronic Environments, 1995.
 The division comes from Johanna Nilsson’s Master’s Thesis Informationssökning på Internet – att välja verktyg [Information seeking on the Internet—choosing tools] (1998).
 Broder, Andrei (2002) A taxonomy of web search [http://www.sigir.org/forum/F2002/broder.pdf].