(This is originally a chapter from the book Efficient information searching on the web.)
Several problems present themselves when searching on the Net. One that keeps growing is information overload: you are swamped with information and may feel more like giving up than starting to sort through it. Another problem is that the Web is in many ways a lawless country: anyone can publish just about anything. The difference is great compared with the books in a library, which have normally passed several checks: the publishing editor, the fact editor and finally the librarian who buys them. It is not for nothing that WWW is at times said to stand for the Wild Wild Web.
The overload on the Web can be of all kinds. Which search service should I choose (Search Engine overload)? Help, I’m drowning in all the ads (Advertisement overload)! Sometimes all the choices you have to make on the Web, and all the unwanted things you have to deal with, can simply be too much. Some things to consider:
Search engines search on words, not concepts
Search engines match the search words you enter against the content of the Web pages that the search engine has indexed. No consideration is given to concepts; the matching is mechanical.
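The mechanical nature of word matching can be illustrated with a deliberately simplified sketch. It is not how any particular search engine works; the page texts and the names `tokenize` and `matches` are made up for the illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and return the set of words it contains."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(query, page_text):
    """Return True only if every query word literally occurs on the page."""
    return tokenize(query) <= tokenize(page_text)

# Two hypothetical pages about the same concept, worded differently.
pages = {
    "page_a": "Cheap cars for sale in Lund",
    "page_b": "Affordable automobiles available in Lund",
}

# Only the page that contains the exact words is found.
hits = [name for name, text in pages.items() if matches("cheap cars", text)]
print(hits)
```

Both pages deal with the same concept, but only the one that happens to use the words "cheap" and "cars" is returned. This is why it pays to try synonyms and alternative wordings when searching.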
Garbage in, garbage out
The best search can never be better than the content of the pages that are accessible on the Web. For certain subjects there may not be any high-quality resources on the Web, and then the search services can’t deliver anything good.
No editors clean up
On the Web there is nobody who cleans up or throws things away. What’s put up on the Web remains there as long as the Web server on which it’s accessible is running and nobody actively removes it. In modern Web-publishing systems you can often set a Web page to become inactive if nobody has reviewed it within a certain period of time.
Know your limits
Broaden your horizons on the Web slowly but surely. Try new search services and search on new subjects, but avoid places that you can’t relate to or verify. Go ahead and dive deep into the Web sites of public authorities and other organizations, but be more careful with commercial or private Web pages.
Assessment of Web pages
All Web pages have two things in common: all have a sender and all have a purpose. This leads up to two questions:
- Why – what is the purpose of the Web page?
- Who – who is the sender of the page?
Why was the information published? Web sites can be divided into the following categories:
- Business activity
Is the purpose spelled out? What is the purpose of the page? Which opinions are put forward on the page? Sometimes the purpose is clear, but if you’re uncertain you can look under “about us” or “about the Web site”, which is often located in the top menu, the left menu or the footer of the page.
Who published the information? What do you know about the sender?
- Who wrote the page?
- What qualifications does the Web page’s creator have?
- Who published the page?
- Who’s responsible for the Web site?
- Is there any contact information?
Things to consider in the assessment
- Is the information up to date and complete?
- Is the information presented in an objective way?
- Is the document well written?
- When was the page published?
- How often is the Web site updated?
- Are there sources or references?
Traditional source criticism
Traditional source criticism builds on four criteria and one distinction. The criteria are:
- Authenticity. The source should be what it claims to be.
- Relationship in time. The distance in time between the event that the source describes and the origin of the source matters for its reliability.
- Independence. The source should not be a copy or a summary of another source.
- Freedom from tendency. The source should not be partial to anyone or anything. If the creator of the source has an interest in the case, there is a big risk that the source is biased.
In traditional source criticism a distinction is made between narratives about something and the remnants of something. A written agreement, for example, is a remnant of an agreement, but the agreement may, naturally, also narrate something.
Source criticism for the Internet
To the four traditional criteria described above it is possible to add three more criteria, adapted to the information on the Internet (1):
- World picture and approach to knowledge as bias
- Prerequisites and characteristics of the source
World picture and approach to knowledge as bias
All sources could be said to be biased, even if they are not a party to the case. All sources are products of the culture in which they were created and in which they exist. Cultures consist of religious ideas, traditions, history, language, customs, ideals and laws, and all this together can be summed up as a world picture.
To assess a source you have to determine its world picture. Via the Internet you quickly and easily reach information from every corner of the world and from widely differing world pictures, and you constantly have to take a stand in relation to biased information.
The huge amount of information means that you constantly have to discard sources and use only some of the hits you get. But which should you pick? You may consider:
- Information about the creator: name, title, position, organizational affiliation, date of origin and contact information.
- Which domain? A commercial .com or an .edu (US education) says something about the purpose of the Web site. The Swedish .se doesn’t mean a lot, as it isn’t restricted to any special category but is open to everyone from authorities and companies to private individuals.
- The address of the Web page. If the Web address is simple and without sub-directories, this generally means that the Web page is the first page of the Web site, e.g., www.omis.se.
- Private Web address? The address www.student.lu.se/~fte99jfr/bib052/ says that it belongs to a student at Lund University. It should perhaps not be considered particularly reliable, unless you’re one of the students on the course BIB052, the Invisible Web, where the Web page is used.
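The address-related clues in the list above can be read off a URL mechanically. The following is a minimal sketch using Python’s standard `urllib.parse`; the function name `describe_url` and the choice of clues are assumptions made for illustration, and the result is only a set of hints, not a verdict on reliability.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Extract a few address clues: top-level domain, front page, tilde."""
    # urlsplit only recognises the host name when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Non-empty path segments correspond to sub-directories or files.
    segments = [s for s in parts.path.split("/") if s]
    return {
        "top_level_domain": host.rsplit(".", 1)[-1],
        "is_front_page": not segments,
        "has_tilde": "~" in parts.path,
    }

front = describe_url("www.omis.se")
private = describe_url("www.student.lu.se/~fte99jfr/bib052/")
print(front)
print(private)
```

For the first address the sketch reports a front page without a tilde; for the second it reports sub-directories and a tilde, which matches the reasoning about private pages above.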
Prerequisites and characteristics of the source
No source is perfect. Sources make mistakes (without intending to mislead or distort). Sources can’t keep up with everything and have to limit their scope, or they don’t work technically as expected. Before you use a source you should get an idea of its prerequisites and characteristics. What is true of the source? Under which conditions does the source operate? Who puts the information into the database, and what is not included?
Enhanced source criticism – media knowledge
Today, at least when we’re speaking of the Internet, it has become more relevant to talk about media knowledge. All knowledge about the conditions that apply to Net publishing will help you to critically assess what you find. Another important part is the conditions that govern the search services you use to get the information. Who selected the links in the Web directory? Why is this hit number one in the search engine’s hit list? Is it possible to see which of the links are ads?
“False” Web sites
RYT Hospital – The Dwayne Medical Center (www.rythospital.com). The Web site belongs to a fictitious hospital where, among other things, they have supposedly succeeded in bringing about male pregnancy…
The Dihydrogen Monoxide Research Division (www.dhmo.org). The Web site has the same style as many pseudo-scientific sites. The information is hard to assess if you don’t know anything about the subject.
Several wikis use the wiki software MediaWiki (freely available at www.mediawiki.org), among them Wikipedia (http://wikipedia.org). Many other wikis therefore have the same appearance as Wikipedia, simply because they use the same software. There is no other reason.
Information in, for example, books, journals and (pay-) databases has often passed through various steps of quality control. These steps may consist of publishers, editors, fact editors and librarians who in various ways review and assess the information. On the Internet it’s easier to publish information, and only a small part of the information is reviewed thoroughly.
The target group of the page
Who is the Web page’s target group? Laypersons or specialists? Children or adults?
The address of the page
Can you tell anything about the page from its Web address, the URL? Today anybody can buy an .se address (the Swedish top-level domain), which is why it is no longer a sign of quality in itself. Company or organization names may be a sign of quality, especially if the name constitutes the entire domain name. For example, www.volvo.se leads to the official Web page of the Volvo group, while www.volvoforum.de leads to a German discussion forum for Volvo owners.
Longish addresses containing a tilde (~) indicate a private Web page. These often lie under a big Web actor like AOL or Yahoo!, where it’s often free to put up Web pages. It’s also common for university colleges and universities to give staff and students their own space for publishing private pages. The contents, thus, need not be related to the person’s field of study or research area. Being a professor of one subject does not make a person an authority in other subjects.
Don’t let yourself be fooled by addresses that look similar. Using a Web address that resembles a known and respected address is often a way to catch someone who has misspelt the address. It can also be a way for people who hold a different opinion on the subject to present it on the Net. Others aim to draw in visitors through misspellings so that they can show them advertisements or sponsored links. Such a Web site is paid for by the display of advertisements, so you could say that the business concept is to exploit surfers who are lost and might not notice this immediately.
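One way to make "similar" concrete is to count how many single-character edits separate two addresses. The sketch below uses the classic Levenshtein edit distance; the threshold of two edits, the function names and the misspelt address www.vovlo.se are assumptions chosen purely for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]

def looks_like(candidate, known, max_distance=2):
    """True if candidate differs from known by only a few characters."""
    distance = edit_distance(candidate.lower(), known.lower())
    return 0 < distance <= max_distance

# www.vovlo.se is a made-up misspelling, not a site referred to in the text.
print(looks_like("www.vovlo.se", "www.volvo.se"))
print(looks_like("www.volvoforum.de", "www.volvo.se"))
```

A small distance only says that two addresses are easy to confuse. Whether the look-alike site is a trap, a parody or a legitimate site still has to be judged from its content and sender.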
Trace the owner
Trace the owner of a Web site via a WHOIS search service. WHOIS search services are used to find out who owns a domain. How much information you get depends on how the address is registered and under which domain. Sometimes you will get both the name and the address of the owner.
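Besides the Web-based services, WHOIS can be queried directly: the protocol sends the domain name as one line of text to TCP port 43 and reads the reply until the server closes the connection. The following is a minimal sketch under stated assumptions: the function names are made up, the default server whois.iana.org is an assumed starting point that may only refer you to the registry responsible for the top-level domain, and what the reply contains depends entirely on that registry.

```python
import socket

def build_whois_query(domain):
    """A WHOIS query is the domain name followed by CR LF."""
    # Assumption: plain ASCII domain names only.
    return domain.strip().lower().encode("ascii") + b"\r\n"

def whois_lookup(domain, server="whois.iana.org", port=43, timeout=10):
    """Send the query and return the server's free-text reply."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_whois_query(domain))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

# Requires network access, so it is left commented out here:
# print(whois_lookup("volvo.se"))
```

The reply is free text meant for human reading, so the sketch makes no attempt to parse it.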
(1) Leth, Göran & Thurén, Torsten (2000). Källkritik för Internet [Source criticism for the Internet]. Stockholm: Styrelsen för psykologiskt försvar [http://www.psycdef.se/Global/PDF/Publikationer/kallkritid for internet.pdf]