14. Storing & Retrieving Information

(This is originally a chapter from the book Efficient information searching on the web.)

During and after your information search you enter another problem area: personal information management. How should you store what you have found? It can be anything from a link, an image or factual data to entire Web sites or reports in PDF format. There are many approaches and many useful services on the Web. But a central question needs to be answered first: How will the information be used?

For immediate use

Are you going to use the information directly? If you only need to read it through for background information it’s often best to print it on paper. Then you can go ahead and read it with a pen in your hand. But if excerpts and perhaps images are to be used in a report you need a different kind of management. Perhaps you need to save the material as PDF files.

Save/preserve/file away

If the information needs to be saved on a more long-term basis it might be good to create PDF files of the information. The PDF files are easily created by means of a PDF writer. The contents of the PDF files can then be indexed by a desktop search service to make them easily retrievable. The files can also be stored in a traditional hierarchy of folders.

For forwarding

Is somebody else going to use the information? In that case it should perhaps be gathered or packaged in some way. A labour-saving way is to use a special program for this type of task.

For sharing with others

If several people are going to use the information a Web service may be useful. Perhaps you can store texts and links in a Web word processor or store documents on a Web hard disk.

Where should I save that which I have found?

The first question that arises is: Where should I save what I have found? Locally on my computer or on the Web?

Locally on the computer

By saving the information on your own computer it will be quickly accessible, but only on that computer. If the information is important or extensive you will have to make backup copies, since any computer can crash. Which format should the information be saved in? Only the bookmarks? PDF copies of Web pages?

On the Web

By saving the information on the Web you make it accessible from any connected computer, but the Web can be slow and you need a reliable connection. An advantage of Web storage is that you get more possibilities to share the information. At the same time you have to be alert when it comes to copyright: you are not allowed to make a saved Web page or image accessible on the Web without the copyright owner’s permission.

How should I save that which I have found?

In a bookmark service

Many bookmark services exist on the Web today. One type consists of the so-called social bookmark services, where it’s easy to share bookmarks and to see how many other people have saved the same link and who these people are. The advantage is that the bookmarks are accessible from different computers; the only thing required is to log into the service. In most of the services you can attach a subject word to the links that you save; this is called tagging the links. Labelled links are easier to find among all the bookmarks. A risk with free Web services is that the service may disappear, be taken over, be radically changed or be converted into a pay service.

  • Blinklist (www.blinklist.com)
  • Delicious (http://delicious.com)
  • Diigo (www.diigo.com)

When you save bookmarks in the services you fill in tags (subject words) and comments. When you’re logged in you can choose to see your own bookmarks or all the bookmarks in the service.

Fig. A private bookmark in Delicious.

A bookmark in Delicious is shown above. Among other things you can see that 157 people have saved this bookmark and that it has three tags: seo, searchengine and history. You can view and search your own tags in a tag cloud.

Fig. A tag cloud in Delicious.
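The tagging and the tag cloud described above can be illustrated with a minimal Python sketch. It is not how Delicious or any other service is implemented; the in-memory list, the function names and the example.com addresses are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical in-memory bookmark store; a real service keeps this on its servers.
bookmarks = []

def save_bookmark(url, tags, comment=""):
    """Store a link together with its tags (subject words) and a comment."""
    bookmarks.append({"url": url, "tags": {t.lower() for t in tags}, "comment": comment})

def find_by_tag(tag):
    """Return the URLs of all bookmarks labelled with the given tag."""
    return [b["url"] for b in bookmarks if tag.lower() in b["tags"]]

def tag_counts():
    """Count how often each tag is used; a tag cloud scales each tag by this number."""
    return Counter(tag for b in bookmarks for tag in b["tags"])

# Placeholder links, not real pages.
save_bookmark("http://example.com/search-history", ["seo", "searchengine", "history"])
save_bookmark("http://example.com/pdf-tips", ["pdf", "history"], "background reading")

print(find_by_tag("history"))
print(tag_counts().most_common(1))
```

The more often a tag is used, the larger it would be drawn in the cloud, which is why the sketch only needs a count per tag.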

Delicious and other social bookmark services are described from a searcher’s perspective in Chapter 3. You find more traditional bookmark services in Google and Yahoo! (www.google.com/bookmarks/ and http://bookmarks.yahoo.com). A list of bookmark services is found at www.feedbus.com/bookmarks/.

Fig. Table of the function of social bookmark services.

Your own Web archive/link collection

One way of making your selected links easily available is to create your own link collection on the Net. A simple way is to post the links you’ve found in a blog. But only the links themselves can be saved this way; for copyright reasons you can’t save images or other material (unless you password-protect the blog/Web page or make it completely private). Sometimes you’ll find scientific articles that a researcher has saved on a personal homepage to have within reach. The researcher has, perhaps, created a hidden link so that nobody will discover the page, but search engine spiders are diligent and follow all links. As a result the articles become accessible in, e.g., Google.

On paper

The classic way is to print out Web pages and other important information. The disadvantages are that the information isn’t easily searchable and that you’ll need to create a physical filing system. If you’re dealing with news articles or other new information, it’s a good idea to print it right away, as the article may be gone the next time you want to look at it. The information is often filed in the newspaper’s archive, which is available to subscribers and other paying customers. Information on paper is easy to read and available off-line, for better or worse. You don’t need a computer and you can read actively with a pen in your hand.

Store links in the Web mail

Web mail is an alternative archive. Several Web mail services offer good search possibilities (e.g., Gmail and Yahoo!), so links and other information can be saved by mailing them to yourself or to a special archive address where they are easy to retrieve. This alternative is particularly useful when you’re not at your own computer, but borrowing a computer temporarily.

Screen dump

A screen dump is an image of all or part of what is shown on the screen. Commands:

Entire screen: shift+prtscr

Active window: alt+prtscr

Paste the screen dump into Word, PowerPoint or an image processor. Or upload the screen dump to a blog or a Web word processor. Gadwin PrintScreen (www.gadwin.com/printscreen/) is a freeware program. It lets you choose among several file formats, and zooming is possible. Another alternative is Screen grab (an add-on for the Web browser Firefox).

PDF writer

Instead of printing interesting material that you find, or saving entire Web sites, you can ”print” the documents to a PDF writer. A PDF file is then generated which is easy to store (and to retrieve with a desktop search program). If you don’t have Acrobat Writer there are free PDF writers to download. CutePDF (www.cutepdf.com/products/cutepdf/Writer.asp) is a free PDF writer.

Install the writer and print the Web pages as PDF files for future reference. Then let a desktop search service index the PDF archive. You can then search the PDFs in full text, which makes for efficient information retrieval. If you want to print something on paper, a PDF writer also works as a test printout. In the created PDF file you’ll see if everything is there and if the right parts are included in the printout. If you’re dealing with long Web pages you’ll even find out how many A4 pages there will be (so that you won’t be surprised when the printer spits out 27 pages), and then you can print only the pages of interest to you.

”Launder” text in a text editor / collect information in txt documents

You can cut out the text and ”launder” it in a text editor to save it as a text file (txt). Word-processing software like Word works poorly for this, because it keeps too much of the text formatting, like font colour and text size.

I personally use the text editor UltraEdit both for laundering text and for collecting information. In UltraEdit it’s easy to work with information on different subjects, as this text editor works with tabs. Thanks to the tabs it’s easy to have many documents open and to switch between them.
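Laundering can also be done with a small script. The following is only a sketch in Python, assuming that the copied material is an HTML snippet; the names `TextExtractor` and `launder` are made up for this example and have nothing to do with UltraEdit.

```python
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect only the text content, ignoring tags and their attributes."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for the text between tags; formatting attributes never reach here.
        self.parts.append(data)

def launder(html_snippet):
    """Return plain text with the formatting removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html_snippet)
    parser.close()
    text = "".join(parser.parts)
    return re.sub(r"\s+", " ", text).strip()

# Hypothetical snippet copied from a Web page, with font colour and text size.
snippet = '<p style="color:red; font-size:18px">Desktop <b>search</b>\n programs</p>'
print(launder(snippet))
```

The result can be written to a txt file, just like text pasted into a text editor.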

Notebooks and word processors on the Net

Word processors and notebooks on the Net are other good ways for saving information, particularly if you need to be able to reach it from different computers. Google Docs is one example of a Web-based word processor and it allows several people to collaborate on the documents. Other features are connections to online publishing and blogging.

Save searches

Save the URL of the search engine’s hit list among your favourites so that you can easily go back and do the exact same search later.
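This works because the search terms are part of the URL of the hit list. As a minimal sketch in Python, assuming a search engine that takes the query in a parameter called `q` (as Google does; other engines may use different parameter names), such a URL can be built and read back like this. The function name `search_url` is made up for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def search_url(base, query):
    """Build a hit-list URL that can be stored as a bookmark and reopened later."""
    # The parameter name "q" is an assumption; check what your search engine uses.
    return base + "?" + urlencode({"q": query})

url = search_url("https://www.google.com/search", '"desktop search" pdf')
print(url)

# Reading the saved URL back shows exactly which search it stands for.
saved_query = parse_qs(urlsplit(url).query)["q"][0]
print(saved_query)
```

A phrase search in quotation marks and any operators are preserved in the URL, so the bookmarked search repeats the query exactly, although the hits themselves may have changed.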

Special programs

There is a type of program called a research manager. In these programs you can save Web pages, text excerpts, images and other things from the Web for later use, for example when creating reports or when using the information in other programs. The saved material can be labelled with comments, subject words and other metadata. In some programs you can, supposedly, also save copies of entire Web sites. Several of the large programs have gone from being just programs to being Web services. eSnips (www.esnips.com) is one Web-based service.

Tools like eSnips will probably become increasingly significant as the use of the Web in general, and of information management tools in particular, matures. We will no longer be able to afford, or have the time, to find information only to lose it again. Our use of it will have to become more efficient. What does it matter that everybody has broadband if nobody can save a text snippet in a sensible way?

Desktop search programs

To find your way among downloaded or created files, a desktop search program can be very useful. The program works like a local search engine; it creates an index of the content on your hard disk or in certain directories. Make sure that the program actually indexes the file types that are important to you. Also check if the program indexes drives other than your ordinary hard disk (C:). This is especially important if you are connected to a network where you use files from different network drives. At the time of writing the largest desktop search programs are the following:

Case study – desktop search

For a long time I had been looking for an old version of my thesis manuscript on my hard disk. But since the documents on the computer had turned into a muddle of folders and files from several old computers (private, study and work computers) I couldn’t find the file. I was after the title of my submitted thesis manuscript, but I didn’t know what the file name could be. Then suddenly one day it hit me: I had added the ISSN of the publication series on the title page! I could search for that number in my desktop search program (Copernic Desktop Search) because it had indexed all the files on the computer. No sooner said than done. In the search the document showed up among the oldest ones in the hit list (and with an insignificant name). I had retrieved the sought title!
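The principle behind the case study can be shown with a minimal Python sketch of a full-text index. It is an illustration only and not how Copernic Desktop Search works internally; the file names, the file contents and the ISSN 1234-5678 are made-up placeholders.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lower-case words; hyphens are kept so an ISSN stays whole."""
    return re.findall(r"[\w-]+", text.lower())

def build_index(documents):
    """Map every word to the set of file names whose content contains it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index

def search(index, query):
    """Return the files that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)

# Placeholder files: the file name says nothing, but the content is indexed.
documents = {
    "doc1_old.doc": "Thesis manuscript, title page. ISSN 1234-5678",
    "notes.txt": "Shopping list and meeting notes",
}
index = build_index(documents)
print(search(index, "1234-5678"))
```

Because the index is built from the content of the files and not from their names, a document with an insignificant file name is still found as long as you remember something that is written in it.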

Index the content in Thunderbird

Are you using Mozilla Thunderbird for your RSS feeds? You can then index the posts by means of, for example, Copernic Desktop Search. This is priceless if you normally spend time looking for posts where you’ve read something important. Naturally the e-mail in both Thunderbird and Outlook can be indexed as well.

