David Novak
2009-08-21 12:35:39 UTC
Archive-name: internet/info-research-faq/part2
Posting-Frequency: monthly
Last-modified: April 2002
URL: http://spireproject.com
Copyright: (c) 2001 David Novak
Maintainer: David Novak <***@spireproject.com>
Information Research FAQ (Part 2/6)
100 pages of search techniques, tactics and theory
by David Novak of the Spire Project (SpireProject.com)
Welcome. This FAQ addresses information literacy: the skills, tools and
theory of information research. Particular attention is paid to the
role of the internet as both a reservoir and gateway to information
resources.
The FAQ is written like a book, with a narrative and pictures. You have
found your way to part two, so do backtrack to the beginning. If you
are lost, this FAQ always resides as text at
http://spireproject.com/faq.txt and with pictures at
http://spireproject.com/faq.htm
This FAQ is an element of the Spire Project http://spireproject.com,
the primary free reference for information research and an important
resource for search assistance.
*** The Spire Project also includes a 3 hour public seminar titled
*** Exceptional Internet Research. This is a fast paced seminar
*** supported with a great deal of webbing, reaching to skills and
*** research concepts beyond the ground covered on our website and
*** this FAQ. http://spireproject.com/seminar.htm has a synopsis.
*** I am in Europe, giving seminars in Ireland and Europe, though I
*** will be returning to the US shortly, and to South Australia for
*** a seminar this October.
Enjoy,
David Novak - ***@spireproject.com
The Spire Project : SpireProject.com and SpireProject.co.uk
NOTE FOR RETURN READERS: previously, we prepared this section by
converting work originally prepared in html. This became unproductive
so we have limited the internet links in this FAQ and direct you to the
more lengthy articles prepared in html. All the required links and
search tool forms reside in other parts of the Spire Project, like the
websites and free shareware
(http://spireproject.com/spire_latest_version.zip).
Searching Specific Formats.
Section 4
In the second year of his training, Shakh began to piece together the
many rules and guidelines to understanding hieroglyphs. He had thought
the lessons would end once he learned the glyphs but no, there were
long and convoluted rules governing the translation of sounds into
glyphs. Simple rules govern the placement of glyphs on the wall -
certain glyphs lose their meaning when placed apart.
Then, there was the art of writing. The glyphs had to be the right size
and shape. If you were about to finish the line, you could squish
certain glyphs just a little to make room for the next glyph. If you
did not plan well, you would leave the line hanging, a word unfinished,
a sentence incomplete.
Then Shakh started to learn hieratic - shorthand glyphs for less formal
situations.
It was all very complicated and cumbersome. Shakh did not like the
technical nature of writing. So much to learn and still so far from
writing clear, interesting results. His seasons in training went very
slowly. The Nile rose then fell then rose again.
- - - - - - - - - - - - - -
A great deal of dull information must be comprehended, absorbed,
internalized. Nothing spectacular. Nothing of particular interest. Just
a mass of rules and guidelines to help you move within the world of
information.
In the third year of medical school the aspiring doctor begins to
memorize a vast linked-array of drugs, symptoms and afflictions. The
next three years are spent developing this mental array; refining,
building, adding experience, so that one day a doctor may look at a
symptom, think of possible afflictions or drug reactions, then
prescribe drugs or call for further tests. The whole process of
learning this array is intensely dull.
In the first part of this FAQ we explained in detail how an information
search involves first selecting a suitable format (book, webpage, news,
interview ...) then searching a few important tools that help us find
information in that format. The first format we will look at is the
humble book.
Books
Links and forms at http://spireproject.com/books.htm
Shakh arrived in Edfu on a small boat in the company of his father. It
was a short walk from the dock to the Edfu temple complex. A fantastic
sight. A noble sight. The temple included a vast library of books and
manuscripts - a warehouse of knowledge about Egypt.
Not that there were many manuscripts in total. The time and expense it
took to create even a single copy made the library a prohibitive
expense open to only those in certain need. This was not a public
library, but an elitist library, open only to those who could justify
the gifts required to enter. There it was, open before them, long
shelves of scrolls arranged by rough topic. Amazing indeed. Shakh
shivered slightly in the cool air. This would be his life for the next
few years.
- - - - - - - - - - - - - -
Books have such meaning to us as a society. We have a vibrant emotional
connection. Books exude a solid proof of value to a larger community.
They are important resources but the additional awe is amazing to
behold. Try ripping a chapter from a book you own in public. The stares
and discomfort are almost tangible. Some book-lovers get upset about
slight creases in books, treating books as if they were important
museum quality manuscripts - something to hold with awe and treat
gently.
Being a book writer is similarly impressive. It is a mark of an expert.
A knowledgeable expert. A knowledgeable expert we should listen to,
should pay money for the chance to listen to, should pay, listen and
carefully not crease their work.
This attitude is silly.
A book is a package of information, prepared along certain guidelines,
with a purpose. In research we look for books on a topic that may help
us answer a question. These books tend to be large, lengthy, detailed,
verbose, heavy. Books are not good at describing cutting edge
developments. They generally summarize popular consensus. They avoid
criticism. When searching, they can make horrible resources.
Books are also large and physical creations. They must be stored. They
stick around. They have a limited shelf life but libraries are forever
over-stocked with dated publications of limited use and value. They are
also long - troublesome things to read.
Books come in different flavors. There are the books by industry
insiders who tell the truth, rip the facade off a particular
industry. Such books make brilliant resources. There are also books by
journalists, prepared without insider knowledge, more of a novel of a
newsworthy situation. Such books tend to be verbose, circumstantial,
light on facts.
Certain questions simply beg to be answered by reading a book. Such
questions are usually general, introductory, timeless. For such
questions a stack of news articles would lack cohesion. A collection of
articles would be too precise, not give you the larger picture. Such
questions need the 100 pages of description, pictures and the
considered framework that books embody.
Finding a Book
As an information format, there are certain tools and resources you
need to be aware of to effectively search for books. Thankfully, many
of these tools have emerged on the internet. These include:
- A database of the free books on the internet from projects like the
Online Book Initiative and Project Gutenberg. Includes many
copyright-free classics (but not ebooks - a different concept).
- Three government publication databases for the US, UK and Australia.
The US and Australian databases are comprehensive. The UK database is
incomplete. The complete database is commercially available.
- The book databases of large online bookstores are incomplete but
useful as a fast search of current books. Some include background
information. I use Barnes & Noble, Amazon, Borders and the UK Internet
Bookshop (of the WHSmith bookstore chain).
- The largest libraries of the world, like the US Library of Congress
and British Library hold more than 20 million publications stretching
back many years. The online book catalogues are not good for the latest
books, but are brilliant at earlier works.
- Local libraries and state libraries are noteworthy, as finding a book
in their database also means you have found access to that book.
- The definitive resource is the collection of national Books-in-Print
databases like [US] Books in Print, Australian Books in Print, French
Books in Print... These databases are commercially available online, as
print directories (yuck) in libraries, and are often publicly available
to search at good bookstores.
Book Databases
Information about new books is organized in a collection of national
"Books in Print" databases. This information is publisher-verified,
includes forthcoming titles, and is naturally updated far faster than
the library and bookstore catalogues.
Books in Print, produced by Bowker, delivers publisher-verified
information on US books. British Books in Print, produced by Whitaker
& Sons, delivers publisher-verified information on UK books. Further
national book indexes include Australian Books in Print (Thorpe),
Canadian Books in Print (University of Toronto Press), Les Livres
Disponibles/French Books in Print (Electre), Italian Books in Print,
German Books in Print and others.
All these directories are available as print directories (not
particularly convenient), as commercial databases (through database
retailers), by subscription (bookstores frequently subscribe) or
through Global Books in Print (though not really global, it is a group
of book databases).
With regards to the print versions, there may be recent editions in
your state library but don't bother. The directory is not user-friendly
as you must page through each month's subject categories. A more
convenient alternative access point is your favorite large bookstore.
For about Au$4500/year, many bookstores subscribe to Global Books in
Print on CD-ROMs, or a national 'books in print' database. There should
be no cost for searching, but ask for the date and the database name so
you have a clearer idea of what is being searched.
Further Book Resources
Book Reviews are a viable tool in a book search. The tools mentioned
above will give you very little information indeed - mainly title,
author, format and price. You will usually want more than this before
you buy a book.
Book reviews are published in a range of book-related journals and
newspapers. These are compiled into commercial databases of book
reviews, like the Book Review Digest by H.W.Wilson or Book Review Index
by Gale Research, or individual book reviews from the likes of the New
York Review of Books (http://www.nybooks.com/nyrev/). A state library
may provide access to the Book Review Digest Database.
Online book reviews are further discussed in Locating Book Reviews
(http://www.lib.monash.edu.au/hss/guides/fsreview.htm) by Monash
University Library.
Barnes & Noble, and to a lesser degree Amazon, have additional
information in their book database. Since it is free, it makes for a
fine immediate alternative to searching book reviews.
Future developments in book-related discussion groups hold out more
promise in harnessing the opinions of a book-reading public. Quality
issues remain (and the anonymous musings listed in Amazon.com and
Barnes & Noble
There are also book finding services with specialty book databases -
like a database of second-hand books. Books on Demand is a directory of
out-of-print books available for reprinting (and includes price and
order information.)
Strategy
Obviously title searches are not effective tools to discover new books.
Not all books on Vincent Van Gogh include Vincent in the title. Subject
searches work well only if you can grasp the indexing.
Apply these effective search techniques:
1) Browse the subject listing and select the subjects which interest
you.
2) Read the subject listings off a book you know interests you - then
search for other books in those subjects.
3) Search for other publications from suggestive authors (especially
when the author is an association).
Library catalogues, like LOCIS, can illustrate these techniques. Let's
say a title or subject search lands you with one of the books listed in
LOCIS. This catalogue lists the applicable subject titles. Looking at
books placed in the same subject category works well.
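
The second technique (read the subject listings off a book you know,
then search those subjects) can be sketched in a few lines of Python.
This is only an illustration: the catalogue records, titles and subject
headings below are invented, and the function name related_by_subject
is hypothetical. A real catalogue like LOCIS would be searched through
its own interface.

```python
from collections import Counter

# Hypothetical catalogue records. Titles and subject headings are
# invented for illustration and are not taken from any real catalogue.
CATALOGUE = [
    {"title": "Vincent: Letters to a Brother",
     "subjects": {"Gogh, Vincent van", "Painters--Netherlands"}},
    {"title": "Sunflowers and Starlight",
     "subjects": {"Gogh, Vincent van", "Post-impressionism"}},
    {"title": "Dutch Painters of the 1880s",
     "subjects": {"Painters--Netherlands"}},
    {"title": "Gardening in Clay Soil",
     "subjects": {"Gardening"}},
]


def related_by_subject(known_title, catalogue):
    """Return other titles that share a subject heading with the known
    book, ranked by the number of headings they share."""
    known = next(rec for rec in catalogue if rec["title"] == known_title)
    shared = Counter()
    for rec in catalogue:
        if rec["title"] == known_title:
            continue
        overlap = len(known["subjects"] & rec["subjects"])
        if overlap:
            shared[rec["title"]] = overlap
    return [title for title, _ in shared.most_common()]


if __name__ == "__main__":
    for title in related_by_subject("Vincent: Letters to a Brother", CATALOGUE):
        print(title)
```

Note that one of the related books in this made-up catalogue has no
"Vincent" in its title at all, which is exactly the case a title search
would miss.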
A word about Book Types. Just as internet information comes in
different qualities and formats, books also come in different styles
and flavours. Books written by industry insiders are characterized by
personal stories and expert wisdom from an author telling all the
secrets. These books are worth looking for, and the short bio may give
a clue. Books written by journalists have a different flavour, slightly
more newsy and less factual than, let's say, government books (far more
factual than most) and frequently updated books (far more current than
most). Try to find the style of book suited to your needs.
Information Theory
The book industry has reached a kind of plateau where fairly definitive
databases exist for listing books. There are databases for government
books, out-of-print books, second-hand books, current books. The
internet has changed some elements of this mix, as business models try
to support moving existing databases to free access, and others use
this change to try to present more definitive databases. Book reviews
have never properly been used by the book industry, so the big change
appears to be a move from book titles (as in most book databases and
library catalogues) to rich information (like Barnes & Noble) which
includes reviews and readers' comments.
___________________________________________________
The Article
links and more at http://spireproject.com/article.htm
Articles hold a definitive value, a statement of quality and currency.
Sometimes articles are long, unique and informative works. Sometimes
articles are short, simple, trite; a rehash of common knowledge. There
is a range of ways to access articles - though none are particularly
inexpensive. We also have difficulties paying copyright - so most paid
research assistance is restricted to certain, more expensive tools. In
all, articles are cumbersome and time-consuming to work with. They can
also be brilliantly rewarding.
There are three difficulties with article searches:
1_ Finding the articles which interest us.
2_ Getting our hands on a copy. (Many articles you locate may be
impractical to access in person while electronic access can be
expensive.)
3_ Copyright permission (which can be simple or exceedingly
expensive).
Of course, the mainstay of article research is photocopying an article
directly from a journal. Find a library nearby which holds the journal
then read or photocopy it then and there. This process can be improved
by using the online library catalogues (to see if they hold the
journal) and by searching a database of library holdings (often
available for free by asking or calling a librarian at your state
library). As you could expect, some commercial businesses will
undertake this work on your behalf, for a fee.
The difficulty with this process, of course, is that it does not help you
discover what articles will interest you - this only works if you have
a useful bibliography to work from.
In recent years, a concerted effort has been made to bring you full
text articles electronically. Commercial databases in general have
moved from being strictly bibliographic to carrying many full text articles. A
system of full text articles on CD-ROM has a brilliant future. Up to
500 journals are updated frequently in this inexpensive format. (Most
Research Libraries have this station.)
Some of the commercial full text databases have emerged online too.
Northern Light presents this. Unfortunately, the better quality
articles are not included in these databases. It is not an absolute
rule but to date, many of these commercial databases are filled with
regional business papers, newspapers or similar middle to low quality
publications.
There is another system for accessing articles, which comes to us from
a very long time ago. Inter-library loans are a system worked out
between libraries so articles can be exchanged between libraries.
Naturally you need the assistance of a library - and a great deal of
patience. Such requests can take over a month to arrive.
Lastly, there is always the option of direct purchase of periodicals
from the publisher.
Commercial Services
Carl Uncover service (faxback articles).
CARL (http://www.carl.org), one of the great library groups in North
America, established a service to provide articles by post or fax. Carl
promises to fax articles provided you use their system to check that one
of their many libraries has the required document.
Northern Light - online database of articles
Northern Light (http://www.nlsearch.com) is a search engine of both the
web and their own database of articles available for purchase. The
rates are cheaper than Carl (up to $4.00 per downloaded document) and
the articles are delivered over the internet (not faxed) but the range
is smaller.
Information Theory
Many of the databases will begin to offer their services either as a
pay-per-view, or through reasonable direct subscription methods on the
internet. This has been predicted for years but depends on the
emergence of a fine way to purchase cheap items on the internet:
digital money. No effective digital money has emerged yet, and most
databases will either wait, or try one of the existing incomplete
methods. Essentially, critical mass has not yet arrived, and it now
appears that the true fall in price of information is waiting on an
effective digital money. In preparation, magazines and newspapers are
purchasing all the rights possible - especially the electronic rights.
More appears on this topic later.
___________________________________________________
Webpages
Links and forms at http://spireproject.com/webpage.htm
Webpages are often of unknown age, of only guessed at quality and
potentially the easiest information to retrieve. There are many points
of entry to web resources, but search tools differ. Try to match your
search tool to your question. To start, you will need to learn
something of the different tools - described below - and four basic
search techniques: Boolean, Proximity, Field Searches & Truncation.
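
For readers who think in code, here is a minimal sketch of what the
four techniques do, run against three made-up pages. Nothing here is
how any particular search engine works internally. The page records and
the function names (boolean, near, field, truncation) are assumptions
for illustration, and the sketch ignores capitals, which some engines
do not.

```python
import re

# Hypothetical pages used only to demonstrate the four techniques.
PAGES = [
    {"title": "Spire Project", "url": "http://spireproject.com",
     "text": "Guides to information research and searching"},
    {"title": "Library News", "url": "http://example.edu/news",
     "text": "The library opened a research room for patent searches"},
    {"title": "Garden Notes", "url": "http://example.com/garden",
     "text": "Research on roses is rarely urgent"},
]


def words(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean(page, require=(), exclude=()):
    """Boolean: every required word present, no excluded word present."""
    found = set(words(page["text"]))
    return all(w in found for w in require) and not any(w in found for w in exclude)


def near(page, a, b, distance=10):
    """Proximity: words a and b occur within `distance` words of each other."""
    seq = words(page["text"])
    pos_a = [i for i, w in enumerate(seq) if w == a]
    pos_b = [i for i, w in enumerate(seq) if w == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)


def field(page, name, term):
    """Field search: look for the term in one named field only."""
    return term.lower() in page[name].lower()


def truncation(page, stem):
    """Truncation: match any word that starts with the stem (stem*)."""
    return any(w.startswith(stem) for w in words(page["text"]))


if __name__ == "__main__":
    print([p["title"] for p in PAGES if boolean(p, ("research",), ("roses",))])
    print([p["title"] for p in PAGES if near(p, "library", "research", 3)])
    print([p["title"] for p in PAGES if field(p, "url", ".edu")])
    print([p["title"] for p in PAGES if truncation(p, "search")])
```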
Global Search Engines
Altavista (http://altavista.com) includes a very large, fast search
engine. It allows for Basic Boolean (AND +, NOT -, OR |), Proximity
(" " and ~ for near, within 10 words of each other), Several Fields
(title:"Spire Project" domain:gov url:edu link:cn.net.au) and
Truncation/Wildcard (*). Of import, capitals matter with Altavista.
All-the-Web (http://www.alltheweb.com) is important because it is large
- really large - with a flexible search facility. Allows Partial
Boolean (+ -), Simple Proximity (" ") and Several Fields: a title field
search (normal.title:spire), a url field (url.all:.au), and link text
and link url fields (normal.atext:spire link.all:cn.net.au). All-the-Web
is not case sensitive. The same database supporting All-the-Web supports
Lycos.
Inktomi (via http://hotbot.lycos.com) provides its substantial web
directory through other companies, in this case, HotBot. It also allows
searches by region, by date, and more.
Debriefing (http://www.debriefing.com) is our meta-search engine of
choice. Use this to find names & named websites. Accepts Partial
Boolean (+ -) and Simple Proximity (" "). Capitals matter.
Google (http://www.google.com/) is a new style of search engine which
ranks sites with more care and concern. This works well for sites you
know a little about in advance. Unfortunately, it has no useful field
searches. Allows Partial Boolean (+ -) and Simple Proximity (" ").
Unfortunately, there is no truncation, not even for plurals!
When searching for a topic with precise descriptive terms, use a broad
search engine. Always place the Boolean + symbol before each search
word (like this: +word1 +word2) to insist all words appear in the
results. Quotes keep words together ("word1 word2"). These two simple
steps dramatically improve results. Keep adding words and search limits
until the number of hits is reasonable.
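
The two steps above are mechanical enough to script. The small helper
below is a sketch only: build_query is a hypothetical name, and the
minus sign for excluded words follows the Partial Boolean notation
listed earlier. Whether a + is also needed in front of a quoted phrase
depends on the engine, so the sketch leaves phrases with quotes alone.

```python
def build_query(words=(), phrases=(), exclude=()):
    """Build a search string: +word for required words, "a phrase" for
    words that must stay together, -word for words to leave out."""
    parts = [f"+{w}" for w in words]
    parts += [f'"{p}"' for p in phrases]
    parts += [f"-{w}" for w in exclude]
    return " ".join(parts)


if __name__ == "__main__":
    # Add words and limits one at a time until the hit count is reasonable.
    print(build_query(["spire"]))
    print(build_query(["spire", "research"], ["information literacy"]))
    print(build_query(["spire", "research"], ["information literacy"], ["seminar"]))
```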
For more global search engines, there are numerous lists to consider
like the W3 Search Engines page at the University of Geneva
(http://cui.unige.ch/meta-index.html#INF) and the Industry Research
Desk (http://www.rbbi.com/links/sengine.htm).
Meta-Search Engines & Google
If you know something of the destination already, like a title or
company name or full name, try using a search tool that excels in
finding named websites. There should be little difficulty in finding
such sites with either Google or a Meta-Search engine, but don't get
excited and use these on other occasions.
Categorized Lists
When searching for information that lends itself to a particular
category or topic, start with resources which group information in
categories. With few exceptions, these resources index websites, not
webpages. Also, keep your search words simple as these are small
databases.
Yahoo (http://yahoo.com) is the largest of this type of directory tree;
the definitive site. Accepts Partial Boolean (+ -), Simple Proximity
(" "), Truncation (*) and Several Fields: t: (for titles), u: (for urls)
and a date field through a form.
The Open Directory Project (http://dmoz.org) is a Netscape effort to,
presumably, mute the strength of Yahoo. It is very good, and very
similar to Yahoo.
Looksmart (http://www.looksmart.com) is another significant directory.
For an alternative, try the World Wide Web Virtual Library: Subject
Catalogue (http://vlib.org/Overview.html), a distributed network of
subject lists, not nearly as dominant as Yahoo, but far more
"scholarly" shall we say. This virtual directory has been around many
years, previously famous from www.w3.org.
Reviewed Sites
When seeking specific fields of study, when topics are clouded with
many similar, low quality sites, start with resources with a greater
degree of personal attention. Peer review and vetting produce resources
with more quality but limited coverage, better suited to this
situation. Also, keep your search words simple.
The Scout Report (http://wwwscout.cs.wisc.edu) is one of the oldest and
most highly regarded e-newsletters introducing new internet resources.
Residing at the University of Wisconsin, the Scout Report describes
research, education & topical sites. The Scout Report Signpost provides
a quick search of previously featured sites.
BUBL (http://www.bubl.ac.uk) is a British site which reviews internet
resources then indexes by Dewey decimal number. I prefer their Dewey
presentation but the collection is not large (though the largest of the
library projects I have seen).
The Argus Clearinghouse (http://www.clearinghouse.net) is a vast
collection of internet guidebooks. We can search the titles &
descriptions, but then click on the highlighted keywords to find
related guides. I suspect Argus is not successfully keeping pace with
internet development.
AlphaSearch (http://www.calvin.edu/library/searreso/internet/as/) is
similar to Argus. This one indexes important nexus sites and should be
browsed.
The Britannica.com (as in Encyclopedia Britannica
http://www.britannica.com) has been remolded as a free guide to books,
periodicals, web and their encyclopedia. This encyclopedia is perhaps
the best.
FAQs can be searched from an FAQ database like the one at
http://www.faqs.org
WebRings list sites by topic. Each webring is maintained by a volunteer
at an uninvolved site using standard software. The primary sites are
currently Webring.com and bomis.com
Specialty Tools
For issues with a particular government, url or language origin,
consider using tools designed with this in mind.
* Altavista can be limited to specific domains (gov edu au) with their
"domain:domainname" field search. "url:url-segment" is also useful.
Read the Altavista Fancy Features for Typical Searches.
* GovBot (http://ciir2.cs.umass.edu/Govbot/) as developed by The Center
for Intelligent Information Retrieval (CIIR) is a search engine which
indexes exclusively a great number of government webpages, a unique
resource.
* Altavista also allows for a field search by language. Searching for a
Japanese site? Consider searching only webpages in Japanese.
* Purely regional search engines may also be the answer. Aussie.com.au,
for example, is a search engine indexing only Australian websites.
There are fine lists of regional search engines and directories like
SearchEngineCollossus, Search Engines WorldWide, SearchEngineWatch and
Yahoo.
* Topic-specific search engines, a new arrival, have a very promising
future. Ideally you will find a search engine like ChemGuide
(http://www.fiz-chemie.de/en/datenbanken/chemguide/) covering over a
million chemistry related pages. Search Engine Guide
(http://searchengineguide.com) and Gary Price's Direct Search
(gwis2.circ.gwu.edu/~gprice/direct.htm) list topical search engines.
* Lastly, there are some commercial databases aimed at the software and
internet industries. Consider OCLC's NetFirst (articles from magazines
describing the internet).
Conclusion
For many of us, searching the web is simply typing words into a search
engine. I hope I have shown there is more to it than this. What may not
be clearly evident from a brief overview of resources is that each
resource has a particular difference, a particular focus, a particular
angle that helps us answer certain questions faster than other tools
and searches.
Yes, if in the simple world of Yahoo and Altavista you pay no attention
to the specific differences between alternatives, you are left with the
worst of these two tools. Your results are general, timeless and
imprecise.
Contrary to myth, global search engines are not the best place to start
most of the time - just some of the time. On other occasions, start
with a directory, a meta-search engine, a guide, an FAQ... We should be
able to identify which tools excel at locating what kinds of webpages.
(There is no simple search of everything.)
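
One way to hold this in your head is as a lookup table from the kind of
question to the kind of tool. The sketch below simply restates the
advice given in the sections above. The category keys and the function
name where_to_start are labels invented for this illustration.

```python
# Starting points as described earlier in this section. The keys are
# hypothetical labels chosen for this sketch.
STARTING_POINTS = {
    "precise_terms": ("global search engine",
                      ["Altavista", "All-the-Web"]),
    "named_site": ("meta-search engine or Google",
                   ["Debriefing", "Google"]),
    "broad_category": ("categorized list",
                       ["Yahoo", "Open Directory Project", "Looksmart"]),
    "cluttered_field": ("reviewed sites",
                        ["Scout Report", "BUBL", "Argus Clearinghouse"]),
    "government_or_region": ("specialty tool",
                             ["GovBot", "regional search engines"]),
}


def where_to_start(kind):
    """Return (tool type, example tools) for a kind of question.
    Raises KeyError for a kind that is not in the table."""
    return STARTING_POINTS[kind]


if __name__ == "__main__":
    tool, examples = where_to_start("named_site")
    print(tool, "-", ", ".join(examples))
```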
There are more insights into effective internet research. Information
clumps; Information is not established in isolation but instead
develops in context, is reinforced, and becomes a trend. The publishing
motivation & promotion purpose can help us rapidly judge the content of
a website. The webpage address can tell us a great deal about both the
website structure and the type of publisher.
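
As a rough illustration of reading a webpage address, the Python sketch
below pulls apart a URL and makes a guess at the publisher. The mapping
from domain label to publisher type, and the treatment of a ~ in the
path as a sign of a personal page, are assumptions made for this
sketch, not firm rules. The function name describe_address is
hypothetical.

```python
from urllib.parse import urlparse

# Rough guesses only: this mapping is an assumption for illustration.
PUBLISHER_HINTS = {
    "gov": "government",
    "edu": "educational institution",
    "ac": "academic institution",
    "org": "association or non-profit",
    "com": "commercial",
    "co": "commercial",
}


def describe_address(url):
    """Summarize what a webpage address suggests about its publisher
    and about where the page sits within the website."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    publisher = "unknown"
    # Only inspect the last two labels of the host name (e.g. "ac", "uk").
    for label in reversed(host.split(".")[-2:]):
        if label in PUBLISHER_HINTS:
            publisher = PUBLISHER_HINTS[label]
            break
    path_parts = [p for p in parsed.path.split("/") if p]
    return {
        "host": host,
        "publisher": publisher,
        "depth": len(path_parts),  # how far below the front page
        "personal_page": any(p.startswith("~") for p in path_parts),
    }


if __name__ == "__main__":
    print(describe_address("http://www.bubl.ac.uk"))
    print(describe_address("http://gwis2.circ.gwu.edu/~gprice/direct.htm"))
```

A deep path under a personal directory on a university server suggests
an individual's project, while a short address on a commercial domain
suggests a corporate front page.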
Once skilled, you can segment and search the most promising areas of
the web quickly and efficiently. If you do not quickly find your
answers there may be other, more appropriate resources. Consider asking
for help in an appropriate discussion group, or reviewing printed
literature instead. The Web is only one resource among many.
If your primary interest is Search Engines, consider reading A Higher
Signal - To - Noise Ratio
(http://www.dpi.state.wi.us/dpi/dlcl/lbstat/search1.html) by Bob Bocher
& Kay Ihlenfeldt, Sink or Swim: Internet Search Tools & Techniques
(http://www.lboro.ac.uk/info/training/finding/sink.htm) by Ross Tyner
and The Search is Over
(http://www.zdnet.com/pccomp/features/fea1096/sub2.html) by Adam Page.
For even more, read Searching the Internet
(http://wwwscout.cs.wisc.edu/toolkit/searching/) a publication in the
Scout Toolkit and browse Search Engine Watch.
Strategy
Searching the web is more skill than most of us acknowledge. The web is
a manifestation of the demon professional researchers work with all
the time in the commercial information market. There is constantly the
fear you have missed that single important site with everything.
Consider the researcher's motto:
Someone, somewhere, probably knows the answer.
But how long do we search for gems, and where do we look? To decide, we
must learn about internet structure and organization. Why is
information published on the web? Why is it promoted? Let's review the
reasoning behind effective internet research. There is so much more
than putting words into search engines.
#1 Motivation
We can make some very astute generalizations about a webpage very
quickly if we can judge the reason it was published. Not only is this
an important step in analyzing any information, but this tells us a
great deal about the contents of the webpage.
Yes, merely determining a site belongs to an association actually
specifies the quality, motivation and type of information we will find.
Associations either publish what is termed 'brochureware' (promotional
material), or if well advanced, present research work previously
restricted to the association library: important research studies & the
like. Commercial interests have much more difficulty delivering useful
resources. The importance of projecting a corporate image comes first
(lots of 'brochureware'), and service descriptions come second. On
occasion, commercial interests will support a worthwhile service tied
closely to their own service - thus banks present interest rates -
bookstores present their book database.
The certainty with which we can make these judgments will astound you.
Corporate websites never publish "changes to patent law". They simply
don't have the motivation. Only an individual would publish this, most
likely not on the web but through a mailing list.
Information is not distributed randomly. Consider Format, Preparation,
Motivation and Promotion. Consider this, then Visualize the information
you seek.
#2 Promotion
We can make further snap judgments about web information from the way
you get there. Promotion is very difficult on the web, and it is hard
to find poorly promoted information. The tools you use to reach
information pre-determine the type and quality of information you will
find.
Search engines index webpages indiscriminately. Advertised websites
must have a pay-off. Directories focus on established websites (not
webpages). Link pages also link to established websites but put more
thought into the selection of resources. Both usually focus on general
sites. For specific or current resources, we need to move to mailing
lists or active nexus points.
Yes, when we find a webpage through the Scout Report (a prominent
resource discovery newsletter), we can assume the webpage has a high
quality of information, is reasonably current and has a general appeal
(within the interest of the newsletter readers).
Let's put this in reverse. If we are looking for a recent document by a
prominent library committee, we will not find it through Altavista,
Yahoo, or normal link pages (except accidentally). We may find it
through specialist newsletters, active nexus points, or through mailing
lists.
#3 Visualize
When an artist begins to paint, they visualize the image. They already
have a concept of the finished result. Internet research is no
different. We start by building a vision of the information we seek.
Who would publish it? What is their motivation? Who would promote it?
Where would I find it?
Information Clumps. Information is created, nurtured, develops, gets
transplanted, gets arranged and becomes visible through a process which
brings similar information together. Your understanding of this
process, including motivation and promotion, must guide your search of
the web. Only then will we know where to look, and quickly know if the
answers are on the web.
___________________________________________________
News
links and more at http://spireproject.com/newswire.htm
Shakh was invited to travel with the army on the conquest of Nubia. The
Egyptian army was not in need of further soldiers but there was a need
for a witness. Shakh would write the official chronicles of the army's
exploits. He would be expected to send a simple diary on papyrus back
to the palace and then to compose numerous descriptions for memorial
walls. He may also be consulted for paintings on the Pharaoh's tomb. It
was a fine offer, and he relished the prospect of increasing his
value and exposure.
The war was not swift, nor was it entirely one-sided. In the end,
superior numbers had their effect and Nubia was once again reunited
with Greater Egypt. Reporting was initially a challenge, since very
little happened from day to day. Slowly, Shakh got a handle on the
process and focussed on the grandness of the venture. Two years after
floating upstream, Shakh was able to do his finest work, the parade of
captured soldiers past the Pharaoh's representative.
- - - - - - - - - - - - - -
News articles are typically light and biased. Do not believe a news
item is a great critical analysis of current events. Most news is
produced under time restrictions, for prompt consumption. In research,
news often proves particularly useful for locating information about
individuals or businesses. News is also critical in creating a timeline
of events, in recording events of regional/national/international
importance.
News prepared by individual reporters is collected together by large
news organizations, then delivered to other news organizations around
the world. Your local news organization does not have a reporter in
Iran, but rather buys the story off a newswire, then packages it in
your evening news hour or morning newspaper.
You have probably heard of: United Press International (UPI), Reuters
Global News, Agence France Presse, Associated Press and Xinhua Chinese
Newswire. These very large organizations make their information
available to you in a variety of ways. News collects in commercial
databases of past news: some single-source, others large multi-source
databases. Current news is also packaged into large multi-source
systems delivered by email or newsgroups. Many newswires are available
online free of charge.
Free News
Critical to the changes on the internet is the emergence of free access
to text news. Individual newspapers present news free. Newswires
present news free. News sections of larger sites like Yahoo present
news from many sources, free. News-only search engines will help you
find information from a great many sites with news.
The process of finding current news is about as slick as imaginable.
Here are a few players in the market:
* Yahoo News (www.yahoo.com/headlines/) is leading this field with web
delivery of current news from Reuters, Associated Press, and others.
Yahoo also includes a free search for one week's news.
* Voice of America Newswire (VoA and now voanews.com) delivers news in
English & many other languages.
* The Washington Post (www.washingtonpost.com) offers their own current
news for searching, as well as the Associated Press wire, each searched
separately for the past week.
* Fox News (www.foxnews.com) presents current news online (both current
events and sport news). CNN news (www.cnn.com) is another searchable
site. Both repackage some newswires and present them online. C|news
(www.news.com) does this too.
* Newsbytes (www.newsbytes.com) is a newswire devoted solely to
computing topics: computers, telecom and the online world. InternetWire
and other specialty newswires also present news from their websites.
* United Nations Radio: The World in Review is one of many news shows
with the transcripts online. Unusually, the Vatican's newswire is not
free online.
* Obviously many more exist - and thankfully we don't need to create a
list or manage the sources. The Spire Project has a clickable map of
English language newspapers. There are definitive lists of global
newspapers like Gary Price's
http://gwis2.circ.gwu.edu/~gprice/newscenter.htm#International,
http://dailyearth.com and http://ipl.org/reading/news/
Commercial Resources
The commercial segment of the news market is obviously being squeezed
by the copious quantities of free news online. There are, however,
still some viable markets, principally enterprise solutions (companies
are willing to pay for slight improvements), past database access, and
surprisingly the Wall Street Journal (US$49/yr).
Serving these markets we have Clarinet and Newspage. World News
Connection is a US Government service presenting translated news (quite
a gem) as a searchable database. Unusually, prices start at
US$25/7days - yes, one price for the news!
Of course, news alerts can be arranged from the commercial news
databases through the database retailers. Newswires like Agence France
Newswire, Canada Newswire, Xinhua News and Associated Press are each
unique databases, and all stretch back many years. Further databases
like Newswire ASAP and what used to be Global Textline are massive
databases of multiple newswires and newspapers. I recall at one stage
Textline had over 4 billion pages.
Conclusion
News articles are typically light and biased. The sheer quantity of
news in the large news databases makes this a useful resource to fall
back on for any tightly focused research topic. I once discovered an
obscure scientist working in a unique field from a small 3 paragraph
article in a local farmer's newspaper in England (Global Textline
Database).
Newswires and News Databases are just two elements of a large industry
which extends to your local newspaper and to further specialty
databases. Most newspapers maintain their own local news database, and
some make this available electronically. A manual clipping service may
also be an option - certain firms manually page through local papers
looking for advertisements or articles.
While on the topic, certain newswires like Business Wire and PR
Newswire essentially distribute certain types of news for money. Yes,
anything in these newswires is there because the company paid for it to
be there - $500 and up most likely. Other newswires earn money in the
reverse process: from the media who read or publish their work.
Associated Press or Reuters are created from news organizations. Others
like Voice of America (VOA) are alternatively funded, but with
reasonable reliability.
There is also a range of focused newswires such as Newsbytes (computer
issues), PR Newswire (product releases), and Middle Eastern newswires.
Further newswires can be found at Yahoo.
Strategy
I can think of four ways to use this information for research:
1) As an alternative to your evening news or morning newspaper. Online
news is available 24 hours a day, in more detail, from respected news
organizations.
2) Search past news to locate information unlikely to emerge in
journals or magazines. News includes a great deal of local detail and
personal information unlikely to be found elsewhere.
3) As a historical record of events, perhaps the basis of a timeline.
4) Current Awareness and Alerts so articles come to you as they are
reported. News stories by email will become a large industry over the
next two years.
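The third use, building a timeline, is mostly a matter of sorting
clippings by date and folding together the same story carried by several
sources. Here is a minimal sketch in Python. The clippings, sources and
headlines are invented for illustration, and the function name is my own
rather than part of any news database.

```python
from datetime import date

# Hypothetical clippings (date, source, headline), invented for illustration.
clippings = [
    (date(1997, 3, 14), "Local paper", "Company announces new plant"),
    (date(1996, 11, 2), "Newswire", "Company founder steps down"),
    (date(1997, 3, 14), "Newswire", "Company announces new plant"),
    (date(1997, 8, 30), "Newswire", "Plant opening delayed"),
]

def build_timeline(items):
    """Sort clippings by date and merge repeats of one story on one day."""
    merged = {}
    for day, source, headline in sorted(items):
        # Same day and same headline counts as one event with several sources.
        merged.setdefault((day, headline), []).append(source)
    return [(day, headline, sources)
            for (day, headline), sources in merged.items()]

timeline = build_timeline(clippings)
for day, headline, sources in timeline:
    print(day.isoformat(), headline, "(" + ", ".join(sources) + ")")
```

Matching on the exact headline is the crude part of this sketch. Real
newswire stories are often retitled by each paper, so in practice you
would still need to merge some entries by eye.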
Information Theory
Just how inexpensive can news become? US$25 gets you access to past
translated news! VoaNews.com keeps a searchable directory back a month
for free. Many newspapers still have extensive archives of news, though
they hope to one day charge for them. In a way, no-one is making money
from news. It is only worth the advertising revenue for distracting you
from reading the news - and that is falling too. With the freedom of
moving information through the internet, several free services will
send you email when a news article matches your interests (an Alert).
The future will see much more "compile your own" newspaper - especially
since it could conceivably be compiled at minimal to no expense
depending on the technology (frames anyone?) An intriguing lawsuit
recently stopped TotalNews (a news only search engine) from displaying
news articles in a frame.
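An Alert, as described above, is at heart a stored list of interests
checked against each new article. The following Python sketch shows the
idea under simple assumptions: an interest is a set of keywords that must
all appear in the article, and the headlines and interest names are
invented. I do not know how any particular alert service does its
matching, so read this as an illustration only.

```python
import re

def tokenize(text):
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def alert_matches(article, interests):
    """Return the names of interests whose keywords all occur in the article."""
    words = tokenize(article)
    return [name for name, keywords in interests.items()
            if all(keyword in words for keyword in keywords)]

# Hypothetical interest profile and incoming headlines.
interests = {
    "nubia": ["nubia"],
    "patent law": ["patent", "court"],
}
incoming = [
    "Court rules on software patent dispute",
    "Wheat prices rise after dry season",
]

alerts = []
for article in incoming:
    names = alert_matches(article, interests)
    if names:
        alerts.append((article, names))

for article, names in alerts:
    print("ALERT", names, "-", article)
```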
If allowed to speculate for a moment, News-for-Pay may also become a
viable business. Perhaps this is just being cynical of journalistic
standards and the accepted standards of promotion. Perhaps it is also
recognition that Business Wire and PR Newswire are just two of several
newswires where you pay to have your news included. Obviously news
today is biased towards advertisers (through advertorials) and
promoters. Perhaps this will become automated some day - like Yahoo's
"we will look at your site right away for $200".
Naturally, the links and many of the forms to news resources discussed
here can be found at http://spireproject.com/newswire.htm and also our
All-in-one page: http://spireproject.com/spir.htm
___________________________________________________
Theses and Dissertations
links and more at http://spireproject.com/discuss.htm
Theses and dissertations are professional papers completed for higher
degrees. That is to say, they are long, dense and often very esoteric
and convoluted. Trouble is, most theses and dissertations have no more
than 12 copies ever - one always to the University Library, one with
the author, but others scatter to the wind.
All University Libraries hold a copy of past theses undertaken at their
university. This gives rise to the unfortunate but necessary pastime of
searching each local university library for relevant theses. The
advantage here is masters and occasionally honours theses are indexed.
Most often, just undertake a keyword search then add "thes*"
(truncation of theses or thesis).
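For readers who like to see the mechanics, here is a minimal sketch in
Python of what a keyword search plus a truncated term does. The
catalogue titles are invented, and every library catalogue has its own
truncation syntax and matching rules, so treat this as a picture of the
idea rather than any system's actual behaviour.

```python
import re

def term_to_regex(term):
    """Turn a search term such as 'thes*' into a whole-word regex."""
    stem = re.escape(term.rstrip("*"))
    suffix = r"\w*" if term.endswith("*") else ""
    return re.compile(r"\b" + stem + suffix + r"\b", re.IGNORECASE)

def search(records, keyword, truncated):
    """Return records matching both the keyword and the truncated term."""
    keyword_re = term_to_regex(keyword)
    truncated_re = term_to_regex(truncated)
    return [record for record in records
            if keyword_re.search(record) and truncated_re.search(record)]

# Hypothetical catalogue entries, invented for this example.
catalogue = [
    "Salinity in wheatbelt soils: a thesis for the degree of Master of Science",
    "Salinity management handbook",
    "Honours theses in geography, 1990-1995",
    "Soil salinity and crop yield (PhD thesis)",
]

hits = search(catalogue, "salinity", "thes*")
for title in hits:
    print(title)
```

Note that "thes*" on its own would also match the word "these", which is
one reason to pair the truncated term with a subject keyword.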
Electronic Theses Databases:
Dissertation Abstracts Online, produced by UMI, delivers abstracts to
most every doctoral dissertation/thesis in North America, some master's
theses and some international theses. This is the definitive site to
search, though you will need the help of your library to see more than
the abstract. Some libraries will have subscribed to Dissertations
Abstracts OnDisc - the CD-version of this database.
The [British] Index to Theses with Abstracts is a print directory by
ASLIB. This publication is also available as a database, available for
site licenses through Theses.com (www.theses.com). This source is quite
comprehensive as can be seen with the University List.
Several other national databases do exist. Here in Australia, a list of
theses was maintained from 1966 to 1991. The Gale Directory of
Databases also lists THESA, a database of French theses, and
Dissertations and Theses of the ROC (Taiwan).
The Australian Education Index (1978+), produced by ACER (Australian
Council for Educational Research), is a directory listing citations and
some abstracts to Australian work in education. Also available as a
commercial database, AEI is bundled into Austrom, a common collection
of Australian databases.
Digital Archives of Theses
In theory, some theses should be available on the internet,
particularly theses lodged electronically. There is a push for
universities to accept electronic thesis submission, and to build
digital archives of theses. The embryonic National Digital Library of
Theses and Dissertations (NDLTD - www.theses.org) is just one such
project. There is a distributed and sequential keyword search of
participating universities, though it's not particularly functional. In
theory, this is an incremental improvement to searching library
catalogues.
Conclusion
Getting a thesis can be very difficult. You will need the help of a
document delivery service through a library, and many theses will not
be available to borrow. You can also buy theses. Read Obtaining Copies of
Dissertations (http://www.library.yale.edu/ref/err/disscops.htm) by
Yale University Library for more. For an alternative look at theses,
consider Locating Theses
(http://www.lib.monash.edu.au/hss/guides/fstheses.htm) by the Monash
University Library.
A note on developments in this field: some Theses abstracts are
emerging online already. Projects like the LA Theses Database
(Landscape Architecture Theses Archive) have much promise but poor
coverage. Full-text thesis presentation also has promise, with the US
Department of Education funding a National Digital Library of Theses
and Dissertations and Virginia Tech starting to request electronic
submission of all theses.
UMI (the producers of Dissertation Abstracts Online) has backed this
move with a direct delivery service of electronic theses to US
libraries for $26, but only theses held in their digital archives are
available. Eventually, large digital Theses archives will be the norm,
but until then, very little will happen in this field.
A thesis is a tightly constrained information package, produced in the
university environment with limited appeal. For economic reasons, we
should not be surprised theses databases are incomplete. The emergence
of theses archives sounds interesting - a good use of the internet -
but does not represent a financial opportunity that could be explored
without government assistance. Consequently, this small area of the
information sphere is government grant-driven.
___________________________________________________
Patents
links and more at http://spireproject.com/discuss.htm
A patent discloses certain facts about a commercially important
invention in exchange for certain rights to exploit the invention. This
is a little simplistic, but explains why patents are factual, unique
from other research resources, and a little vague in certain specifics.
If you have never seen a patent before, see a sample US patent, an
Australian patent, and this brief description
(http://www.ipaustralia.gov.au/patents/P_home.htm).
There are three primary resources involved in patent research. Firstly,
we have the free internet resources. Secondly, we have the national
patent agency resources. Thirdly, we have the commercial patent
databases.
Free Patent Databases
The time for free patent databases has surely come, and while many
countries are only slowly moving in this direction, the movement is
inevitable.
* The US Patent and Trademark Office (USPTO) provides a US Patent
Bibliographic database at patents.uspto.gov with full use of fields,
date and abstract text searching. Choose between their Boolean search,
advanced (field) search or by US patent number. They also maintain a
fulltext [US] AIDS Patent Database and other resources.
* IBM's Patent Server is a public service providing a different
patent database of US Patent abstracts. The IBM service is similar to
but distinct from the USPTO service - certainly not less powerful.
* The Canadian Intellectual Property Office (CIPO) maintains the
Canadian Patent Fulltext Database from '89. This database is on par
with the US Patent Database, with perhaps even better searching
technology.
* The Japanese Patent Office (www.jpo-miti.go.jp) has a searchable
database of Japanese patent abstracts, including patent number, title,
inventor, company, and abstract of the patent.
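The Boolean and field searches these databases offer all work on the
same principle: each patent is a record with named fields, and a query
combines AND, OR and NOT conditions on those fields. Below is a small
Python sketch of that principle. The patent records, numbers and the
boolean_search function are entirely made up for illustration and do
not reflect the query syntax of the USPTO or any other service.

```python
def field_contains(record, field, term):
    """Case-insensitive test for a term inside one field of a record."""
    return term.lower() in record[field].lower()

def boolean_search(records, all_of=(), any_of=(), none_of=()):
    """Filter records by (field, term) pairs joined with AND, OR and NOT."""
    hits = []
    for record in records:
        if not all(field_contains(record, f, t) for f, t in all_of):
            continue  # an AND condition failed
        if any_of and not any(field_contains(record, f, t) for f, t in any_of):
            continue  # none of the OR conditions matched
        if any(field_contains(record, f, t) for f, t in none_of):
            continue  # a NOT condition matched
        hits.append(record)
    return hits

# Hypothetical records; the numbers and abstracts are invented.
patents = [
    {"number": "X-001", "title": "Solar water heater",
     "abstract": "A roof-mounted panel heats water using sunlight."},
    {"number": "X-002", "title": "Solar powered pump",
     "abstract": "A pump driven by photovoltaic cells for irrigation."},
    {"number": "X-003", "title": "Water filter",
     "abstract": "A ceramic filter removes sediment from drinking water."},
]

# title: solar AND (abstract: water OR irrigation) NOT abstract: pump
results = boolean_search(
    patents,
    all_of=[("title", "solar")],
    any_of=[("abstract", "water"), ("abstract", "irrigation")],
    none_of=[("abstract", "pump")],
)
found = [record["number"] for record in results]
print(found)
```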
Patent Authority Services
Patent libraries are an important and cost-effective patent resource.
* IP Australia (www.ipaustralia.gov.au) (formerly the Australian
Industrial Property Organisation (AIPO)) has a patent library in each
Australian state capital. Each library provides free access to the APAS
database (Australian Patent Abstract Search) and includes a complete
microfiche copy of all Australian patents and the Australian Official
Journal of Patents, Trademarks & Designs (the official Australian
patent gazette).
Most offices also hold US Patents on microfiche! Staff will help you
use the APAS database, arranged for free text searching by
International Patent Classification. A particularly useful service by
IP Australia is the delivery of copies of many foreign patents for
AU$15. You will need the patent number, country and title for this.
* The US Patent and Trademark Office (USPTO) has the Patent and
Trademark Depository Library Program (PTDLs), placing the CASSIS
database (the USPTO patent abstract database on CD-ROM) and US patents
in libraries around the US.
The US patent libraries also hold the Official Gazette of the U.S.
Patent and Trademark Office, the official US patent gazette.
Importantly, the gazette is fully online and searchable from 1995.
* The [UK] Patent Office (www.patent.gov.uk) provides for the Patents
Information Network (PIN) which hosts patent information in the UK. The
British Library is just one listed source of UK patents (further
information online) and delivers some patent services.
* The Canadian Intellectual Property Office (CIPO) (cipo.gc.ca)
produces the Canadian Patent Index (CPI). They also publish The Patent
Office Record, Canada's official patent gazette.
* There are many more national & international patent organizations
like Institut National de la Propriete Industrielle [France], World
Intellectual Property Organization (WIPO) and European Patent Office.
Thankfully there are fine lists of patent libraries and patent
websites.
Commercial Patent Services
One of the most invaluable resources in serious patent research is
access to several of the very large commercial patent databases.
* Lexis-Nexis (www.lexis-nexis.com) retails several patent databases.
Thanks to Patscan (University of British Columbia), we also have a
guide to searching patents on Lexis-Nexis.
* The Dialog Corporation (www.dialog.com) retails a collection of
patent databases including: Derwent World Patents Index, Inpadoc,
Claims/U.S. Patents and European Patents FullText.
* CASSIS is the USPTO database. For a little more information on this,
consider the Patent Guide to Using CASSIS, at the University of
Michigan.
* Derwent Scientific and Patent Information (www.derwent.co.uk) is a
prominent publisher of Patent and scientific information including
commercial databases.
* Questel-Orbit (www.questel.orbit.com) also retails patent databases.
* CAS/STN (www.cas.org) retails a collection of patent databases
including Chemical Patents Plus for U.S. Chemical patents.
In addition to the database retailers and producers, there is a lively
industry of patent services.
* The Patent Libraries will assist you with some services. IP
Australia, for example, will retrieve most full patents from other
countries for AU$15.
Conclusion
Until recently, the legal profession has had a complete monopoly on
patent work. As you can see, this need no longer be the case. Casual
researchers will find the free patent databases easy to use, and more
experienced researchers should not be dissuaded from searching the
commercial databases or patent libraries themselves. The very large
commercial databases, like Inpadoc, are particularly easy to use.
Of course, there are occasions when patent searches are critical, and
experts should be sought. Certainly legal assistance is required if you
are preparing to lodge your own patent, but patent data as a source of
information is another matter.
As an industry, patent research is still deeply entrenched in the
high-price commercial databases and database-centered services. I am
mildly surprised the emergence of free databases like the USPTO's
patent database has not led to a fall in the costs of the high-end
databases (which remain some of the most expensive publicly accessible
databases). It appears this industry, as indeed several others, has no
intent to drop the price of retail database access to a more
supportable level. I can only presume this rests on economic grounds:
patent information purchases are price insensitive.
___________________________________________________
Statistics
links and more at http://spireproject.com/stats.htm
Statistics allow us to lie with confidence. Dense and factual,
carefully interpreted statistics are also far more reliable than
personal experience. The expense of collecting meaningful statistics
limits the types of organizations involved in this work. These types of
organization also give us a very elegant way to divide the field:
#1 National Statistical Agencies,
#2 Government Agency Statistics,
#3 Commercial Statistics,
#4 Association Statistics.
Statistical Directories
Statistical Abstracts (statistical bibliographies and statistical
directories) describe sources of statistics.
Instat publishes "International Statistics Sources: subject guide to
Sources of International Comparative Statistics" but I found this less
than brilliant. A better link is Statistical Sources (by Gale
Research), a basic and very large statistical abstracts directory.
On the internet, US government statistics are well recorded in
Statistical Abstract of the United States 1999
(http://www.census.gov/stat_abstract) a 1000+ page document made
available online in pdf format by the US Census Bureau.
Statistical Venues
Many statistics appear regularly in journals, annual reports and
newspapers. Specialty libraries, particularly specialty librarians, may
be aware of additional statistics.
If an expert goes through the effort to collect statistics, you are far
more likely to locate them by undertaking an article search (looking
particularly for journal articles) and a book search. In both cases,
limit your search to only the last couple of years or you will locate
very old, dated statistics. A particularly sophisticated approach could
be to ask BusLib-l (Business Librarians' Electronic Discussion List)
since this is a mailing list of librarians. Use this resource
sparingly, and only after having exhausted other avenues.
National Statistical Agencies
Most every country in the world has a single government agency
dedicated to collecting, collating and publishing national statistics.
Statistics Canada, Australian Bureau of Statistics, The US Census
Bureau, The (UK) Office for National Statistics; we have a fine page on
national statistical agencies (http://spireproject.com/bureau.htm).
These organizations manage the census, watch the movement of money and
goods in and out of the country, and undertake a wide range of other
surveys. Finding these statistics is relatively straightforward, with
several directories on the internet.
Government Agency Statistics
Most government agencies collect reams of data on the industries they
monitor. Sometimes these statistics are published, sometimes you have
to ask for them, only rarely are they considered private or
unavailable.
Here in Western Australia, the government departments for Tourism,
Labour, Small Business and Big Business all publish top-rate statistics
free to interested parties. Our Dept of Tourism keeps a directory of
future tourism related projects.
When government statistics are bound and published, try the government
book databases. Remember MOCAT, AGIP and part of UKOP are free online.
Again, some US government statistics are well recorded in Statistical
Abstract of the United States 1999 by the US Census Bureau, online in
pdf format.
Association Statistics
Valuable statistics only come from motivated sources, and associations
are certainly motivated. Start with a list of likely associations, then
call up and either explain your needs or ask for their price list for
publications and statistics. For AU$25, the Australian Booksellers
Association publishes a brilliant analysis of the book industry.
Association statistics are financially informative, as the intended
audience is association members.
Commercial Statistics
Statistics created for sale are frequent in the financial sector but
exist in a number of further situations. Banks use more professionally
prepared market reports such as reports by the Australian economic
consultancy firm Syntec Economic Services, Guide to Growth, which
examines Australian industries financially with forecasts. IBIS
(www.ibis.com), another economic consultancy, also publishes to this
market.
Professionally prepared market reports are also emerging, with the full
text immediately available from the commercial information market. Each
database
retailer has several such databases, but often these databases are
focused globally or in a different country. Sheila Webber
(http://www.dis.strath.ac.uk/people/sheila) has a very good list of
firms which market research reports.
Conclusion
Central to the Internet Revolution is the liberation of just this kind
of information. Increasingly, we will see the publishing of such
documents on the internet, but for the few statistics currently online,
there is no effective search. You can only browse government websites.
Away from the internet, you must either contact the agencies directly
(in the hope they do collect statistics), look at the statistical
directories or seek agency statistics in other documents: books,
pamphlets, newsletters.
Once you have proceeded this far, it is wise to stop looking for
statistics, and begin again at sophisticated commentary - which is
likely to include supporting statistics or references to statistics
anyway. Seek expert guidance from others who would know of hard-to-find
statistics.
One approach to finding statistics is to reverse the process. Who would
prepare the statistic? Statistics are created in a logical manner, in a
very expected manner. Tourism statistics? - most likely undertaken by
either the government tourism authority, a tourism association or the
national statistical agency. There are few others who could even
consider preparing tourism statistics. If you can think through the
preparation process, you can usually identify who would have created
the statistic. (Internet statistics are the exception - too many
organizations are creating statistics of worth.)
Let's move on to specific fields of statistics.
National Statistical Bureau
The Spire Project has a fine html article on the National Statistical
Agencies (http://spireproject.com/bureau.htm). Australia
(www.abs.gov.au), United Kingdom (www.ons.gov.uk), Canada
(www.statcan.ca) and United States (www.census.gov) all have national
statistical agencies. Each organization collects and publishes
statistics on many facets of their respective countries. This article
should simplify your work in searching, selecting and appraising these
sources.
Each statistical agency organizes their statistics in a distinct way.
The Australian Bureau of Statistics (ABS) has an annual Catalogue of
Publications but also a search function, specialized statistical
category guides and several periodicals on new resources. The UK Office
for National Statistics (ONS) has a statistical overview, product
catalog and a search. The US Census Bureau has a collection of very
large publication catalogues, directories and periodicals. Statistics
Canada has several searches, publications and a catalogue.
The two further elements to the statistical agencies are the
statistical libraries and the unreported commercial statistics. The ABS
has a dedicated statistical library within each Australian state, and
collections of ABS documents within most public and school libraries.
While the ABS documents within libraries are limited, the ABS libraries
are very detailed with most every publication they create available for
review. This is standard throughout the world.
While publications are sold by each statistical agency, and the
publication catalogues are available online, each agency has data they
sell in other formats. CD-ROMs of popular geographical and statistical
distribution have become very popular, as have small area population
statistics. Some of these services are packaged and sold for specific
purposes, like 4-site by the ABS used in describing business locations.
Even further, statistics can be generated specific to your needs. This
might include ABS import and export statistics for specific
commodities, or specific results from any of their surveys.
Lastly, Usinfostore.com presents a collection of economic indicators as
time-series data. The statistics originate from several government
agencies, and the site is best considered a value-added service: an
intriguing, beneficial trend?
National Statistical Agencies are certainly not the only source of
statistics. They are, however, some of the easiest to access. These
agencies also have several traits that distinguish them from other
information sources.
Firstly, these agencies are legally required to disguise their
statistics to protect the identity of specific businesses and
individuals (with the exception of the Business Register). If there are
only one or two timber exporters in Western Australia, the ABS will not
give you timber exports from Western Australia. Specifics are found in
directories like Kompass, commercial databases, or insider information
(experts and articles by experts).
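To make this first trait concrete, here is a rough Python sketch of the
kind of rule involved: a total is withheld when too few businesses
contribute to it. The threshold of three contributors, the figures and
the function name are all my own assumptions for illustration. Real
agencies apply more elaborate confidentiality rules than this.

```python
from collections import defaultdict

# Assumed threshold for this sketch; each agency sets its own rules.
MIN_CONTRIBUTORS = 3

def publishable_totals(returns, min_contributors=MIN_CONTRIBUTORS):
    """Sum values per (region, commodity), withholding thinly populated cells.

    A withheld cell is reported as None rather than a number.
    """
    groups = defaultdict(list)
    for region, commodity, value in returns:
        groups[(region, commodity)].append(value)
    table = {}
    for key, values in groups.items():
        table[key] = sum(values) if len(values) >= min_contributors else None
    return table

# Hypothetical export returns, one row per business (values invented).
returns = [
    ("WA", "timber", 120), ("WA", "timber", 80),
    ("WA", "wheat", 50), ("WA", "wheat", 70),
    ("WA", "wheat", 30), ("WA", "wheat", 90),
]

table = publishable_totals(returns)
for (region, commodity), total in table.items():
    shown = "not available" if total is None else total
    print(region, commodity, shown)
```

With only two timber exporters, the timber total is withheld because
either business could subtract its own figure and learn the other's.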
Secondly, national statistical agencies have a tendency to be old. Most
surveys are not completed annually, but rather every two, three or more
years. Census data is older still. The analysis process also adds a
delay. The ABS tends to take a year or more to collate and analyze
statistics. For Legal and Accounting Services Australia we have '92-'93
statistics, and the '95-96 statistics are due to be released early Nov
1997. Certain statistics like National Indicators are rapidly produced,
but most are not.
Thirdly, national statistical agency publications are detailed - far
more than most statistical publications. Commercial statistical sources
often neglect supporting information like sample size and demographic
breakdown, but expect these publications to include this and more.
Publications may still require further analysis, and may occasionally
come from inferior sources of information, but they are professionally
delivered.
There are several ways to search each agency:
(1) Each agency has thoughtfully provided their catalogue of
publications online. The links are above.
(2) Each agency collects certain information for analysis. It is
helpful to become familiar with the various surveys and information
sources used by each agency.
Besides the Census, the ABS conducts surveys of weekly household
expenditure, agricultural land-use surveys, R&D surveys, and periodic
surveys of various segments of the economy (like Legal and Accounting
Services, Australia 1992-93). They also collect landing cards (tourism
information), export and import documentation, regional hotel occupancy
rates and more. Each statistical agency is similar.
If the Australian Bureau of Statistics (ABS) has not yet conducted a
survey of hospital occupancy, they will not have this information.
(3) Agencies publish guides to information on a particular topic. They
also publish various newsletters of recent releases and annual
yearbooks too.
National Statistical Agencies are not the only source of statistics,
nor necessarily the best. They are, however, often the best source for
demographic data, widely used by government and frequently re-published
in other government documents. These agencies also provide a range of
sample and national summary data directly from their website. Online
statistics have not yet been organized, so I rather expect browsing the
website for free information will be unwise, unless you are looking for
simple national data.
___________________________________________________
This document continues as Part 3/6
___________________________________________________
Copyright (c) 1998-2001 by David Novak, all rights reserved. This FAQ
may be posted to any USENET newsgroup, on-line service, website, or BBS
as long as it is posted unaltered in its entirety including this
copyright statement. This FAQ may not be included in commercial
collections or compilations without express permission from the author.
Please send permission requests to ***@spireproject.com
Searching Specific Formats.
Section 4
In the second year of his training, Shakh began to piece together the
many rules and guidelines to understanding hieroglyphs. He had thought
the lessons would end once he learned the glyphs but no, there were
long and convoluted rules governing the translation of sounds into
glyphs. Simple rules govern the placement of glyphs on the wall -
certain glyphs lose their meaning when placed apart.
Then, there was the art of writing. The glyphs had to be the right size
and shape. If you were about to finish the line, you could squish
certain glyphs just a little to make room for the next glyph. If you
did not plan well, you would leave the line hanging, a word unfinished,
a sentence incomplete.
Then Shakh started to learn hieratic - shorthand glyphs for less formal
situations.
It was all very complicated and cumbersome. Shakh did not like the
technical nature of writing. So much to learn and still so far from
writing clear, interesting results. His seasons in training went very
slowly. The Nile rose then fell then rose again.
- - - - - - - - - - - - - -
A great deal of dull information must be comprehended, absorbed,
internalized. Nothing spectacular. Nothing of particular interest. Just
a mass of rules and guidelines to help you move within the world of
information.
In the third year of medical school the aspiring doctor begins to
memorize a vast linked-array of drugs, symptoms and afflictions. The
next three years are spent developing this mental array; refining,
building, adding experience, so that one day a doctor may look at a
symptom, think of possible afflictions or drug reactions, then
prescribe drugs or call for further tests. The whole process of
learning this array is intensely dull.
In the first part of this FAQ we explained in detail how an information
search involves first selecting a suitable format (book, webpage, news,
interview ...) then searching a few important tools that help us find
information in that format. The first format we will look at is the
humble book.
Books
Links and forms at http://spireproject.com/books.htm
Shakh arrived in Edfu on a small boat in the company of his father. It
was a short walk from the dock to the Edfu temple complex. A fantastic
sight. A noble sight. The temple included a vast library of books and
manuscripts - a warehouse of knowledge about Egypt.
Not that there were many manuscripts in total. The time and expense it
took to create even a single copy made the library a prohibitive
expense open to only those in certain need. This was not a public
library, but an elitist library, open only to those who could justify
the gifts required to enter. There it was, open before them, long
shelves of scrolls arranged by rough topic. Amazing indeed. Shakh
shivered slightly in the cool air. This would be his life for the next
few years.
- - - - - - - - - - - - - -
Books have such meaning to us as a society. We have a vibrant emotional
connection. Books exude a solid proof of value to a larger community.
They are important resources but the additional awe is amazing to
behold. Try ripping a chapter from a book you own in public. The stares
and discomfort are almost tangible. Some book-lovers get upset about
slight creases in books, treating books as if they were important
museum quality manuscripts - something to hold with awe and treat
gently.
Being a book writer is similarly impressive. It is a mark of an expert.
A knowledgeable expert. A knowledgeable expert we should listen to,
should pay money for the chance to listen to, should pay, listen and
carefully not crease their work.
This attitude is silly.
A book is a package of information, prepared along certain guidelines,
with a purpose. In research we look for books on a topic that may help
us answer a question. These books tend to be large, lengthy, detailed,
verbose, heavy. Books are not good at describing cutting edge
developments. They generally summarize popular consensus. They avoid
criticism. When searching, they can make horrible resources.
Books are also large and physical creations. They must be stored. They
stick around. They have a limited shelf life but libraries are forever
over-stocked with dated publications of limited use and value. They are
also long - troublesome things to read.
Books come in different flavors. There are the books by industry
insiders who tell the truth, rip the facade off a particular
industry. Such books make brilliant resources. There are also books by
journalists, prepared without insider knowledge, more of a novel of a
newsworthy situation. Such books tend to be verbose, circumstantial,
light on facts.
Certain questions simply beg to be answered by reading a book. Such
questions are usually general, introductory, timeless. For such
questions a stack of news articles would lack cohesion. A collection of
articles would be too precise, not give you the larger picture. Such
questions need the 100 pages of description, pictures and the
considered framework that books embody.
Finding a Book
As an information format, there are certain tools and resources you
need to be aware of to effectively search for books. Thankfully, many
of these tools have emerged on the internet. These include:
- A database of the free books on the internet from projects like the
Online Book Initiative and Project Gutenberg. Includes many
copyright-free classics (but not ebooks - a different concept).
- Three government publication databases for the US, UK and Australia.
The US and Australian databases are comprehensive. The UK database is
incomplete. The complete database is commercially available.
- The book databases of large online bookstores are incomplete but
useful as a fast search of current books. Some include background
information. I use Barnes & Noble, Amazon, Borders and the UK Internet
Bookshop (of the WHSmith bookstore chain).
- The largest libraries of the world, like the US Library of Congress
and British Library hold more than 20 million publications stretching
back many years. The online book catalogues are not good for the latest
books, but are brilliant at earlier works.
- Local libraries and state libraries are noteworthy as finding a book
in their database also means you have found access to these books.
- The definitive resource is the collection of national Books-in-Print
databases like [US] Books in Print, Australian Books in Print, French
Books in Print... These databases are commercially available online, as
print directories (yuck) in libraries, and are often publicly available
to search at good bookstores.
Book Databases
Information about new books is organized in a collection of national
"Books in Print" databases. This information is publisher-verified,
includes forthcoming titles, and is naturally updated far faster than
the library and bookstore catalogues.
Books in Print, produced by Bowker, delivers publisher-verified
information on US books. British Books in Print, produced by Whitaker
& Sons, delivers publisher-verified information on UK books. Further
national book indexes include Australian Books in Print (Thorpe),
Canadian Books in Print (University of Toronto Press), Les Livres
Disponibles/French Books in Print (Electre), Italian Books in Print,
German Books in Print and others.
All these directories are available as print directories (not
particularly convenient), as a commercial database (through database
retailers), for subscription (bookstores frequently subscribe) or
through Global Books in Print (which, though not really global, is a
group of book databases).
With regards to the print versions, there may be recent editions in
your state library but don't bother. The directory is not user-friendly
as you must page through each month's subject categories. A more
convenient alternative access point is your favorite large bookstore.
For about Au$4500/year, many bookstores subscribe to Global Books in
Print on CD-ROMs, or a national 'books in print' database. There should
be no cost for searching, but ask for the date and the database name so
you have a clearer idea of what is being searched.
Further Book Resources
Book Reviews are a viable tool in a book search. The tools mentioned
above will give you very little information indeed - mainly title,
author, format and price. You will usually want more than this before
you buy a book.
Book reviews are published in a range of book-related journals and
newspapers. These are compiled into a commercial database of Book
Reviews, like the Book Review Digest by H.W.Wilson or Book Review Index
by Gale Research, or individual book reviews from the likes of the New
York Review of Books (http://www.nybooks.com/nyrev/). A state library
may provide access to the Book Review Digest Database.
Online book reviews are further discussed in Locating Book Reviews
(http://www.lib.monash.edu.au/hss/guides/fsreview.htm) by Monash
University Library.
Barnes & Noble, and to a lesser degree Amazon, have additional
information in their book database. Since it is free, it makes for a
fine immediate alternative to searching book reviews.
Future developments in book-related discussion groups hold out more
promise in harnessing the opinions of a book-reading public. Quality
issues remain (and the anonymous musings listed in Amazon.com and
Barnes & Noble
There are also book finding services with specialty book databases -
like a database of second-hand books. Books on Demand is a directory of
out-of-print books available for reprinting (and includes price and
order information.)
Strategy
Obviously title searches are not effective tools to discover new books.
Not all books on Vincent Van Gogh include Vincent in the title. Subject
searches work well only if you can grasp the indexing.
Apply these effective search techniques:
1) Browse the subject listing and select the subjects which interest
you.
2) Read the subject listings off a book you know interests you - then
search for other books in those subjects.
3) Search for other publications from suggestive authors (especially
when the author is an association).
Library catalogues like LOCIS can illustrate these techniques. Let's
say a title or subject search lands you with one of the books listed in
LOCIS. This catalogue lists the applicable subject titles. Looking at
books placed in the same subject category works well.
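The second technique can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the catalogue records,
titles and subject headings below are invented, and a real catalogue
like LOCIS would be searched through its own interface rather than
through a list held in memory.

```python
from collections import Counter

# Hypothetical catalogue records. The titles and subject headings are
# invented for illustration and are not taken from any real catalogue.
CATALOGUE = [
    {"title": "Sunflowers and Sorrow",
     "subjects": {"Gogh, Vincent van", "Painters -- Netherlands"}},
    {"title": "The Yellow House",
     "subjects": {"Gogh, Vincent van", "Painters -- Netherlands",
                  "Arles (France)"}},
    {"title": "Dutch Masters Abroad",
     "subjects": {"Painters -- Netherlands"}},
    {"title": "A Handbook of Oil Paints",
     "subjects": {"Paint materials"}},
]


def related_by_subject(known_title, catalogue):
    """Rank other titles by how many subject headings they share
    with the book we already know interests us."""
    known = next(r for r in catalogue if r["title"] == known_title)
    scores = Counter()
    for record in catalogue:
        if record["title"] == known_title:
            continue
        shared = known["subjects"] & record["subjects"]
        if shared:
            scores[record["title"]] = len(shared)
    # most_common() lists the titles with the most shared headings first.
    return [title for title, _ in scores.most_common()]


print(related_by_subject("Sunflowers and Sorrow", CATALOGUE))
```

Notice that neither related title mentions Vincent, which is exactly
why a title search would have missed them.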
A word about Book Types. Just as internet information comes in
different qualities and formats, books also come in different styles
and flavours. Books written by industry insiders are characterized by
personal stories and expert wisdom from an author telling all the
secrets. These books are worth looking for, and the short bio may give
a clue. Books written by journalists have a different flavour, slightly
more newsy and less factual than, let's say, government books (far more
factual than most) or frequently updated books (far more current than
most). Try to find the style of book suited to your needs.
Information Theory
The book industry has reached a kind of plateau where fairly definitive
databases exist for listing books. There are databases for government
books, out-of-print books, second-hand books, current books. The
internet has changed some elements of this mix, as business models try
to support moving existing databases to free access, and others use
this change to try to present more definitive databases. Book reviews
have never properly been used by the book industry, so the big change
appears to be a move from book titles (as in most book databases and
library catalogues) to rich information (like Barnes & Noble) which
includes reviews and readers comments.
___________________________________________________
The Article
links and more at http://spireproject.com/article.htm
Articles hold a definitive value, a statement of quality and currency.
Sometimes articles are long, unique and informative works. Sometimes
articles are short, simple, trite; a rehash of common knowledge. There
is a range of ways to access articles - though none are particularly
inexpensive. We also have difficulties paying copyright - so most paid
research assistance is restricted to certain, more expensive tools. In
all, articles are cumbersome and time-consuming to work
with. They can also be brilliantly rewarding.
There are three difficulties with article searches:
1_ Finding the articles which interest us.
2_ Getting our hands on a copy. (Many articles you locate may be
impractical to access in person while electronic access can be
expensive.)
3_ Copyright permission, (which can be potentially simple or
exceedingly expensive).
Of course, the mainstay of article research is photocopying an article
directly from a journal. Find a library nearby which holds the journal
then read or photocopy it then and there. This process can be improved
by using the online library catalogues (to see if they hold the
journal) and by searching a database of library holdings (often
available for free by asking or calling a librarian at your state
library). As you could expect, some commercial businesses will
undertake this work on your behalf, for a fee.
The difficulty with this process, of course, is that it does not help you
discover what articles will interest you - this only works if you have
a useful bibliography to work from.
In recent years, a concerted effort has been made to bring you full
text articles electronically. Commercial databases in general have
moved from being strictly bibliographic to including many full text
articles. A system of full text articles on CD-ROM has a brilliant
future. Up to
500 journals are updated frequently in this inexpensive format. (Most
Research Libraries have this station.)
Some of the commercial full text databases have emerged online too.
Northern Light is one example. Unfortunately, the better quality
articles are not included in these databases. It is not an absolute
rule but to date, many of these commercial databases are filled with
regional business papers, newspapers or similar middle to low quality
publications.
There is another system for accessing articles, which comes to us from
a very long time ago. Inter-library loans are a system worked out
between libraries so articles can be exchanged between libraries.
Naturally you need the assistance of a library - and a great deal of
patience. Such requests can take over a month to arrive.
Lastly, there is always the option of direct purchase of periodicals
from the publisher.
Commercial Services
Carl Uncover service (faxback articles).
CARL (http://www.carl.org), one of the great library groups in North
America, established a service to provide articles by post or fax. Carl
promises to fax articles provided you use their system to check that
one of their many libraries has the required document.
Northern Light - online database of articles
Northern Light (http://www.nlsearch.com) is a search engine of both the
web and their own database of articles available for purchase. The
rates are cheaper than Carl (up to $4.00 per downloaded document) and
the articles are delivered over the internet (not faxed) but the range
is smaller.
Information Theory
Many of the databases will begin to offer their services either as a
pay-per-view, or through reasonable direct subscription methods on the
internet. This has been predicted for years but depends on the
emergence of a fine way to purchase cheap items on the internet:
digital money. No effective digital money has emerged yet, and most
databases will either wait, or try one of the existing incomplete
methods. Essentially, critical mass has not yet arrived, and it now
appears that the true fall in price of information is waiting on an
effective digital money. In preparation, magazines and newspapers are
purchasing all the rights possible - especially the electronic rights.
More appears on this topic later.
___________________________________________________
Webpages
Links and forms at http://spireproject.com/webpage.htm
Webpages are often of unknown age, of only guessed at quality and
potentially the easiest information to retrieve. There are many points
of entry to web resources, but search tools differ. Try to match your
search tool to your question. To start, you will need to learn
something of the different tools - described below - and four basic
search techniques: Boolean, Proximity, Field Searches & Truncation.
Global Search Engines
Altavista (http://altavista.com) includes a very large, fast search
engine. It allows for Basic Boolean (AND +, NOT -, OR |), Proximity
(" " and ~, meaning near: within 10 words of each other), Several
Fields (title:"Spire Project" domain:gov url:edu link:cn.net.au) and
Truncation/Wildcard (*). Of import, capitals matter with Altavista.
All-the-Web (http://www.alltheweb.com) is important because it is large
- really large - with a flexible search facility. It allows Partial
Boolean (+ -), Simple Proximity (" ") and Several Fields: a title field
(normal.title:spire), a url field (url.all:.au), and link text and link
url fields (normal.atext:spire link.all:cn.net.au). All-the-Web is not
case sensitive. The same database supporting All-the-Web supports Lycos.
Inktomi (via http://hotbot.lycos.com) provides its substantial web
directory through other companies, in this case, HotBot. HotBot also
allows searches by region, by date, and more.
Debriefing (http://www.debriefing.com) is our meta-search engine of
choice. Use this to find names & named websites. Accepts Partial
Boolean (+ -) and Simple Proximity (" "). Capitals matter.
Google (http://www.google.com/) is a new style of search engine which
ranks sites with more care and concern. This works well for sites you
know a little about in advance. Unfortunately, it has no useful field
searches. Allows Partial Boolean (+ -) and Simple Proximity (" ").
Unfortunately, no truncation, not even for plurals!
When searching for a topic with precise descriptive terms, use a broad
search engine. Always place the Boolean + symbol before each search
word (like this: +word1 +word2) to insist all words appear in the
results. Quotes keep words together ("word1 word2"). These two simple
steps dramatically improve results. Keep adding words and search limits
until the number of hits is reasonable.
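For those who like to see it spelled out, here is a minimal Python
sketch of these two steps. The function name build_query and its
parameters are my own invention for illustration; the only syntax
assumed is the + - and " " described above, which most (not all)
engines accept.

```python
def build_query(required=(), phrases=(), excluded=()):
    """Assemble a search string using the + - and " " conventions.

    required: words that must all appear (prefixed with +)
    phrases:  groups of words kept together (wrapped in quotes)
    excluded: words that must not appear (prefixed with -)
    """
    parts = [f"+{word}" for word in required]
    parts += [f'"{phrase}"' for phrase in phrases]
    parts += [f"-{word}" for word in excluded]
    return " ".join(parts)


# Start broad, then keep adding words until the hits are reasonable.
print(build_query(["research", "statistics"]))
print(build_query(["research", "statistics"],
                  phrases=["Spire Project"],
                  excluded=["seminar"]))
```

The second call produces +research +statistics "Spire Project"
-seminar, ready to paste into a search box.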
For more global search engines, there are numerous lists to consider
like the W3 Search Engines page at the University of Geneva
(http://cui.unige.ch/meta-index.html#INF) and the Industry Research
Desk (http://www.rbbi.com/links/sengine.htm).
Meta-Search Engines & Google
If you know something of the destination already, like a title or
company name or full name, try using a search tool that excels in
finding named websites. There should be little difficulty in finding
such sites with either Google or a Meta-Search engine, but don't get
excited and use these on other occasions.
Categorized Lists
When searching for information that lends itself to a particular
category or topic, start with resources which group information in
categories. With few exceptions, these resources index websites, not
webpages. Also, keep your search words simple as these are small
databases.
Yahoo (http://yahoo.com) is the largest of this type of directory tree;
the definitive site. Accepts Partial Boolean (+ -), Simple Proximity
(" "), Truncation (*) and Several Fields: t: (for titles), u: (for
urls) and a date field through a form.
The Open Directory Project (http://dmoz.org) is a Netscape effort to,
presumably, mute the strength of Yahoo. It is very good, and very
similar to Yahoo.
Looksmart (http://www.looksmart.com) is another significant directory.
For an alternative, try the World Wide Web Virtual Library: Subject
Catalogue (http://vlib.org/Overview.html), a distributed network of
subject lists, not nearly as dominant as Yahoo, but far more
"scholarly" shall we say. This virtual directory has been around many
years, previously famous from www.w3.org.
Reviewed Sites
When seeking specific fields of study, when topics are clouded with
many similar, low quality sites, start with resources with a greater
degree of personal attention. Peer review and vetting produce resources
with more quality but limited coverage, better suited to this
situation. Also, keep your search words simple.
The Scout Report (http://wwwscout.cs.wisc.edu) is one of the oldest and
most highly regarded e-newsletters introducing new internet resources.
Residing at the University of Wisconsin, the Scout Report describes
research, education & topical sites. The Scout Report Signpost provides
a quick search of previously featured sites.
BUBL (http://www.bubl.ac.uk) is a British site which reviews internet
resources then indexes by Dewey decimal number. I prefer their Dewey
presentation but the collection is not large (though the largest of the
library projects I have seen).
The Argus Clearinghouse (http://www.clearinghouse.net) is a vast
collection of internet guidebooks. We can search the titles &
descriptions, but then click on the highlighted keywords to find
related guides. I suspect Argus is not successfully keeping pace with
internet development.
AlphaSearch (http://www.calvin.edu/library/searreso/internet/as/) is
similar to Argus. This one indexes important nexus sites and should be
browsed.
The Britannica.com (as in Encyclopedia Britannica
http://www.britannica.com) has been remolded as a free guide to books,
periodicals, web and their encyclopedia. This encyclopedia is perhaps
the best.
FAQs can be searched from an FAQ database like the one at
http://www.faqs.org
WebRings list sites by topic. Each webring is maintained by a volunteer
at an uninvolved site using standard software. The primary sites are
currently Webring.com and bomis.com
Specialty Tools
For issues with a particular government, url or language origin,
consider using tools designed with this in mind.
* Altavista can be limited to specific domains (gov edu au) with their
"domain:domainname" field search. "url:url-segment" is also useful.
Read the Altavista Fancy Features for Typical Searches.
* GovBot (http://ciir2.cs.umass.edu/Govbot/) as developed by The Center
for Intelligent Information Retrieval (CIIR) is a search engine which
indexes exclusively a great number of government webpages, a unique
resource.
* Altavista also allows for a field search by language. Searching for a
Japanese site? Consider searching only webpages in Japanese.
* Purely regional search engines may also be the answer. Aussie.com.au,
for example, is a search engine indexing only Australian websites.
There are fine lists of regional search engines and directories like
SearchEngineCollossus, Search Engines WorldWide, SearchEngineWatch and
Yahoo.
* Topic-specific search engines, a new arrival, have a very promising
future. Ideally you will find a search engine like ChemGuide
(http://www.fiz-chemie.de/en/datenbanken/chemguide/) covering over a
million chemistry related pages. Search Engine Guide
(http://searchengineguide.com) and Gary Price's Direct Search
(gwis2.circ.gwu.edu/~gprice/direct.htm) list topical search engines.
* Lastly, there are some commercial databases aimed at the software and
internet industries. Consider OCLC's NetFirst (articles from magazines
describing the internet).
Conclusion
For many of us, searching the web is simply typing words into a search
engine. I hope I have shown there is more to it than this. What may not
be clearly evident from a brief overview of resources is that each
resource has a particular difference, a particular focus, a particular
angle that helps us answer certain questions faster than other tools
and searches.
Yes, if in the simple world of Yahoo and Altavista you pay no attention
to the specific differences between alternatives, you are left with the
worst of these two tools. Your results are general, timeless and
imprecise.
Contrary to myth, global search engines are not the best place to start
most of the time - just some of the time. On other occasions, start
with a directory, a meta-search engine, a guide, an FAQ... We should be
able to identify which tools excel at locating what kinds of webpages.
(There is no simple search of everything.)
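The matching of tool to question can be summarized as a simple lookup.
The Python sketch below is only a rough restatement of the headings in
this section; the question labels are hypothetical names of my own, and
real questions rarely fall so neatly into one category.

```python
# Question types (hypothetical labels) mapped to the kind of tool this
# section suggests starting with. The examples come from the text above.
TOOL_FOR_QUESTION = {
    "precise descriptive terms":
        "global search engine (Altavista, All-the-Web)",
    "known name or title":
        "Google or a meta-search engine (Debriefing)",
    "fits a category":
        "categorized list (Yahoo, Open Directory Project)",
    "clouded by low quality sites":
        "reviewed sites (Scout Report, BUBL)",
    "specific government, region or language":
        "specialty tool (GovBot, a regional search engine)",
}

# Fallback when the web does not look like the right place to start.
FALLBACK = "ask in a discussion group or review printed literature"


def suggest_tool(question_type):
    """Return a starting point for the given kind of question."""
    return TOOL_FOR_QUESTION.get(question_type, FALLBACK)


print(suggest_tool("known name or title"))
print(suggest_tool("something else entirely"))
```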
There are more insights into effective internet research. Information
clumps; Information is not established in isolation but instead
develops in context, is reinforced, and becomes a trend. The publishing
motivation & promotion purpose can help us rapidly judge the content of
a website. The webpage address can tell us a great deal about both the
website structure and the type of publisher.
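As a small illustration of reading the address, the Python sketch below
guesses the type of publisher from the last two labels of the host
name. The table of hints is my own rough assumption based on common
domain conventions, not a definitive rule, and plenty of sites will not
fit it.

```python
from urllib.parse import urlparse

# Rough, assumed hints based on common domain naming conventions.
PUBLISHER_HINTS = {
    "gov": "government",
    "edu": "educational institution",
    "ac": "academic institution",
    "org": "organisation or association",
    "com": "commercial",
    "co": "commercial",
    "net": "network provider",
}


def guess_publisher(url):
    """Guess the publisher type from the last two host name labels."""
    if "://" not in url:
        # urlparse only finds the host when a scheme is present.
        url = "http://" + url
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    # Check the final label first, then the one before it, so that
    # country codes such as uk or au do not hide the ac or com label.
    for label in reversed(labels[-2:]):
        if label in PUBLISHER_HINTS:
            return PUBLISHER_HINTS[label]
    return "unknown"


print(guess_publisher("http://www.bubl.ac.uk"))
print(guess_publisher("http://spireproject.com/faq.htm"))
print(guess_publisher("gwis2.circ.gwu.edu/~gprice/direct.htm"))
```

An "unknown" result is not a failure; it simply means the address
alone does not tell us enough and we must look at the page itself.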
Once skilled, you can segment and search the most promising areas of
the web quickly and efficiently. If you do not quickly find your
answers there may be other, more appropriate resources. Consider asking
for help in an appropriate discussion group, or reviewing printed
literature instead. The Web is only one resource among many.
If your primary interest is Search Engines, consider reading A Higher
Signal - To - Noise Ratio
(http://www.dpi.state.wi.us/dpi/dlcl/lbstat/search1.html) by Bob Bocher
& Kay Ihlenfeldt, Sink or Swim: Internet Search Tools & Techniques
(http://www.lboro.ac.uk/info/training/finding/sink.htm) by Ross Tyner
and The Search is Over
(http://www.zdnet.com/pccomp/features/fea1096/sub2.html) by Adam Page.
For even more, read Searching the Internet
(http://wwwscout.cs.wisc.edu/toolkit/searching/) a publication in the
Scout Toolkit and browse Search Engine Watch.
Strategy
Searching the web is more skill than most of us acknowledge. The web is
a manifestation of the demon professional researchers work with all
the time in the commercial information market. There is constantly the
fear you have missed that single important site with everything.
Consider the researcher's motto:
Someone, somewhere, probably knows the answer.
But how long do we search for gems, and where do we look? To decide, we
must learn about internet structure and organization. Why is
information published on the web? Why is it promoted? Let's review the
reasoning behind effective internet research. There is so much more
than putting words into search engines.
#1 Motivation
We can make some very astute generalizations about a webpage very
quickly if we can judge the reason it was published. Not only is this
an important step in analyzing any information, but this tells us a
great deal about the contents of the webpage.
Yes, merely determining a site belongs to an association actually
specifies the quality, motivation and type of information we will find.
Associations either publish what is termed 'brochureware' (promotional
material), or if well advanced, present research work previously
restricted to the association library: important research studies & the
like. Commercial interests have much more difficulty delivering useful
resources. The importance of projecting a corporate image comes first
(lots of 'brochureware'), and service descriptions come second. On
occasion, commercial interests will support a worthwhile service tied
closely to their own service - thus banks present interest rates -
bookstores present their book database.
The certainty with which we can make these judgments will astound you.
Corporate websites never publish "changes to patent law". They simply
don't have the motivation. Only an individual would publish this, most
likely not on the web but through a mailing list.
Information is not distributed randomly. Consider Format, Preparation,
Motivation and Promotion. Consider this, then Visualize the information
you seek.
#2 Promotion
We can make further snap judgments about web information from the way
you get there. Promotion is very difficult on the web, and it is hard
to find poorly promoted information. The tools you use to reach
information pre-determine the type and quality of information you will
find.
Search engines index webpages indiscriminately. Advertised websites
must have a pay-off. Directories focus on established websites (not
webpages). Link pages also link to established websites but put more
thought into the selection of resources. Both usually focus on general
sites. For specific or current resources, we need to move to mailing
lists or active nexus points.
Yes, when we find a webpage through the Scout Report (a prominent
resource discovery newsletter), we can assume the webpage has a high
quality of information, is reasonably current and has a general appeal
(within the interest of the newsletter readers).
Let's put this in reverse. If we are looking for a recent document by a
prominent library committee, we will not find it through Altavista,
Yahoo, or normal link pages (except accidentally). We may find it
through specialist newsletters, active nexus points, or through mailing
lists.
#3 Visualize
When an artist begins to paint, they visualize the image. They already
have a concept of the finished result. Internet research is no
different. We start by building a vision of the information we seek.
Who would publish it? What is their motivation? Who would promote it?
Where would I find it?
Information Clumps. Information is created, nurtured, develops, gets
transplanted, gets arranged and becomes visible through a process which
brings similar information together. Your understanding of this
process, including motivation and promotion, must guide your search of
the web. Only then will we know where to look, and quickly know if the
answers are on the web.
___________________________________________________
News
links and more at http://spireproject.com/newswire.htm
Shakh was invited to travel with the army on the conquest of Nubia. The
Egyptian army was not in need of further soldiers but there was a need
for a witness. Shakh would write the official chronicles of the army's
exploits. He would be expected to send a simple diary on papyrus back
to the palace and then to compose numerous descriptions for memorial
walls. He may also be consulted for paintings on the pharaoh's tomb. It
was a fine offer, and he relished the prospect of increasing his
value and exposure.
The war was not swift, nor was it entirely one-sided. In the end,
superior numbers had their effect and Nubia was once again reunited with
Greater Egypt. Reporting was initially a challenge, since very little
happened from day to day. Slowly, Shakh got a handle on the process and
focussed on the grandness of the venture. Two years after floating
upstream, Shakh was able to do his finest work, the parade of captured
soldiers past the Pharaoh's representative.
- - - - - - - - - - - - - -
News articles are typically light and biased. Do not believe a news
item is a great critical analysis of current events. Most news is
produced under time restrictions, for prompt consumption. In research,
news often proves particularly useful for locating information about
individuals or businesses. News is also critical in creating a timeline
of events, in recording events of regional/national/international
importance.
News prepared by individual reporters is collected together by large
news organizations, then delivered to other news organizations around
the world. Your local news organization does not have a reporter in
Iran, but rather buys the story off a newswire, then packages it in
your evening news hour or morning newspaper.
You have probably heard of: United Press International (UPI), Reuters
Global News, Agence France Presse, Associated Press and Xinhua Chinese
Newswire. These very large organizations make their information
available to you in a variety of ways. News collects in commercial
databases of past news, some single-source, others large multi-source
databases. Current news is also packaged into large multi-source
systems delivered by email or newsgroups. Many newswires are available
online free of charge.
Free News
Critical to the changes on the internet is the emergence of free access
to text news. Individual newspapers present news free. Newswires
present news free. News sections to larger sites like Yahoo present
news from many sources, free. News-only search engines will help you
find information from a great many sites with news.
The process of finding current news is about as slick as imaginable.
Here are a few players in the market:
* Yahoo News (www.yahoo.com/headlines/) is leading this field with web
delivery of current news from Reuters, Associated Press, and others.
Yahoo also includes a free search for one week's news.
* Voice of America Newswire (VoA and now voanews.com) delivers news in
English & many other languages.
* The Washington Post (www.washingtonpost.com) offers their own current
news for searching, as well as the Associated Press wire, each searched
separately for the past week.
* Fox News (www.foxnews.com) presents current news online (both current
events and sport news). CNN news (www.cnn.com) is another searchable
site. Both repackage some newswires and present them online. C|net News
(www.news.com) does this too.
* Newsbytes (www.newsbytes.com) is a newswire devoted solely to the
computer, telecom and online world. InternetWire and other
specialty newswires also present news from their website.
* United Nations Radio: The World in Review is one of many news shows
with the transcripts online. Unusually, the Vatican's newswire is not
free online.
* Obviously many more exist - and thankfully we don't need to create a
list or manage the sources. The Spire Project has a clickable map of
English language newspapers. There are definitive lists of global
newspapers like Gary Price's
http://gwis2.circ.gwu.edu/~gprice/newscenter.htm#International
http://dailyearth.com and http://ipl.org/reading/news/
Commercial Resources
The commercial segment of the news market is obviously being squeezed
by the copious quantities of free news online. There are, however,
still some viable markets, principally enterprise solutions (companies
are willing to pay for slight improvements), past database access, and
surprisingly the Wall Street Journal (US$49/yr).
Serving these markets we have Clarinet and Newspage. World News
Connection is a US Government service presenting translated news (quite a gem) as a
searchable database. Unusually, prices start at US$25/7days - yes one
price for the news!
Of course, news alerts can be arranged from the commercial news
databases through the database retailers. Newswires like Agence France
Newswire, Canada Newswire, Xinhua News and Associated Press are each
unique databases, and all stretch back many years. Further databases
like Newswire ASAP and what used to be Global Textline are massive
databases of multiple newswires and newspapers. I recall at one stage
Textline had over 4 billion pages.
Conclusion
News articles are typically light and biased. The sheer quantity of
news in the large news databases makes this a useful resource to fall
back on for any tightly focused research topic. I once discovered an
obscure scientist working in a unique field from a small 3 paragraph
article in a local farmer's newspaper in England (Global Textline
Database).
Newswires and News Databases are just two elements of a large industry
which extends to your local newspaper and to further specialty
databases. Most newspapers maintain their own local news database, and
some make this available electronically. A manual clipping service may
also be an option - certain firms manually page through local papers
looking for advertisements or articles.
While on the topic, certain newswires like Business Wire and PR
Newswire essentially distribute certain types of news for money. Yes,
anything in these newswires is there because the company paid for it to
be there - $500 and up most likely. Other newswires earn money in the
reverse process: from the media who read or publish their work.
Associated Press or Reuters are created from news organizations. Others
like Voice of America (VOA) are alternatively funded, but with
reasonable reliability.
There is also a range of focused newswires such as Newsbytes (computer
issues), PR Newswire (product releases), and Middle Eastern newswires.
Further newswires can be found at Yahoo.
Strategy
I can think of four ways to use this information for research:
1) As an alternative to your evening news or morning newspaper. Online
news is available 24 hours a day, in more detail, from respected news
organizations.
2) Search past news to locate information unlikely to emerge in
journals or magazines. News includes a great deal of local detail and
personal information unlikely to be found elsewhere.
3) As a historical record of events, perhaps the basis of a timeline.
4) Current Awareness and Alerts so articles come to you as they are
reported. News stories by email will become a large industry over the
next two years.
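
The third use, a timeline, is easy to try once you have gathered a few
dated articles. What follows is only a minimal sketch in Python: the
clippings, sources and headlines are invented for illustration, and
build_timeline and format_timeline are hypothetical helper names, not
part of any news service.

```python
from datetime import date

# Hypothetical clippings gathered from news searches:
# (publication date, source, headline). All values are invented.
clippings = [
    (date(2001, 3, 14), "Local paper", "Council approves harbour plan"),
    (date(2000, 11, 2), "Newswire", "Harbour plan announced"),
    (date(2001, 7, 30), "Newswire", "Harbour construction begins"),
]

def build_timeline(items):
    """Return the clippings sorted by publication date, oldest first."""
    return sorted(items, key=lambda item: item[0])

def format_timeline(items):
    """Render each clipping as one line of a plain-text timeline."""
    return [
        f"{when.isoformat()}  {headline} ({source})"
        for when, source, headline in build_timeline(items)
    ]

for line in format_timeline(clippings):
    print(line)
```

Keeping the source beside each entry matters: news is light and biased,
so a timeline built from several sources is easier to cross-check.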
Information Theory
Just how inexpensive can news become? US$25 gets you access to past
translated news! VoaNews.com keeps a searchable directory back a month
for free. Many newspapers still have extensive archives of news, though
they hope to one-day charge for them. In a way, no-one is making money
from news. It is only worth the advertising revenue for distracting you
from reading the news - and that is falling too. With the freedom of
moving information through the internet, several free services will
send you email when a news article matches your interests (an Alert).
The future will see much more "compile your own" newspaper - especially
since it could conceivably be compiled at minimal to no expense
depending on the technology (frames anyone?). An intriguing lawsuit
recently stopped TotalNews (a news only search engine) from displaying
news articles in a frame.
If allowed to speculate for a moment, News-for-Pay may also become a
viable business. Perhaps this is just being cynical of journalistic
standards and the accepted standards of promotion. Perhaps it is also
recognition that Business Wire and PR Newswire are just two of several
newswires where you pay to have your news included. Obviously news
today is biased towards advertisers (through advertorials) and
promoters. Perhaps this will become automated some day - like Yahoo's
"we will look at your site right away for $200".
Naturally, the links and many of the forms to news resources discussed
here can be found at http://spireproject.com/newswire.htm and also our
All-in-one page: http://spireproject.com/spir.htm
___________________________________________________
Theses and Dissertations
links and more at http://spireproject.com/discuss.htm
Theses and dissertations are professional papers completed for higher
degrees. That is to say, they are long, dense and often very esoteric
and convoluted. Trouble is, most theses and dissertations have no more
than 12 copies ever - one always to the University Library, one with
the author, but others scatter to the wind.
All University Libraries hold a copy of past theses undertaken at their
university. This gives rise to the unfortunate but necessary pastime of
searching each local university library for relevant theses. The
advantage here is that master's and occasionally honours theses are
indexed. Most often, just undertake a keyword search then add "thes*"
(truncation of theses or thesis).
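
To see what the truncation does, here is a rough sketch in Python. It is
not how any particular library catalogue works. It assumes a trailing
"*" means "any word beginning with this stem", and the catalogue records
are invented for illustration.

```python
import re

def truncation_to_regex(term):
    """Turn a catalogue-style term such as 'thes*' into a regex.

    Assumption: a trailing '*' matches any word starting with the stem.
    """
    stem = re.escape(term.rstrip("*"))
    suffix = r"\w*" if term.endswith("*") else ""
    return re.compile(r"\b" + stem + suffix + r"\b", re.IGNORECASE)

def matches(term, text):
    """True when the term (truncated or not) occurs as a word in text."""
    return truncation_to_regex(term).search(text) is not None

# Hypothetical catalogue records.
records = [
    "Soil salinity in the wheatbelt: a thesis",
    "Masters theses in agriculture, 1990-1995",
    "Soil salinity handbook",
]

# A keyword search combined with the truncated term "thes*".
hits = [r for r in records if matches("salinity", r) and matches("thes*", r)]
print(hits)
```

Note that truncation is blunt: under this assumption "thes*" also picks
up words like "thesaurus" and "these", so expect a few false hits.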
Electronic Theses Databases:
Dissertation Abstracts Online, produced by UMI, delivers abstracts to
most every doctoral dissertation/thesis in North America, some master's
theses and some international theses. This is the definitive site to
search, though you will need the help of your library to see more than
the abstract. Some libraries will have subscribed to Dissertation
Abstracts OnDisc - the CD-version of this database.
The [British] Index to Theses with Abstracts is a print directory by
ASLIB. This publication is also available as a database, available for
site licenses through Theses.com (www.theses.com). This source is quite
comprehensive as can be seen with the University List.
Several other national databases do exist. Here in Australia, a list of
theses was maintained from 1966 to 1991. The Gale Directory of
Databases also lists THESA, a database of French theses, and
Dissertations and Theses of the ROC (Taiwan).
The Australian Education Index (1978+), produced by ACER (Australian
Council for Educational Research), is a directory listing citations and
some abstracts to Australian work in education. Also available as a
commercial database, AEI is bundled into Austrom, a common collection
of Australian databases.
Digital Archives of Theses
In theory, some theses should be available on the internet,
particularly theses lodged electronically. There is a push for
universities to accept electronic thesis submission, and to build
digital archives of theses. The embryonic National Digital Library of
Theses and Dissertations (NDLTD - www.theses.org) is just one such
project. There is a distributed and sequential keyword search of
participating universities, though it is not particularly functional. In
theory, this is an incremental improvement to searching library
catalogues.
Conclusion
Getting a thesis can be very difficult. You will need the help of a
document delivery service through a library, and many theses will not be
available to borrow. You can also buy theses. Read Obtaining Copies of
Dissertations (http://www.library.yale.edu/ref/err/disscops.htm) by
Yale University Library for more. For an alternative look at theses,
consider Locating Theses
(http://www.lib.monash.edu.au/hss/guides/fstheses.htm) by the Monash
University Library.
A note on developments in this field: some Theses abstracts are
emerging online already. Projects like the LA Theses Database
(Landscape Architecture Theses Archive) have much promise but poor
coverage. Full text theses presentation also has promise, with the US
Department of Education funding a National Digital Library of Theses
and Dissertations and Virginia Tech starting to request electronic
submission of all theses.
UMI (the producers of Dissertation Abstracts Online) has backed this
move with a direct delivery service of electronic theses to US
libraries for $26, but only theses held in their digital archives are
available. Eventually, large digital Theses archives will be the norm,
but until then, very little will happen in this field.
A thesis is a tightly constrained information package, produced in the
university environment with limited appeal. For economic reasons, we
should not be surprised theses databases are incomplete. The emergence
of theses archives sounds interesting - a good use of the internet -
but does not represent a financial opportunity that could be explored
without government assistance. Consequently, this small area of the
information sphere is government grant-driven.
___________________________________________________
Patents
links and more at http://spireproject.com/discuss.htm
A patent discloses certain facts about a commercially important
invention in exchange for certain rights to exploit the invention. This
is a little simplistic, but explains why patents are factual, unique
from other research resources, and a little vague in certain specifics.
If you have never seen a patent before, see a sample US patent, an
Australian patent, and this brief description
(http://www.ipaustralia.gov.au/patents/P_home.htm).
There are three primary resources involved in patent research. Firstly,
we have the free internet resources. Secondly, we have the national
patent agency resources. Thirdly, we have the commercial patent
databases.
Free Patent Databases
The time of free patent databases has surely come, and while many
countries are only slowly moving in this direction, the movement is
inevitable.
* The US Patent and Trademark Office (USPTO) provides a US Patent
Bibliographic database at patents.uspto.gov with full use of fields,
date and abstract text searching. Choose between their Boolean search,
advanced (field) search or by US patent number. They also maintain a
fulltext [US] Aids Patent Database and other resources.
* IBM's Patent Server is a public service providing a different
patent database of US Patent abstracts. The IBM service is similar to
but distinct from the USPTO service - certainly not less powerful.
* The Canadian Intellectual Property Office (CIPO) maintains the
Canadian Patent Fulltext Database from '89. This database is on par
with the US Patent Database, with perhaps even better searching
technology.
* The Japanese Patent Office (www.jpo-miti.go.jp) has a searchable
database of Japanese patent abstracts, including patent number, title,
inventor, company, and abstract of the patent.
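
The Boolean and field searches mentioned above are simple ideas. The
sketch below, in Python, shows the logic on three invented records. It
is not the USPTO's (or anyone's) search engine or query syntax. The
record numbers, field names and the boolean_search helper are all
hypothetical.

```python
import re

# Invented patent records; numbers and field names are illustrative only.
patents = [
    {"number": "EX-1", "title": "Solar water heater", "inventor": "Smith",
     "abstract": "A solar collector heats water in a storage tank."},
    {"number": "EX-2", "title": "Solar powered pump", "inventor": "Jones",
     "abstract": "A pump driven by solar cells moves water for irrigation."},
    {"number": "EX-3", "title": "Wind driven pump", "inventor": "Smith",
     "abstract": "A windmill drives a pump to lift water."},
]

def words(text):
    """Split text into a set of lowercase words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def boolean_search(records, field, all_of=(), any_of=(), none_of=()):
    """Search one field: every all_of term (AND), at least one any_of
    term (OR) when any are given, and no none_of term (NOT)."""
    results = []
    for record in records:
        present = words(record[field])
        if not all(term.lower() in present for term in all_of):
            continue
        if any_of and not any(term.lower() in present for term in any_of):
            continue
        if any(term.lower() in present for term in none_of):
            continue
        results.append(record["number"])
    return results

# solar AND water, searched in the abstract field
print(boolean_search(patents, "abstract", all_of=("solar", "water")))
# pump NOT solar, searched in the abstract field
print(boolean_search(patents, "abstract", all_of=("pump",), none_of=("solar",)))
# a field search on the inventor name
print(boolean_search(patents, "inventor", all_of=("Smith",)))
```

Restricting a term to one field (inventor, title, abstract) is what
makes a field search sharper than a plain keyword search.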
Patent Authority Services
Patent libraries are an important and cost-effective patent resource.
* IP Australia (www.ipaustralia.gov.au) (formerly the Australian
Industrial Property Organisation (AIPO)) has a patent library in each
Australian state capital. Each library provides free access to the APAS
database (Australian Patent Abstract Search) and includes a complete
microfiche copy of all Australian patents and the Australian Official
Journal of Patents, Trademarks & Designs (the official Australian
patent gazette).
Most offices also hold US Patents on microfiche! Staff will help you
use the APAS database, arranged for free text searching by
International Patent Classification. A particularly useful service by
IP Australia is the delivery of copies of many foreign patents for
AU$15. You will need the patent number, country and title for this.
* The US Patent and Trademark Office (USPTO) has the Patent and
Trademark Depository Library Program (PTDLs) placing the CASSIS
database (the USPTO patent abstract database on CD-ROM) and US patents
in libraries around the US.
The US patent libraries also hold the Official Gazette of the U.S.
Patent and Trademark Office, the official US patent gazette.
Importantly, the gazette is fully online and searchable from 1995.
* The [UK] Patent Office (www.patent.gov.uk) provides for the Patents
Information Network (PIN) which hosts patent information in the UK. The
British Library is just one listed source of UK patents (further
information online) and delivers some patent services.
* The Canadian Intellectual Property Office (CIPO) (cipo.gc.ca)
produces the Canadian Patent Index (CPI). They also publish The Patent
Office Record, Canada's official patent gazette.
* There are many more national & international patent organizations
like Institut National de la Propriete Industrielle [France], World
Intellectual Property Organization (WIPO) and European Patent Office.
Thankfully there are fine lists of patent libraries and patent
websites.
Commercial Patent Services
One of the most invaluable resources in serious patent research is
access to several of the very large commercial patent databases.
* Lexis-Nexis (www.lexis-nexis.com) retails several patent databases.
Thanks to Patscan (University of British Columbia), we also have a guide to
searching patents on Lexis-Nexis.
* The Dialog Corporation (www.dialog.com) retails a collection of
patent databases including: Derwent World Patents Index, Inpadoc,
Claims/U.S. Patents and European Patents FullText.
* CASSIS is the USPTO database. For a little more information on this,
consider the Patent Guide to Using CASSIS, at the University of
Michigan.
* Derwent Scientific and Patent Information (www.derwent.co.uk) is a
prominent publisher of Patent and scientific information including
commercial databases.
* Questel-Orbit (www.questel.orbit.com) also retails patent databases.
* CAS/STN (www.cas.org) retails a collection of patent databases
including Chemical Patents Plus for U.S. Chemical patents.
In addition to the database retailers and producers, there is a lively
industry of patent services.
* The Patent Libraries will assist you with some services. IP
Australia, for example, will retrieve most full patents from other
countries for AU$15.
Conclusion
Until recently, the legal profession has had a complete monopoly on
patent work. As you can see, this need no longer be the case. Casual
researchers will find the free patent databases easy to use, and more
experienced researchers should not be dissuaded from searching the
commercial databases or patent libraries themselves. The very large
commercial databases, like Inpadoc, are particularly easy to use.
Of course, there are occasions when patent searches are critical, and
experts should be sought. Certainly legal assistance is required if you
are preparing to lodge your own patent, but patent data as a source of
information is another matter.
As an industry, patent research is still deeply entrenched in the
high-price commercial database and database-centered services. I am
mildly surprised the emergence of free databases like the USPTO's
patent database has not led to a fall in the costs of the high-end
databases (which remain some of the most expensive databases
publicly accessible). It appears this industry, as indeed several
others, has no intent to drop the price of retail database access to a
more supportable level. I can only presume this rests on economic
grounds. Patent information purchases are price insensitive.
___________________________________________________
Statistics
links and more at http://spireproject.com/stats.htm
Statistics allow us to lie with confidence. Dense and factual,
carefully interpreted statistics are also far more reliable than
personal experience. The expense of collecting meaningful statistics
limits the types of organizations involved in this work. This limit
also gives us a very elegant way to divide this field.
#1 National Statistical Agencies,
#2 Government Agency Statistics,
#3 Commercial Statistics,
#4 Association Statistics.
Statistical Directories
Statistical Abstracts (statistical bibliographies and statistical
directories) describe sources of statistics.
Instat publishes "International Statistics Sources: subject guide to
Sources of International Comparative Statistics" but I found this less
than brilliant. A better source is Statistical Sources (by Gale
Research), a basic and very large statistical abstracts directory.
On the internet, US government statistics are well recorded in
Statistical Abstract of the United States 1999
(http://www.census.gov/stat_abstract) a 1000+ page document made
available online in pdf format by the US Census Bureau.
Statistical Venues
Many statistics appear regularly in journals, annual reports and
newspapers. Specialty libraries, particularly specialty librarians, may
be aware of additional statistics.
If an expert goes through the effort to collect statistics, you are far
more likely to locate them by undertaking an article search, (looking
particularly for journal articles) and a book search. In both cases,
limit your search to only the last couple of years or you will locate
very old, dated statistics. A particularly sophisticated approach could
be to ask BusLib-l (Business Librarians' Electronic Discussion List)
since this is a mailing list of librarians. Use this resource
sparingly, and only after having exhausted other avenues.
National Statistical Agencies
Most every country in the world has a single government agency
dedicated to collecting, collating and publishing national statistics.
Statistics Canada, Australian Bureau of Statistics, The US Census
Bureau, The (UK) Office for National Statistics; we have a fine page on
national statistical agencies (http://spireproject.com/bureau.htm).
These organizations manage the census, watch the movement of money and
goods in and out of the country, and undertake a wide range of other
surveys. Finding these statistics is relatively straightforward, with
several directories on the internet.
Government Agency Statistics
Most government agencies collect reams of data on the industries they
monitor. Sometimes these statistics are published, sometimes you have
to ask for them, only rarely are they considered private or
unavailable.
Here in Western Australia, the government departments for Tourism,
Labour, Small Business and Big Business all publish top-rate statistics
free to interested parties. Our Dept of Tourism keeps a directory of
future tourism related projects.
When government statistics are bound and published, try the government
book databases. Remember MOCAT, AGIP and part of UKOP are free online.
Again, some US government statistics are well recorded in Statistical
Abstract of the United States 1999 by the US Census Bureau, online in
pdf format.
Association Statistics
Valuable statistics only come from motivated sources, and associations
are certainly motivated. Start with a list of likely associations, then
call up and either explain your needs or ask for their price list for
publications and statistics. For AU$25, the Australian Booksellers
Association publishes a brilliant analysis of the book industry.
Association statistics are financially informative, as the intended
audience is association members.
Commercial Statistics
Statistics created for sale are frequent in the financial sector but
exist in a number of further situations. Banks use more professionally
prepared market reports such as reports by the Australian economic
consultancy firm Syntec Economic Services, Guide to Growth, which
examines Australian industries financially with forecasts. IBIS
(www.ibis.com), another economic consultancy, also publishes to this
market.
Professionally prepared market reports are also emerging, with the full
text immediately available from the commercial information market. Each
database retailer has several such databases, but often these databases
are focused globally or on a different country. Sheila Webber
(http://www.dis.strath.ac.uk/people/sheila) has a very good list of
firms which market research reports.
Conclusion
Central to the Internet Revolution is the liberation of just this kind
of information. Increasingly, we will see the publishing of such
documents on the internet, but for the few statistics currently online,
there is no effective search. You can only browse government websites.
Away from the internet, you must either contact the agencies directly
(in the hope they do collect statistics), look at the statistical
directories or seek agency statistics in other documents: books,
pamphlets, newsletters.
Once you have proceeded this far, it is wise to stop looking for
statistics, and begin again at sophisticated commentary - which is
likely to include supporting statistics or references to statistics
anyway. Seek expert guidance from others who would know of hard-to-find
statistics.
One approach to finding statistics is to reverse the process. Who would
prepare the statistic? Statistics are created in a logical manner, in a
very expected manner. Tourism statistics? - most likely undertaken by
either the government tourism authority, a tourism association or the
national statistical agency. There are few others who could even
consider preparing tourism statistics. If you can think through the
preparation process, you can usually identify who would have created
the statistic. (Internet statistics are the exception - too many
organizations are creating statistics of worth.)
Let's move on to specific fields of statistics.
National Statistical Bureau
The Spire Project has a fine html article on the National Statistical
Agencies (http://spireproject.com/bureau.htm). Australia
(www.abs.gov.au), United Kingdom (www.ons.gov.uk), Canada
(www.statcan.ca) and United States (www.census.gov) all have national
statistical agencies. Each organization collects and publishes
statistics on many facets of their respective countries. This article
should simplify your work in searching, selecting and appraising these
sources.
Each statistical agency organizes their statistics in a distinct way.
The Australian Bureau of Statistics (ABS) has an annual Catalogue of
Publications but also a search function, specialized statistical
category guides and several periodicals on new resources. The UK Office
for National Statistics (ONS) has a statistical overview, product
catalog and a search. The US Census Bureau has a collection of very
large publication catalogues, directories and periodicals. Statistics
Canada has several searches, publications and a catalogue.
The two further elements to the statistical agencies are the
statistical libraries and the unreported commercial statistics. The ABS
has a dedicated statistical library within each Australian state, and
collections of ABS documents within most public and school libraries.
While the ABS documents within libraries are limited, the ABS libraries
are very detailed with most every publication they create available for
review. This is standard throughout the world.
While publications are sold by each statistical agency, and the
publication catalogues are available online, each agency has data they
sell in other formats. CD-ROMs of popular geographical and statistical
distribution have become very popular, as have small area population
statistics. Some of these services are packaged and sold for specific
purposes, like 4-site by the ABS used in describing business locations.
Even further, statistics can be generated specific to your needs. This
might include ABS import and export statistics for specific
commodities, or specific results from any of their surveys.
Lastly, Usinfostore.com presents a collection of economic indicators as
time-series data. The statistics originate from several government
agencies, and the site is best considered a value-added service: an
intriguing, beneficial trend?
National Statistical Agencies are certainly not the only source of
statistics. They are, however, some of the easiest to access. These
agencies also have several traits that distinguish them from other
information sources.
Firstly, these agencies are legally required to disguise their
statistics to protect the identity of specific businesses and
individuals (with the exception of the Business Register). If there are
only one or two timber exporters in Western Australia, the ABS will not
give you timber exports from Western Australia. Specifics are found in
directories like Kompass, commercial databases, or insider information
(experts and articles by experts).
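
The effect of this rule can be sketched in a few lines of Python. This
is a simplified illustration, not any agency's actual disclosure rule.
The threshold of three contributors and the export figures are
assumptions made up for the example.

```python
# Assumption: a figure is withheld when fewer than three businesses
# contribute to it. Real agency rules are more involved than this.
MIN_CONTRIBUTORS = 3

# Hypothetical timber export cells: state -> (number of exporters, value).
timber_exports = {
    "Western Australia": (2, 41.0),
    "Tasmania": (9, 310.5),
    "Victoria": (5, 122.0),
}

def publishable(cells, min_contributors=MIN_CONTRIBUTORS):
    """Return each cell's value, or None where too few businesses
    contribute for the figure to be released."""
    released = {}
    for region, (contributors, value) in cells.items():
        released[region] = value if contributors >= min_contributors else None
    return released

for region, value in publishable(timber_exports).items():
    print(region, "not available" if value is None else value)
```

A blank in a published table therefore does not mean the figure was
never collected. It often means too few businesses stand behind it.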
Secondly, national statistical agencies have a tendency to be old. Most
surveys are not completed annually, but rather every two, three or more
years. Census data is older still. The analysis process also adds a
delay. The ABS tends to take a year or more to collate and analyze
statistics. For Legal and Accounting Services Australia we have '92-'93
statistics, and the '95-96 statistics are due to be released early Nov
1997. Certain statistics like National Indicators are rapidly produced,
but most are not.
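
The delay can be put in numbers. A minimal sketch in Python, under two
assumptions of mine: that '95-96 refers to a financial year ending 30
June 1996, and that 1 November 1997 is a fair stand-in for "early Nov
1997".

```python
from datetime import date

def months_between(start, end):
    """Whole calendar months from start to end (days are ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)

# Assumption: the '95-96 reference period ends 30 June 1996.
reference_period_end = date(1996, 6, 30)
# Assumption: "early Nov 1997" taken as 1 November 1997.
release_date = date(1997, 11, 1)

lag = months_between(reference_period_end, release_date)
print(f"Release lag: {lag} months")
```

On these assumptions the figures are about 17 months old on the day
they appear, and older still by the time the next survey is released.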
Thirdly, national statistical agency publications are detailed - far
more than most statistical publications. Commercial statistical sources
often neglect supporting information like sample size and demographic
breakdown, but expect these publications to include this and more.
Publications may still require further analysis, and may occasionally
come from inferior sources of information, but they are professionally
delivered.
There are several ways to search each agency:
(1) Each agency has thoughtfully provided their catalogue of
publications online. The links are above.
(2) Each agency collects certain information for analysis. It is
helpful to become familiar with the various surveys and information
sources used by each agency.
Besides the Census, the ABS conducts surveys of weekly household
expenditure, agricultural land-use surveys, R&D surveys, and periodic
surveys of various segments of the economy (like Legal and Accounting
Services, Australia 1992-93). They also collect landing cards (tourism
information), export and import documentation, regional hotel occupancy
rates and more. Each statistical agency is similar.
If the Australian Bureau of Statistics (ABS) has not yet conducted a
survey of hospital occupancy, they will not have this information.
(3) Agencies publish guides to information on a particular topic. They
also publish various newsletters of recent releases and annual
yearbooks too.
National Statistical Agencies are not the only source of statistics,
nor particularly the best. They are, however, often the best source for
demographic data, widely used by government and frequently re-published
in other government documents. These agencies also provide a range of
sample and national summary data directly from their website. Online
statistics have not yet been organized, so I rather expect browsing the
website for free information will be unwise, unless you are looking for
simple national data.
___________________________________________________
This document continues as Part 3/6
___________________________________________________
Copyright (c) 1998-2001 by David Novak, all rights reserved. This FAQ
may be posted to any USENET newsgroup, on-line service, website, or BBS
as long as it is posted unaltered in its entirety including this
copyright statement. This FAQ may not be included in commercial
collections or compilations without express permission from the author.
Please send permission requests to ***@spireproject.com