Notice: This file is archived for historical purposes only and is not being updated. Please see the index for updates.
- Top 5 Keyword: Top of the Web entry for keyword searching tools, from December Communications, Inc. (http://www.december.com/web/top/keyword.html)
- Web Spiders: World Wide Web Robots, Wanderers, and Spiders, by Martijn Koster (http://info.webcrawler.com/mak/projects/robots/robots.html)
- BotSpot: The Spot for all Bots on the Net, including 13 searchable bot classification databases, FAQs, articles, conferences, new bots, Add a Bot, CommerceBots, NewsBots, SearchBots, and more, plus a Bot of the Week selected by Team BotSpot (http://www.botspot.com)
----- A selected list of Web spiders:
- WebCrawler: builds indexes of the full contents of documents, as well as their URLs and titles (http://webcrawler.com)
- Excite: keyword or concept searching of Web pages or Usenet (http://www.excite.com)
- Harvest: tools to gather, extract, organize, search, cache, and replicate information available over the Net (http://harvest.cs.colorado.edu)
- InfoSeek: index search of the Internet (http://www.infoseek.com)
- JumpStation: indexes the titles and headers of documents on the Web, by Jonathon Fletcher (http://www.stir.ac.uk/jsbin/js/)
- Lycos: uses information metrics to record the 100 most important words in a document, along with the first 20 lines, so that users can often determine the value of a WWW document without retrieving it (http://www.lycos.com)
- MOMspider: a spider that you can install on your system (Unix/Perl) (http://www.ics.uci.edu/WebSoft/MOMspider/)
- NIKOS: allows a topic-oriented search of a spider database (http://www.rns.com/cgi-bin/nikos)
- RBSE URL: a database of URL references, with full WAIS indexing of the contents of the documents, by David Eichmann (http://rbse.jsc.nasa.gov/eichmann/urlsearch.html)
- SavvySearch: Parallel Internet Query Engine (http://www.cs.colostate.edu/~dreiling/smartform.html)
- SG-Scout: a robot for finding Web servers (http://www-swiss.ai.mit.edu/~ptbb/SG-Scout.html)
- Wandex: index from the World Wide Web Wanderer, by Matthew Gray (http://www.mit.edu:8001/cgi/wandex/index)
- WebAnts: a project for cooperating, distributed Web spiders (http://thule.mt.cs.cmu.edu:8001/webants/)
- WebWorm: gathers information about titles and URLs from Web servers, by Oliver McBryan (http://www.cs.colorado.edu/home/mcbryan/WWWW.html)
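The Lycos entry above describes building a compact abstract of a document, its most important words plus its opening lines, so that a user can judge the page without retrieving it. A minimal sketch of that idea follows; Lycos's actual weighting scheme is not public in this detail, so plain word frequency stands in for its "information metrics", and the function name and parameters are illustrative, not Lycos's:

```python
from collections import Counter
import re

def abstract(text, top_n=5, head_lines=3):
    """Build a Lycos-style abstract of a document: the top_n most frequent
    words plus the first head_lines lines.  (A sketch only; frequency is a
    stand-in for Lycos's undisclosed importance metric.)"""
    words = re.findall(r"[a-z]+", text.lower())
    top = [w for w, _ in Counter(words).most_common(top_n)]
    head = text.splitlines()[:head_lines]
    return {"top_words": top, "head": head}
```

The returned dictionary is what a search service might store instead of the whole page, trading completeness for a much smaller index.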
Definition: Spiders are a class of software programs that traverse network hosts gathering information from and about resources.
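The traversal in the definition above can be sketched in a few lines. This is a toy, not any of the spiders listed: the PAGES mapping below is a hypothetical in-memory stand-in for the network, so the example stays self-contained instead of fetching real URLs:

```python
from html.parser import HTMLParser

# Hypothetical in-memory "Web": URL -> HTML body (illustration only).
PAGES = {
    "http://example.org/":  '<a href="http://example.org/a">A</a>',
    "http://example.org/a": '<a href="http://example.org/">home</a>'
                            '<a href="http://example.org/b">B</a>',
    "http://example.org/b": "no links here",
}

class LinkParser(HTMLParser):
    """Collect the href of every <a> tag encountered."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

def crawl(start):
    """Breadth-first traversal: visit each reachable page once and
    record its outgoing links, the core loop of any Web spider."""
    seen, queue, index = set(), [start], {}
    while queue:
        url = queue.pop(0)
        if url in seen or url not in PAGES:
            continue
        seen.add(url)
        parser = LinkParser()
        parser.feed(PAGES[url])
        index[url] = parser.links   # what was found on this page
        queue.extend(parser.links)  # follow each discovered link
    return index
```

A real spider would replace the PAGES lookup with an HTTP fetch and add politeness rules (delays, robots.txt) of the kind discussed in Martijn Koster's pages listed above.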
----- Lists, information and collections