List of Web Crawlers

Web crawlers are systems that fetch web pages automatically for various reasons. For each crawler below, the user-agent string it leaves in server log files is listed first, followed by a short description.

Search Engine Crawlers

Search engine crawlers crawl pages for possible inclusion in a search engine's index.

Baiduspider+(+http://www.baidu.com/search/spider.htm)
Crawler for Baidu

Mozilla/5.0 (compatible; Ask Jeeves/Teoma; +http://about.ask.com/en/docs/about/webmasters.shtml)
Crawler for Ask.com

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Crawler for Google

Mozilla/5.0 (compatible; Yahoo! Slurp China; http://misc.yahoo.com.cn/help.html)
Crawler for Yahoo China

Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
Crawler for Yahoo

Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)
Crawler for Cuil

Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1) VoilaBot BETA 1.2 (support.voilabot@orange-ftgroup.com)
Crawler for Voila

msnbot/1.1 (+http://search.msn.com/msnbot.htm)
Crawler for MSN/Live Search

Image Search Engine Crawlers

Image search engine crawlers collect images for use in image search engines.

Googlebot-Image/1.0
Crawler for Google Image Search

Mozilla/5.0 (Yahoo-MMCrawler/4.0; mailto:vertical-crawl-support@yahoo-inc.com)
Crawler for Yahoo Image Search

msnbot-media/1.0 (+http://search.msn.com/msnbot.htm)
msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)

Crawler for MSN/Live Search Image Search

psbot/0.1 (+http://www.picsearch.com/bot.html)
Crawler for Picsearch

Brand Monitoring Crawlers

Brand monitoring crawlers collect information on what people are saying about brands on the Internet.

R6_CommentReader(www.radian6.com/crawler)
R6_FeedFetcher(www.radian6.com/crawler)

Crawler for Radian6

http://www.relevantnoise.com; info@relevantnoise.com
Crawler for RelevantNoise

TechrigyBot - www.techrigy.com support@techrigy.com
Crawler for Techrigy

Other Crawlers

Morfeus Fucking Scanner
Scans websites for vulnerabilities.

libwww-perl/5.65
Perl software library for downloading pages; it can be used for a variety of purposes.
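
As a rough illustration of how the strings listed above might be used when reading log files, the Python sketch below classifies a user-agent string into one of this page's categories by substring matching. The function name `classify_user_agent` and the `CRAWLER_TOKENS` table are hypothetical names chosen for this example; the tokens themselves are taken from the strings on this page.

```python
# Substring tokens taken from the user-agent strings listed on this page.
# The table and function names are illustrative assumptions.
# Order matters: more specific tokens (e.g. "Googlebot-Image") are checked
# before the more general tokens they contain (e.g. "Googlebot").
CRAWLER_TOKENS = [
    ("Googlebot-Image", "image search"),
    ("Yahoo-MMCrawler", "image search"),
    ("msnbot-media", "image search"),
    ("psbot", "image search"),
    ("Baiduspider", "search"),
    ("Ask Jeeves/Teoma", "search"),
    ("Googlebot", "search"),
    ("Yahoo! Slurp", "search"),
    ("Twiceler", "search"),
    ("VoilaBot", "search"),
    ("msnbot", "search"),
    ("R6_CommentReader", "brand monitoring"),
    ("R6_FeedFetcher", "brand monitoring"),
    ("relevantnoise.com", "brand monitoring"),
    ("TechrigyBot", "brand monitoring"),
    ("Morfeus", "other"),
    ("libwww-perl", "other"),
]


def classify_user_agent(user_agent):
    """Return the category of the first matching token, or None if no token matches."""
    lowered = user_agent.lower()
    for token, category in CRAWLER_TOKENS:
        if token.lower() in lowered:
            return category
    return None


if __name__ == "__main__":
    print(classify_user_agent("Googlebot-Image/1.0"))
    print(classify_user_agent("msnbot/1.1 (+http://search.msn.com/msnbot.htm)"))
```

Because any client can send any user-agent string, a match here only shows what a request claims to be, not what it actually is.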