Link to the mod (https://custom.simplemachines.org/index.php?mod=1157)
MORE SPIDERS
By Karl Benson (https://custom.simplemachines.org/index.php?action=profile;u=63186) | Link to Customization (https://custom.simplemachines.org/mods/index.php?mod=1157) | Support Topic (https://www.simplemachines.org/community/index.php?topic=233636)
Introduction
Adds 88 more spiders/crawlers/bots to your Spiders section in SMF
SMF Support
Version | Supported
2.1.x | Yes
2.0.x | Yes
1.x.x | No
Features
- 83 spiders belonging to search engines, validators, checkers, crawlers, bots, software, etc.
- Including: Facebook, Ask, Baidu, GigaBot, Google-AdSense, Google-Adwords, Google-SA, Google-Image, Bing, InternetArchive, Alexa, Omgili, Speedy Spider, Yahoo, Yahoo JP, DeadLinksChecker, W3C Validator, W3C CSSValidator, W3C FeedValidator, W3C LinkChecker, W3C mobileOK, W3C P3PValidator, Bloglines, Feedburner, SnapBot, Picsearch, Websnapr, AllTheWeb, Altavista, Asterias, 192bot, AbachoBot, Abdcatos, Acoon, Accoona, BecomeBot, BlogRefsBot, Daumoa, DuckDuckBot, Exabot, Furl, FyperSpider, Geona, GirafaBot, GoSeeBot, Ichiro, LapozzBot, Looksmart, Lycos, Majestic12, MLBot, MSRBOT, MSR-ISRCCrawler, Naver, Naver, NoxTrumBot, OmniExplorer, OnetSzukaj, ScrubTheWeb, SearchSight, Seeqpod, Shablast, SitiDiBot, Slider, Sogou, Sosospider, StackRambler, SurveyBot, Touche, Walhello, WebAlta, Wisponbot, YacyBot, YodaoBot, Charlotte, DiscoBot, EnaBot, Gaisbot, Kalooga, ScoutJet, TinEye, Twiceler, GSiteCrawler, HTTrack, Wget
(Remember SMF detects most Google/Yahoo/MSN bots by default)
Spider List
It appears that most sites offering spider/bot lists have tonnes of inactive ones. I'm putting together my own list by detecting them in the wild on my own sites, plus any that get reported to me (after I've checked them out). So if there are any ACTIVE ones I'm missing, please let me know in the support topic.
Installation
Previous versions of the mod do NOT need to be uninstalled. This mod only adds a row for each spider in the database.
- No theme edits required.
- No language strings to translate or editing.
- It adds database rows only.
- It will ignore adding ones which already exist.
Install and you're done.
Note: Click uninstall to remove the mod from your mod list. This won't remove the spiders from your database, though; you'll need to remove each one from your SMF Search Engines/Spider section.
Manual Installation
For manual installation, just upload AddMoreSpiders.php to your SMF directory and run it in your browser (then delete the file).
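The "adds rows only, skips ones that already exist" behaviour described above can be pictured with a small standalone sketch. This is not the code in AddMoreSpiders.php; it works on plain arrays rather than the SMF database, and the function name `add_missing_spiders()` and the idea of judging duplicates by user agent string are assumptions made for illustration.

```php
<?php
// Illustration only: the real mod works on the SMF database, not on arrays.
// $existing and $incoming are hypothetical lists of [user agent, name] pairs.

function add_missing_spiders(array $existing, array $incoming): array
{
    // Index the user agents that are already present.
    $seen = [];
    foreach ($existing as [$agent, $name]) {
        $seen[strtolower($agent)] = true;
    }

    // Append only the entries whose user agent is not there yet.
    foreach ($incoming as [$agent, $name]) {
        $key = strtolower($agent);
        if (!isset($seen[$key])) {
            $existing[] = [$agent, $name];
            $seen[$key] = true;
        }
    }

    return $existing;
}

$existing = [['Googlebot', 'Google'], ['msnbot', 'MSN']];
$incoming = [['Googlebot', 'Google'], ['Baiduspider', 'Baidu']];

$merged = add_missing_spiders($existing, $incoming);
echo count($merged), "\n"; // 3
```

Running the same list through twice leaves the result unchanged, which is why installing over a previous version should not create duplicates under this assumption.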
Useful Links
- Manual Installation Of Mods (https://wiki.simplemachines.org/smf/Manual_installation_of_mods)
- How Do I Modify Files? (https://www.simplemachines.org/community/index.php?topic=24110.0)
Support
Please use the modification thread for support with this modification.
(Please don't ask me to do the edits for you)
Got it.
great Mod
Works fine on my board. :P Nice work.
I like your mod but I'm using 1.1.4.
Anyway, thank you, karlbenson, for the effort.
D.S
Will this mod work on 1.1.4 boards? It seems okay; I'm just disappointed that there isn't one for 1.1.4.
No.
This mod only adds rows to the default 2.0 feature of search engine spider tracking.
However I use 1.1.4 on my own forum, and what I've done is used the spiders from this mod in the Googlebot & Spiders mod (replacing the existing spiders in that mod which may be a little out of date/doesn't detect the newer ones).
Let me know if you want me to attach that as an attachment with some manual directions.
excellent as always
Quote from: karlbenson on April 13, 2008, 10:02:29 AM
Let me know if you want me to attach that as an attachment with some manual directions.
Yes please, karlbenson, if you can do that. I hope it helps us.
D.S
Ok here it is attached with manual instructions.
There are a handful in here that are not in the More Spiders mod yet. I've been waiting to detect them on my forum.
Edit: Updated attachment 20th May 2008 - v1.2
Thank you, dude, for helping me.
D.S
Let me know if there are any problems.
I've been using it on my forum (http://www.youposted.com) for the past week without issue.
And have noticed all kinds of different crawlers/spiders/bots on my forum.
I've now seen about 90%+ of all these bots on my own forum. Including some more. Like a Russian SE spider. Not sure why as my forum is entirely in English.
karlbenson do you have the same mod for SMF 1.1.4? Awesome mod by the way!;)
As stated above, it's 2.x only because ALL this mod does is add more rows of spiders to the default 2.0 feature of search engine spider tracking.
There is no such feature in 1.1.x.
However I use 1.1.4 on my own forum, and what I've done is used the spiders from this mod in the Googlebot & Spiders mod (replacing the existing spiders in that mod which may be a little out of date/doesn't detect the newer ones).
That is what I've attached further above.
ok thanks i will do the same then:)
Another great mod!
Thanks.
Very cool, thank you! 8)
What are the chances of you doing a version of the spiderbot updates for 1.1.4, but with just the spiders only?
Something like an add-on to the 1.1.4 version.
I probably won't do it as an add-on mod to the Googlebot mod.
But I'll continue to update the attached list (as long as my own forum remains at 1.1.4).
It's a single edit taking less than a minute to do.
http://www.simplemachines.org/community/index.php?topic=233636.msg1507179#msg1507179
What exactly does this do?
In SMF 2.x it adds the ability to detect (and optionally restrict) spiders on your forum.
By default only some googlebot/slurp(yahoo)/msnbot are detected.
This basically adds many many more.
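As a rough picture of what "detecting" a spider means here: each entry pairs a piece of user agent text with a display name, and a visitor whose user agent contains that text is counted as that spider. The sketch below is not SMF's own code; `find_spider()` and the `$spiders` list are hypothetical, and whether SMF compares case-sensitively is not something this sketch claims to reproduce.

```php
<?php
// Sketch of user-agent based spider detection. This is not SMF's own code;
// find_spider() and the $spiders list are hypothetical names for illustration.

function find_spider(string $userAgent, array $spiders): ?string
{
    foreach ($spiders as [$needle, $name]) {
        // Case-insensitive substring test against the visitor's user agent.
        if (stripos($userAgent, $needle) !== false) {
            return $name;
        }
    }
    return null; // No match: treat the visitor as an ordinary guest.
}

$spiders = [
    ['Googlebot', 'Google'],
    ['slurp', 'Yahoo'],
    ['msnbot', 'MSN'],
    ['Baiduspider', 'Baidu'], // the kind of extra entry a list like this adds
];

$ua = 'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)';
var_dump(find_spider($ua, $spiders));
var_dump(find_spider('Mozilla/5.0 (Windows NT 10.0) Firefox/115.0', $spiders));
```

With only the first three entries, the Baidu visitor would fall through to `null` and show up as a guest, which is the gap the longer list is meant to close.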
Just wondering though, what good do spiders do on your forum?
Spiders get you in search engines.
If you don't allow spiders, then don't expect to appear in Google, Yahoo, MSN or any other search engine.
Some of the spiders/bots/crawlers in this mod are not for search engines, but tools like W3C Validator, so you can see when it's being run on your forum.
I just installed the " Googlebot & Spiders " mod and made the changes you said to change and now my homepage won't load.
Did it work after installing the mod, but before making the changes I posted?
You may have made a slight mistake (and so get a white page).
Double check your edits.
If you still can't spot it, upload your Sources/Subs.php here and I'll take a quick look.
I deleted all the edits I made and will re-attempt later and let you know :).
Thank you very much for the mod :D
In this (http://www.simplemachines.org/community/index.php?topic=19243.0) who.template I don't have a "$known_spiders = array ("
I have a " $known_agents = array ("
Is it the same? Can I add the spider list you provided to this who.template?
Thanks once more.
I'm not sure what you're using it with.
The attached edits I posted are for the Googlebot mod, which has them in Sources/Subs.php.
Not who.template
1.1 - 4th May 2008
o Fixed Alexa/InternetArchive
o 25 More Spiders added (which have been detected on my forum (http://www.youposted.com/) in the past month)
Quote from: karlbenson on May 02, 2008, 07:15:34 PM
I'm not sure what you're using it with.
The attached edits I posted are for the Googlebot mod, which has them in Sources/Subs.php.
Not who.template
Got it ;) Thanks again!
Will it work the same if I copy the spider list into the who.template?
@rumfa are you referring to Who.template.php spiders mod
http://www.simplemachines.org/community/index.php?topic=19243.0
You would have to manipulate the array as posted in the attachment.
OK, I did it. Added the whole list. How do I add a custom spider? I have some local spiders here. It is:
(85.10.36.115, Mozilla/5.0 (compatible; Pogodak.co.yu/3.1))
Do I just add the following?
Quote
array (
'agent' => 'Pogodak',
'spidername' => 'Pogodak',
'spider' => true,
),
And of course do the same in Subs.php, but without 'spider' => true,...
yes.
There's one I've seen a lot that isn't in your list (haven't looked at the regular 2.0b code yet):
oBot = Cobion.com
From what I could glean from the skimpy info on their site last year, it was most likely hunting for copyright violations. It was producing twice as many hits on the database per day as Google and Yahoo combined. I banned it in the robots.txt file and it ignored the ban, so I blocked the sucker in the .htaccess file.
That forum didn't have any problems with hot software. Not all of us are into warez...
There are a few companies employed by the content mafiaa who go around browsing sites.
However I've not included any of them. Mainly because most people wouldn't want them showing on the list.
Thanks for reporting anyway.
Here are two more who come 4-7 at once :S
Mozilla/3.0 (x86 [en] Windows NT 5.1; Sun)
IPs (probably not all of them yet):
217.23.31.154
61.35.100.131
218.28.213.194
76.104.218.228
Mozilla/4.8 [en] (X11; U; Linux 2.4.20-8 i686)
I have no ip for now...
Those will be proxies or content scrapers/email harvesters/hackers/spammers, etc.
1.2 - 20th June 2008
o Added a handful of new spiders which have been detected in the past month
Quote from: FragaCampos on May 02, 2008, 06:03:29 PM
In this (http://www.simplemachines.org/community/index.php?topic=19243.0) who.template I don't have a "$known_spiders = array ("
I have a " $known_agents = array ("
Is it the same? Can I add the spider list you provided to this who.template?
I am also using the custom who.template.php for 1.1.5 and see the same $known_agents = array (...
array (
'agent' => 'sogou spider',
'spidername' => 'Sogou spider',
'spider' => true,
),
array (
'agent' => 'Sogou',
'spidername' => 'Sogou',
),
I'm guessing if I added 'spider' => true to your spiders.txt, I can then replace who.template.php's array with yours. Would that be correct?
looks like it yes.
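Rather than typing 'spider' => true into every entry by hand, the conversion discussed above could be scripted. A minimal sketch, assuming the list has already been loaded as a PHP array in the agent/spidername shape shown in the posts above; `mark_as_spiders()` is a hypothetical helper, not part of either mod.

```php
<?php
// Hypothetical helper: mark every entry of a spider list with 'spider' => true,
// the extra key the who.template array in the posts above carries.

function mark_as_spiders(array $entries): array
{
    return array_map(
        function (array $entry): array {
            $entry['spider'] = true;
            return $entry;
        },
        $entries
    );
}

// Two sample entries in the Subs.php shape (no 'spider' key).
$known_spiders = [
    ['agent' => 'Sogou', 'spidername' => 'Sogou'],
    ['agent' => 'Pogodak', 'spidername' => 'Pogodak'],
];

// Same entries in the who.template shape.
$known_agents = mark_as_spiders($known_spiders);
var_export($known_agents);
```

The original `$known_spiders` array is left untouched, so the same source list can feed both files.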
Quote from: karlbenson on June 20, 2008, 07:01:04 PM
1.2 - 20th June 2008
o Added a handful of new spiders which have been detected in the past month
How about a new .txt file with the updated list for 1.1.x users to update our Subs.php file? :) Pretty please?
edit:
Or is this list up-to-date?
http://www.simplemachines.org/community/index.php?topic=233636.msg1507179#msg1507179
It's a very, very good add-on for my forum!
@LanceDean
Done! - Added/Updated at that link.
Thank you, karlbenson!
When I updated the mod I forgot to update that list as well.
New search engine out with new spiders. Cuil (http://www.cuil.com)
Here is their info on their Spider (http://www.cuil.com/info/webmaster_info/)
How do I use the .txt file you attached?
@SgtMic
Thanks however, Cuil is already detected.
(their spider is called Twiceler).
In fact you'll notice, SgtMic, that Twiceler is banned from a lot of the web for robots abuses (including mine).
@olabaz
The .txt file is basically the same spiders in my mod, but for the Googlebot and Spiders mod (which is for SMF 1.1.x).
You'll need to edit the spiders in Sources/Subs.php, and replace with the list I posted.
You need to edit as per the instructions in the .txt.
But I can't find the line that it says to find in the txt file
Quote from: karlbenson on July 29, 2008, 03:38:35 PM
@SgtMic
In fact you'll notice, SgtMic, that Twiceler is banned from a lot of the web for robots abuses (including mine).
Interesting.
But I can't find the line that it says to find in the txt file
@olabaz
Are you using SMF1.x?
Is the GoogleBot and Spiders mod installed?
The answer to both of the above should be YES.
Then edit
Sources/Subs.php
Find the array of spiders which starts
$known_spiders = array (
Then replace them.
I got it perfectly working in my 1.1.5 Forum.
Your Notes are extremely helpful Karlbenson.
I have One request.
Could I set it so that only admins are allowed to see the spider count on the board index and other pages?
Other users would see only the normal figures, such as the number of guests.
If this has already been discussed, please point me there.
Thanks for this wonderful modification, Karlbenson.
Hi, I would like to get the code working in my Dilber MC theme.
The Dilber MC theme has its own board index template page.
Could you help me make the code show this many bots on my Dilber MC forum index page?
This mod works for SMF 2.x only.
This mod works on ALL themes, since it only contains database entries.
No edits to files are made by this mod. All it does is add lots of rows to the search engines table in your database.
You need to enable the 'search engines' feature of SMF inside your admin panel.
If you are using SMF 1.1.x, then you'll need to use the Googlebot and Spiders mod.
In Default Theme,
120 Spiders, 370 Guests, 16 Users, 0 Users in Chat (9 Buddies)
In Dilbermc Theme,
365 Guests, 17 Users, 0 Users in Chat (9 Buddies)
It does not show the spider count or the spider list in the Dilber MC theme.
You'll need to raise the issue separately on smf.
I cannot help you with the edits, since this mod doesn't make any edits; it only adds to existing SMF functionality.
Ok ok i understood.
I will start a new topic or update the Googlebot & Spiders topic.
Thanks for your help Karlbenson.
--------- Updated : -------------
http://www.simplemachines.org/community/index.php?topic=38003.msg1689961#msg1689961
Another great mod :D
I hope this mod will be made compatible with SMF 1.1.6, because I'm using 1.1.6.
I hope someone will create a mod like this for SMF 1.1.6 and above, because I really like this mod.
This mod uses the built-in features of SMF 2.0 to add more spiders to its spider list. I don't intend to backport this mod when others, such as the Googlebot and Spiders mod, already exist and accomplish this :)
This mod is posted under 1.1.7 mod section while it works for SMF 2.0 Beta3 and above. So it would be great if the moderators could remove this from 1.1.7 section and add it to mod 2.0
I don't understand? The mod shows that it is only compatible with 2.0 Beta 3, 2.0 Beta 3.1 and 2.0 Beta 4.
For some reason this mod showed up under the 1.1.7 section. But now when I look at it, it's not there.
I don't understand what happened. Did somebody move it? Or was I looking at the wrong section (I doubt it)? Either way, it is in the right place now. Thanks
Nobody has touched it. I did look at its page to see if it had possibly got in there, but it wasn't showing that it was for 1.1.7 at all.
Fatal error: Call to undefined function db_extend() in /home/xxxxx/public_html/Packages/temp/AddMoreSpiders.php on line 21
I have this error. What's wrong?
This modification is only for SMF 2.0 or higher. This will not install for SMF 1.1 or 1.0
I am using version 1.1.7 I have tried to install the More Spiders v1.2 I get this error when I click install:
Fatal error: Call to undefined function: db_extend() in /home/content/m/o/r/XXXXXXXXX/html/support/forums/Packages/temp/AddMoreSpiders.php on line 21
Quote from: mtechama on February 01, 2009, 12:30:46 PM
I am using version 1.1.7 I have tried to install the More Spiders v1.2 I get this error when I click install:
Fatal error: Call to undefined function: db_extend() in /home/content/m/o/r/XXXXXXXXX/html/support/forums/Packages/temp/AddMoreSpiders.php on line 21
Right above your post is:
Quote from: SleePy on January 20, 2009, 06:35:26 PM
This modification is only for SMF 2.0 or higher. This will not install for SMF 1.1 or 1.0
Is this mod compatible with SMF 2 RC1?
(Doesn't state it does on the mod page yet.)
Yes
Quote from: karlbenson on April 13, 2008, 04:26:14 PM
Ok here it is attached with manual instructions.
There are a handful in here that are not in the More Spiders mod yet. I've been waiting to detect them on my forum.
Edit: Updated attachment 20th May 2008 - v1.2
I found Baiduspider in my online guest list and not in the spiders list. I have edited Subs.php with the list given in spiders.txt, but it's still not recognising all of them, even though Baiduspider is on the list. The only thing I can see that is possibly different is that this one calls itself Baiduspider+.
I have attached my subs.php if you would be so kind as to take a look at it and see if I did it wrong
Many thanks
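One way to narrow down a case like this is to test the logged user agent against the list entries outside the forum. The sketch below is a diagnostic aid only, not part of either mod; `match_report()` is a hypothetical helper and the user agent string is an assumed example, so substitute the one from your own logs. It shows whether an entry matches exactly, matches only when letter case is ignored, or does not match at all.

```php
<?php
// Diagnostic sketch, not part of any mod: report how a user agent relates to
// each pattern. match_report() is a hypothetical helper.

function match_report(string $userAgent, array $needles): array
{
    $report = [];
    foreach ($needles as $needle) {
        if (strpos($userAgent, $needle) !== false) {
            $report[$needle] = 'exact-case match';
        } elseif (stripos($userAgent, $needle) !== false) {
            $report[$needle] = 'matches only when case is ignored';
        } else {
            $report[$needle] = 'no match';
        }
    }
    return $report;
}

// Assumed example string; substitute the one from your own logs.
$ua = 'Baiduspider+(+http://www.baidu.com/search/spider.htm)';
print_r(match_report($ua, ['Baiduspider', 'baiduspider', 'Googlebot']));
```

A trailing "+" does not stop a plain substring test from matching "Baiduspider", so if the entry reports a match here, the cause is more likely in how or where the list was pasted into the file.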
I have been watching and I noticed quite a few spiders that come on that are in your addspiders file that still show as guests and not spiders (I have the grouped guest, members, spiders they stay in the guest section and don't move to spider section) any idea why?
Quote from: Chit-Chat ChatterBox Boss on May 03, 2009, 04:48:05 PM
I have been watching and I noticed quite a few spiders that come on that are in your addspiders file that still show as guests and not spiders (I have the grouped guest, members, spiders they stay in the guest section and don't move to spider section) any idea why?
Fixed the problem :D
Quote from: Chit-Chat ChatterBox Boss on May 12, 2009, 12:18:09 AM
Quote from: Chit-Chat ChatterBox Boss on May 03, 2009, 04:48:05 PM
I have been watching and I noticed quite a few spiders that come on that are in your addspiders file that still show as guests and not spiders (I have the grouped guest, members, spiders they stay in the guest section and don't move to spider section) any idea why?
Fixed the problem :D
good for you! :-X
Which option in the admin panel lets me set bots to be shown?
If Search Engine Tracking is enabled, just go to Admin -> Members -> Search Engines.
You then can change the settings to show on the online list.
Hello Dear
Whenever I click the "Install Now" button
everything is fine, but it gives an error on the next page saying
Fatal error: Call to undefined function db_extend() in /home/imscie/public_html/chillpoint/Packages/temp/AddMoreSpiders.php on line 21
Please Help
Quote from: Naveeddil on August 23, 2009, 01:46:36 AM
Hello Dear
Whenever I click the "Install Now" button
everything is fine, but it gives an error on the next page saying
Fatal error: Call to undefined function db_extend() in /home/imscie/public_html/chillpoint/Packages/temp/AddMoreSpiders.php on line 21
Please Help
Are you using: SMF 2.0 Beta 3 Public, 2.0 Beta 3.1 (and above)?
It will not work for ANY earlier versions.
Quote from: kai920 on August 23, 2009, 11:27:25 AM
Are you using: SMF 2.0 Beta 3 Public, 2.0 Beta 3.1 (and above)?
It will not work for ANY earlier versions.
I'm using SMF 1.1.10
does it work with it?
Quote from: kai920 on August 23, 2009, 11:27:25 AM
Are you using: SMF 2.0 Beta 3 Public, 2.0 Beta 3.1 (and above)?
It will not work for ANY earlier versions.
okay thanks
2.0 RC1.2??
Is there going to be a new version of More Spiders for SMF 1.1.10, by chance? The version I have doesn't work anymore.
Does it work with SMF 2.0 RC2?
Please update it for SMF 2.0 RC2
Thx
It seemed to install fine on RC2
Hi, Sleepy always has great mods, but I am trying to understand this one. What is the benefit of having this? Does it increase your guest online count?
It identifies the "people" viewing your forum that aren't people, but programs. Some of these programs may be cataloging your forum links for search engines... others might be trying to do something else.
There is a new spider to add to the list.
TopsyLabs
It is a tweet search engine.
THANK YOU
I have had this mod installed for 2 months. Is it automatic? Does it call the spiders to my site? I only have about 6 different ones, which I submitted myself, so am I missing something?
What is the use of spiders, and what do they do?
Quote from: king999 on August 29, 2010, 02:21:19 AM
What is the use of spiders, and what do they do?
Spiders track the information in posts and their links and store it in a database. It's because of the database built by the spiders that we get results in search engines.
Really, this is great.
I just installed this mod and I dont understand what it is doing exactly.
I still can see X guests X members are online, no spiders/bots are shown
Quote from: thesikaleon on February 26, 2011, 03:21:54 PM
I just installed this mod and I dont understand what it is doing exactly.
I still can see X guests X members are online, no spiders/bots are shown
Did you turn on search engine tracking?
Quote from: Arantor on February 26, 2011, 03:41:32 PM
Quote from: thesikaleon on February 26, 2011, 03:21:54 PM
I just installed this mod and I dont understand what it is doing exactly.
I still can see X guests X members are online, no spiders/bots are shown
Did you turn on search engine tracking?
;D I thought there was no such option, because of older SMF versions.
Works great thanks
Which mode shows the spider list in Who's Online?
I am using SMF 2.0 RC3.
Now I have found it.
It only shows everyone, members only, or guests only, but I can't find spiders.
Tell me how I can find the spiders.
Ok, I'm going to merge this topic with that modifications support topic.
Quote from: Arantor on February 26, 2011, 03:41:32 PM
Quote from: thesikaleon on February 26, 2011, 03:21:54 PM
I just installed this mod and I dont understand what it is doing exactly.
I still can see X guests X members are online, no spiders/bots are shown
Did you turn on search engine tracking?
How do I turn on search engine tracking?
I am using SMF 2.0 RC3.
So it installs in RC5, but do I have to do anything else?
Hmm not sure if I can post IPs here or not, if so let me know and I will add an attachment or something..
Anyways, not sure how to find out if an IP showing as a Guest is a bot but as no one knows about my forum I am relatively sure most of these are bots lol, I also have your mod installed so these might be some you have not detected yet.
Also I did a whois on one and it said "TOPSY-1"
Guest (74.112.131.133) 12:25:44 am Viewing the topic Ruins House - by Unknown.
Guest (74.112.131.126) 12:25:35 am Viewing the topic Salisbury - by Unknown.
Guest (74.112.131.128) 12:25:25 am Viewing the topic Sandstone Farm - by Unknown.
Guest (74.112.131.127) 12:25:19 am Viewing the topic Town Pond - by Unknown.
Guest (74.112.131.148) 12:25:16 am Viewing the topic Sandstone Farm - by Unknown.
Guest (207.46.92.16) 12:22:41 am Viewing the topic Town Pond - by Unknown.
Guest (50.16.247.121) 12:22:41 am Viewing the topic Town Pond - by Unknown.
Guest (50.16.239.113) 12:22:11 am Viewing the topic Ruins House - by Unknown.
Guest (184.73.241.30) 12:22:09 am Viewing the topic Salisbury - by Unknown.
Guest (65.52.4.133) 12:21:57 am Viewing the topic Ruins House - by Unknown.
Guest (65.52.17.254) 12:21:55 am Viewing the topic Ruins House - by Unknown.
Guest (173.255.252.37) 12:21:47 am Viewing the topic Sandstone Farm - by Unknown.
Guest (184.72.5.222) 12:21:41 am Viewing the topic Town Pond - by Unknown.
Guest (184.72.24.234) 12:21:41 am Viewing the topic Sandstone Farm - by Unknown.
Guest (204.236.169.67) 12:21:41 am Viewing the topic Salisbury - by Unknown.
Guest (184.72.20.245) 12:21:41 am Viewing the topic Ruins House - by Unknown.
Guest (50.18.121.47) 12:21:40 am Viewing the topic Town Pond - by Unknown.
Guest (184.72.47.71) 12:21:39 am Viewing the topic Town Pond - by Unknown.
Guest (204.236.254.109) 12:21:39 am Viewing the topic Sandstone Farm - by Unknown.
Guest (199.59.149.25) 12:21:39 am Viewing the topic Town Pond - by Unknown.
Guest (199.59.149.55) 12:21:39 am Viewing the topic Town Pond - by Unknown.
Guest (199.59.149.12) 12:21:38 am Viewing the topic Sandstone Farm - by Unknown.
Guest (204.236.186.203) 12:21:38 am Viewing the topic Town Pond - by Unknown.
Guest (50.18.121.55) 12:21:37 am Viewing the topic Ruins House - by Unknown.
Guest (199.59.149.42) 12:21:37 am Viewing the topic Salisbury - by Unknown.
Guest (199.59.149.46) 12:21:37 am Viewing the topic Ruins House - by Unknown.
Guest (50.18.121.55) 12:21:37 am Viewing the topic Ruins House - by Unknown.
Guest (184.72.0.132) 12:21:36 am Viewing the topic Sandstone Farm - by Unknown.
Guest (204.236.181.128) 12:21:36 am Viewing the topic Ruins House - by Unknown.
Guest (199.59.149.38) 12:21:36 am Viewing the topic Ruins House - by Unknown.
Guest (184.72.47.46) 12:21:36 am Viewing the topic Ruins House - by Unknown.
Thank you for the wonderful mod! I thought I'd let everybody know that when I updated my forum from 2.0 RC5 to 2.0 Gold, I had to reinstall all my mods. This mod installed without issues and without emulation. Hope this info helps someone!
Here's a list of the latest spiders I have found crawling my site recently along with their useragent:
~Yandex:
YandexBot — main indexing robot
YandexImages — Yandex.Image indexer
YandexVideo — Yandex.Video indexer
YandexMedia — robot indexing multimedia data
YandexBlogs — blog search robot, indexing post comments
YandexFavicons — favicon indexing robot
YandexWebmaster — a robot that has been directed to a page through the «Add Url» or services «Yandex.Webmaster»;
YandexPagechecker — a robot that validates the micro markup of a page using the «Micro markup validator» form
YandexImageResizer — mobile services robot
YandexDirect — robot indexing pages of sites belonging to the Yandex Advertising Network
YandexDirect — Yandex.Direct robot. This checks the accuracy of an advertised link before moderation
YandexMetrika — Yandex.Metrica robot
YandexNews — Yandex.News robot
YandexCatalog — Yandex.Catalog robot. If a site is offline for several days, it is removed from Catalog. As soon as the site comes online, it will automatically begin to appear in Catalog again
YandexAntivirus — an antivirus robot that checks websites for the presence of malicious code
YandexZakladki — a robot used to verify the availability of pages added to Yandex.Bookmarks
~80 Legs:
008
~Dotbot:
Ezooms
~Bing:
bingbot
These Need updating / changing:
~Yahoo:
Yahoo! Slurp
~Majestic12:
MJ12bot
~Discobot:
discobot (has to be lowercase)
Found Some more:
~TalkTalk:
Trident
~Solomono:
SolomonoBot
~123People:
123peoplebot
~Ahrefs:
AhrefsBot
Is this still being updated?
I am no longer maintaining the mod. (Re those who emailed me) Sleepy has the mod now. Whether he updates it or not for the changes to spiders is up to him.
Thank you
Hello,
I have added a current list of 631 more spiders. After some time it's a good idea to remove any spiders you don't need.
//631 more bots...
array('VoilaBot', 'Voila'),
array('searchdnabot', 'SearchDNA'),
array('kalooga/KaloogaBot', 'KaloogaBot'),
array('iajaBot', 'IajaBot'),
array('Girafabot', 'Girafa'),
array('Gigabot', 'Giga Blast'),
array('ExaBot', 'ExaLead Beta'),
array('Exabot', 'Exabot'),
array('EnaBot', 'EnaBall'),
array('DNAbot/1.0', 'DNAbot'),
array('facebookexternalhit', 'Facebook'),
array('Googlebot-Image', 'Google Images'),
array('ichiro', 'Ichiro Mobile'),
array('008/0.83', '80legs'),
array('urlfan-bot', '://URLFAN'),
array('abot', 'A Bot'),
array('ABACHOBot', 'ABACHOBot'),
array('ABCdatos', 'ABCdatos BotLink'),
array('Aboundex', 'Aboundex'),
array('AboutUsBot', 'AboutUs:Bot'),
array('Accelatech RSSCrawler', 'Accelatech RSS'),
array('Accoona-AI-Agent', 'Accoona'),
array('aconon Index', 'Aconon Index'),
array('AcoonBot', 'Acoon'),
array('AddThis', 'AddThis'),
array('Ahoy!', 'Ahoy!'),
array('AhrefsBot', 'AhrefsBot'),
array('AideRSS', 'AideRSS (PostRank.com)'),
array('ia_archiver', 'Alexa'),
array('bitlybot', 'Alexa Bitlybot'),
array('AlkalineBOT', 'Alkaline'),
array('scooter', 'AltaVista'),
array('crawler/3.0.0', 'Amazon AWS Cloud Based'),
array('JS-Kit URL Resolver', 'Amazon AWS Cloud Based'),
array('EMC Spider', 'Ananzi'),
array('Anthill', 'Anthill'),
array('Synapse', 'Apache Synapse ESB'),
array('Robot/v1.34', 'Apnoti Search Robot'),
array('Aport', 'Aport'),
array('AppleSyndication', 'Apple RSS'),
array('Arachnophilia', 'Arachnophilia'),
array('Araneo', 'Araneo'),
array('ArchitextSpider', 'Architext'),
array('arks/1.0', 'Arks'),
array('T312461', 'Artabus'),
array('Ask Jeeves', 'Ask.com'),
array('ASpider', 'ASpider'),
array('ATN_Worldwide', 'ATN Worldwide'),
array('Atomz', 'Atomz'),
array('atraxbot', 'Atrax Solutions'),
array('Attentio', 'Attentio'),
array('attributor/1.13.2', 'Attributor'),
array('AURESYS', 'Auresys'),
array('bbot', 'B Bot'),
array('BabalooSpider', 'Babaloo'),
array('BackRub', 'BackRub'),
array('BaiduMobaider', 'Baidu'),
array('BaiduImagespider', 'Baidu'),
array('Baiduspider', 'Baidu'),
array('BecomeBot', 'Become'),
array('Begun', 'Begun Robot Crawler'),
array('BejiBot', 'BejiBot'),
array('Big Brother', 'Big Brother'),
array('BigBoardsUpdater', 'Big-Boards.com'),
array('BigmirSpider', 'Bigmir'),
array('bingbot', 'Bing'),
array('Birubot', 'Birubot'),
array('Bitacle bot', 'Bitacle'),
array('Biz360 Spider', 'Biz 360'),
array('Bjaaland', 'Bjaaland'),
array('BlackWidow', 'Black Widow'),
array('BlogCrawler by Xango', 'BlogCrawler'),
array('blogdb', 'BlogDB RSS'),
array('blog search engine by BlogFan.ORG', 'BlogFan'),
array('Bloglines', 'Bloglines RSS'),
array('BlogPulse (ISSpider-3.0)', 'BlogPulse RSS'),
array('BlogSearch', 'BlogSearch RSS'),
array('BlogsNowBot', 'BlogsNow'),
array('BlogStreetBot', 'BlogStreet RSS'),
array('BoardTracker', 'Board Tracker'),
array('BoardPulse', 'BoardPulse'),
array('BoardReader', 'BoardReader'),
array('BoardViewer', 'BoardViewer'),
array('boitho.com-robot', 'Boitho'),
array('boitho.com-dc', 'Boitho Web Crawler'),
array('borg-bot', 'Borg'),
array('BotOnParade', 'BotOnParade'),
array('BOTW Spider', 'BotW'),
array('BDFetch', 'BrandProtect'),
array('BSpider', 'BSpider'),
array('Bulkfeeds', 'Bulkfeeds'),
array('Butterfly', 'Butterfly Topsy Crawler'),
array('CACTVS Chemistry Spider', 'CACTVS Chemistry'),
array('Calif', 'Calif'),
array('CaRP/3.6Evolution', 'CaRP RSS'),
array('CatchBot', 'Catch'),
array('mxbot', 'Chainn'),
array('ChangeDetection', 'ChangeDetection'),
array('Charlotte', 'Charlotte'),
array('Checkbot', 'Checkbot'),
array('ChitikaBot', 'ChitikaBot'),
array('ChristCrawler.com', 'Christ Crawler'),
array('www.cienciaficcion.net', 'cIeNcIaFiCcIoN'),
array('CipinetBot', 'Cipinet'),
array('CJNetworkQuality', 'CJ'),
array('YesupBot', 'Clicksor.com'),
array('CligooRobot', 'Cligoo'),
array('CLX.ru Bot', 'CLX.ru'),
array('CMC/0.01', 'CMC/0.01'),
array('ColdFusion', 'ColdFusion'),
array('combine', 'Combine System'),
array('Crawler ([email protected])', 'Comet Systems'),
array('CCBot', 'CommonCrawl'),
array('COMODOspider', 'COMODO'),
array('ComputingSite Robi', 'ComputingSite Robi'),
array('conceptbot', 'ConceptBot'),
array('Cooby.de Crawler', 'Cooby'),
array('CoolBot', 'CoolBot'),
array('Covario', 'Covario'),
array('twiceler.', 'Cuil'),
array('Twiceler-0.9', 'Cuil'),
array('Cusco', 'Cusco'),
array('CyberSpyder', 'CyberSpyder'),
array('daumoa', 'Daum'),
array('daypopbot', 'Daypop RSS'),
array('DeadLinkCheck', 'Dead Link Check'),
array('Deepnet Explorer', 'Deepnet Explorer'),
array('DesertRealm.com', 'Desert Realm'),
array('Deweb', 'Deweb'),
array('Die Blinde Kuh', 'Die Blinde Kuh'),
array('dienstspider', 'Dienst'),
array('Diffbot', 'Diffbot'),
array('Digger/1.0 JDK/1.3.0', 'Digger'),
array('Digimarc WebReader', 'Digimarc MarcSpider'),
array('Digimarc CGIReader', 'Digimarc Marcspider/CGI'),
array('DIIbot', 'Digital Integrity Robot'),
array('grabber', 'Direct Hit Grabber'),
array('discobot', 'Discobot'),
array('DLE_Spider', 'DLE'),
array('Dolphin', 'Dolphin'),
array('Domnutch-Bot', 'Domnutch-Bot'),
array('DotBot', 'DotBot'),
array('DotBot/1.1', 'dotnetdotcom.org'),
array('DotTK SiteCheck', 'DotTK SiteCheck'),
array('DragonBot/1.0 libwww/5.0', 'DragonBot'),
array('Drupal', 'Drupal'),
array('DWCP/2.0', 'DWCP (Dridus Web Cataloging Project)'),
array('e-SocietyRobot', 'e-Society'),
array('EbiNess/0.01a', 'EbiNess'),
array('dragonfly([email protected])', 'eBingBong'),
array('edgeio-retriever', 'Edgeio'),
array('EIT-Link-Verifier-Robot/0.2', 'EIT Link Verifier Robot'),
array('elfinbot', 'Elfin Bot'),
array('abby', 'Ellerdale Project'),
array('Emacs-w3/v[0-9\.]+', 'Emacs-w3 Search Engine'),
array('Embedly', 'Embedly'),
array('envolk', 'envolk'),
array('ESISmartSpider', 'ESI Smart'),
array('esther', 'Esther'),
array('eSyndiCat Bot', 'eSyndiCat Bot'),
array('EuripBot', 'EuripBot'),
array('Eurobot/1.1', 'Eurobot'),
array('EventGuruBot', 'EventGuruBot'),
array('Evliya Celebi', 'Evliya Celebi'),
array('exactseek-pagereaper', 'Exact Seek'),
array('ExactSeek_Spider', 'Exact Seek'),
array('NG/2.0', 'ExaLead'),
array('Ezooms', 'Ezooms'),
array('Facebook', 'Facebook share follower'),
array('factbot', 'FactBites'),
array('fast-webcrawler', 'FAST / AlltheWeb'),
array('FastCrawler', 'FastCrawler'),
array('Feed24.com', 'Feed 24 RSS'),
array('FeedBlitz', 'Feed Blitz'),
array('UniversalFeedParser', 'Feed Parser'),
array('FeedBurner', 'FeedBurner RSS'),
array('FeedHub MetaDataFetcher', 'FeedHub'),
array('Feedster Crawler', 'Feedster Inc.'),
array('Feedtrace-bot', 'Feedtrace'),
array('FeedValidator/', 'FeedValidator'),
array('FEHLSTART Superspider', 'Fehlstart'),
array('FelixIDE', 'Felix IDE'),
array('ESIRover', 'FetchRover'),
array('fido', 'Fido'),
array('findlinks', 'FindLinks'),
array('FindoryBot', 'Findory'),
array('Firebat', 'Firebat'),
array('Fish-Search-Robot', 'Fish Search'),
array('Mozilla/4.0 (compatible: FDSE robot)', 'Fluid Dynamics'),
array('FollowSite Bot', 'FollowSite Bot'),
array('libwww-perl/5.810', 'Forexitalia'),
array('fouineur.9bit.qc.ca', 'Fouineur'),
array('Freecrawl', 'Freecrawl'),
array('FreeWebMonitoring', 'FreeWebMonitoring'),
array('FunnelWeb', 'FunnelWeb'),
array('GaisBot', 'Gais'),
array('gamekitbot', 'Gamekit'),
array('gamma', 'gammaSpider'),
array('GarlikCrawler', 'Garlik Crawler'),
array('gazz', 'Gazz'),
array('gcreep', 'GCreep'),
array('genieBot', 'Genie Bot'),
array('GeoHasher', 'GeoHasher'),
array('geourl', 'GeoURL'),
array('GetterroboPlus', 'GetterroboPlus Puu'),
array('GetURL.rexx', 'GetURL'),
array('GingerCrawler', 'GingerCrawler'),
array('UnwindFetchor', 'GNIP'),
array('Goku', 'Goku'),
array('Golem', 'Golem'),
array('gonzo', 'Gonzo'),
array('gooblogsearch', 'Goo Blog Search'),
array('Googlebot', 'Google'),
array('Mediapartners-Google', 'Google AdSense'),
array('Adsbot-Google', 'Google Adwords'),
array('AppEngine-Google', 'Google AppEngine'),
array('FeedFetcher-Google', 'Google FeedFetcher'),
array('kw-lp-suggest', 'Google LP Keyword Checker Bot'),
array('Googlebot-Mobile', 'Google Mobile'),
array('PageFetcher-Google-CoOp', 'Google PageFetcher CoOp'),
array('Google-Sitemaps/1.0', 'Google Sitemaps'),
array('Googlebot-Video', 'Google Video'),
array('Google Web Preview', 'Google Web Preview'),
array('Google Wireless Transcoder', 'Google Wireless Transcoder'),
array('GosoSpider', 'Goso'),
array('Gpostbot', 'Gpost'),
array('griffon', 'Griffon'),
array('Gromit', 'Gromit'),
array('http://grub.org', 'Grub Client'),
array('Gulper Web Bot', 'Gulper'),
array('GurujiBot', 'Guruji'),
array('Hämähäkki', 'Hämähäkki'),
array('havIndex', 'HavIndex'),
array('HeinrichderMiragoRobot', 'Heinrichder Mirago'),
array('HenryTheMiragoRobot', 'Henry The Mirago Robot'),
array('archive.org_bot', 'Heritrix'),
array('HKU WWW Robot', 'HKU WWW Octopus'),
array('HolyCowDude', 'HolyCowDude RSS'),
array('Hometown', 'Hometown'),
array('HostTracker.com/1.0', 'HostTracker'),
array('htdig', 'ht://Dig'),
array('HTMLgobble', 'HTML Gobble'),
array('AITCSRobot', 'HTML Index'),
array('HuaweiSymantecSpider', 'Huawei Symantec'),
array('I Robot', 'I, Robot'),
array('http://www.almaden.ibm.com/cs/crawler', 'IBM Almaden'),
array('IBM_Planetwide', 'IBM Planetwide'),
array('+http://www.icerocket.com/', 'IceRocket'),
array('ichiro', 'Ichiro'),
array('igde', 'igdeSpyder'),
array('IlTrovatore-Setaccio', 'IlTrovatore Setaccio'),
array('Mozilla 3.01 PBWF (Win95)', 'Imagelock'),
array('IncyWincy', 'IncyWincy'),
array('Indy Library', 'Indy Library'),
array('Informant', 'Informant'),
array('InfoSeek Robot', 'InfoSeek Robot 1.0'),
array('Infoseek Sidewinder', 'Infoseek Sidewinder'),
array('infoSpider', 'infoSpider'),
array('INGRID', 'Ingrid'),
array('slurp@inktomi', 'Inktomi'),
array('Insitor', 'Insitor'),
array('inspectorwww', 'Inspector Web'),
array('IAGENT', 'IntelliAgent'),
array('Intelliseek', 'Intelliseek'),
array('Internet Cruiser Robot', 'Internet Cruiser'),
array('SCHOOLCARE; SV1; InfoPath.1', 'Internet for learning'),
array('InternetLinkAgent', 'Internet Link Agent'),
array('3GSE bot', 'Internet Research Institute UK'),
array('internetseer', 'Internet Seer'),
array('sharp-info-agent', 'Internet Shinchakubin'),
array('Pogodak', 'Interseek'),
array('Iron33', 'Iron33'),
array('IsraeliSearch', 'Israeli Search'),
array('itchBot', 'Itch'),
array('JavaBee', 'JavaBee'),
array('JBot', 'JBot'),
array('JCrawler', 'JCrawler'),
array('JetBot', 'JetEye'),
array('JoBo', 'JoBo'),
array('Jobot', 'Jobot'),
array('JoeBot', 'JoeBot'),
array('JSpider', 'JSpider'),
array('jumpstation', 'JumpStation'),
array('Jyxobot', 'Jyxo'),
array('image.kapsi.net', 'Kapsi Images'),
array('Katipo', 'Katipo'),
array('KDD-Explorer', 'KDD Explorer'),
array('KIT-Fireball', 'KIT Fireball'),
array('kmbot', 'knowmore'),
array('KO_Yappo_Robot', 'KO Yappo'),
array('LabelGrab', 'LabelGrabber'),
array('larbin', 'Larbin'),
array('LeapTag', 'LeapTag News Reader'),
array('LexxeBot', 'Lexxe'),
array('lwp-trivial', 'libwww-perl'),
array('linkalarm', 'Link Alarm'),
array('Linkidator', 'Link Validator'),
array('LinkedInBot', 'LinkedIn'),
array('LinkScan Server', 'LinkScan'),
array('LinkWalker', 'LinkWalker'),
array('livedoorCheckers/', 'Livedoor Checkers'),
array('Lockon', 'Lockon'),
array('logo.gif crawler', 'logo.gif'),
array('LuminateBot', 'LuminateBot'),
array('Lycos', 'Lycos'),
array('Apple-PubSub/59', 'Mac OS X RSS'),
array('Magpie', 'Magpie'),
array('magpie-crawler', 'Magpie Crawler'),
array('Mail.Ru', 'Mail.Ru'),
array('MJ12bot', 'Majestics MJ12bot'),
array('Mammoth', 'Mammoth'),
array('Marvin', 'Marvin'),
array('marvin/infoseek', 'marvin/infoseek'),
array('M/3.8', 'Mattie'),
array('MediaFox', 'MediaFox'),
array('mercator', 'Mercator'),
array('MerzScope', 'MerzScope'),
array('METASpider', 'Meta'),
array('Metaeuro', 'MetaEuro'),
array('MetaGer-LinkChecker', 'MetaGer'),
array('MetaURI', 'MetaURI'),
array('MSR-ISRCCrawler', 'Microsoft Research'),
array('MindCrawler', 'MindCrawler'),
array('Miva', 'Miva'),
array('MLBot', 'MLBot'),
array('UdmSearch', 'MNO GoSearch'),
array('mnoGoSearch', 'mnoGoSearch'),
array('moget', 'Moget'),
array('MOMspider', 'MOM'),
array('Monster', 'Monster'),
array('Moreoverbot', 'Moreover'),
array('Mp3Bot', 'Mp3Realm'),
array('msnbot', 'MSNBot'),
array('msnbot-media', 'MSNBot (Media Search)'),
array('msnbot-mobile', 'MSNBot (Mobile)'),
array('msnbot-newsblogs', 'MSNBot (News Search)'),
array('msnbot-products', 'MSNBot (Product Search)'),
array('MSRBOT', 'MSRBot'),
array('MuscatFerret', 'Muscat Ferret'),
array('MwdSearch', 'Mwd.Search'),
array('Najdi.si', 'Najdi.si'),
array('NPBot', 'NameProtect'),
array('NaverBot', 'NaverBot'),
array('NEC-MeshExplorer', 'NEC MeshExplorer'),
array('Nederland.zoek', 'Nederland.zoek'),
array('NerdByNature.Bot', 'NerdByNature'),
array('NetCarta CyberPilot Pro', 'NetCarta WebMap'),
array('Netcraft', 'Netcraft Web Server Survey'),
array('NetMechanic', 'NetMechanic'),
array('NetNewsWire', 'NetNewsWire RSS'),
array('NetScoop', 'NetScoop'),
array('NIF', 'News is Free RSS'),
array('newscan-online', 'Newscan Online'),
array('(X11; compatible; [email protected]; HTTPClient 3.1', 'Newstin'),
array('NextGenSearchBot 1', 'NextGen Search Bot'),
array('NHSEWalker', 'NHSE Web Forager'),
array('NimbleCrawler', 'NimbleCrawler'),
array('NjuiceBot', 'NjuiceBot'),
array('Nomad', 'Nomad'),
array('Norbert the Spider', 'Norbert'),
array('Gulliver', 'Northern Light'),
array('Nutch', 'Nutch'),
array('explorersearch', 'NZ Explorer'),
array('Occam', 'Occam'),
array('Ocelli', 'Ocelli'),
array('omgilibot', 'Omgili'),
array('omgilibot/0.3 +http://www.omgili.com/Crawler.html', 'omgilibot'),
array('OneRiot', 'OneRiot'),
array('Me.dium', 'OneRiot.com'),
array('LargeSmall', 'OneSpot'),
array('Online24-Bot', 'Online24-Bot'),
array('OOZBOT', 'OOZBOT'),
array('Openbot', 'Openfind'),
array('Openfind', 'Openfind data gatherer'),
array('OpenISearch', 'OpenISearch'),
array('Orbsearch', 'Orb Search'),
array('OWPBot', 'OWPBot'),
array('PackRat', 'Pack Rat'),
array('PageBoy', 'PageBoy'),
array('panscient.com', 'Panscient'),
array('ParaSite', 'ParaSite'),
array('ParchBot', 'ParchmentHill'),
array('Patric', 'Patric'),
array('PEGASUS', 'Pegasus'),
array('PerlCrawler/1.0 Xavatoria/2.0', 'PerlCrawler 1.0'),
array('PGP-KA', 'PGP Key Agent'),
array('Duppies', 'Phantom'),
array('phpdig', 'PhpDig'),
array('psbot/0.1 (+http://www.picsearch.com/bot.html) (51dc65875976ac434c09274f7e46dec6)', 'Picsearch'),
array('PiltdownMan', 'Piltdown Man'),
array('Pimptrains robot', 'Pimptrain'),
array('pingalink', 'Ping A Link'),
array('pingdom.com_bot', 'Pingdom.com Bot'),
array('Pioneer', 'Pioneer'),
array('PluckFeedCrawler', 'Pluck'),
array('Plukkie', 'Plukkie'),
array('PlumtreeWebAccessor', 'Plumtree Web Accessor'),
array('PodNova', 'Pod Nova'),
array('Pompos', 'Pompos'),
array('Poppi', 'Poppi'),
array('gestaltIconoclast', 'Popular Iconoclast'),
array('PortalBSpider', 'Portal B'),
array('PortalJuice.com', 'Portal Juice'),
array('PostRank', 'PostRank'),
array('ProCogBot', 'ProCog Bot'),
array('psbot', 'PSBot'),
array('PycURL', 'PycURL'),
array('Qango.com Web Directory', 'Qango'),
array('R6_FeedFetcher', 'Radian6'),
array('R6_CommentReader', 'Radian6'),
array('R6_CommentReader', 'Radian6 Comment Reader'),
array('R6_FeedFetcher', 'Radian6 FeedFetcher'),
array('StackRambler', 'Rambler'),
array('Raven', 'Raven Search'),
array('RixBot', 'REBOL IndeXer'),
array('rdfbot', 'Rediff'),
array('Resume Robot', 'Resume Robot'),
array('Road Runner: ImageScape Robot', 'Road Runner: ImageScape Robot'),
array('RHCS', 'RoadHouse Crawling System'),
array('Robbie', 'Robbie'),
array('RoboCrawl', 'RoboCrawl'),
array('Robofox', 'RoboFox'),
array('Robot du CRIM 1.0a', 'Robot Francoroute'),
array('Robozilla', 'Robozilla'),
array('Roverbot', 'Roverbot'),
array('RSS-SPIDER', 'RSS Feed Seeker'),
array('RuLeS', 'RuLeS'),
array('SafetyNet Robot', 'SafetyNet'),
array('SBIder', 'SBIder RSS'),
array('Scarlett', 'Scarlett'),
array('Scharia', 'Scharia'),
array('Science-Index', 'Science Index'),
array('ScooperBot', 'ScooperBot'),
array('ScoutJet', 'ScoutJet'),
array('Scrubby/3.0', 'Scrubby'),
array('SearchNZ', 'Search NZ'),
array('search17', 'Search17'),
array('searchprocess', 'SearchProcess'),
array('SBSearch', 'Secret Search Engine Labs'),
array('Seekbot', 'Seekbot'),
array('SemrushBot', 'SemrushBot'),
array('SemtoBot', 'SemtoBot'),
array('Senrigan', 'Senrigan'),
array('Sensis Web Crawler', 'Sensis'),
array('spbot', 'SEOprofiler'),
array('ServiceUptime.robot', 'ServiceUptime'),
array('SeznamBot', 'Seznam Fulltext Blog'),
array('SG-Scout', 'SG Scout'),
array('Shagseeker', 'ShagSeeker'),
array('ShaiHulud', 'ShaiHulud'),
array('SheenBot', 'SheenBot'),
array('ShopWiki', 'Shopwiki [Bot]'),
array('SimilarPages/Nutch', 'SimilarPages/Nutch [Crawler]'),
array('SimBot/1.0', 'Simmany Robot Ver 1.0'),
array('ssearcher100', 'Site Searcher'),
array('Site Valet', 'Site Valet'),
array('http://www.site-list.net', 'Site-List RSS'),
array('SiteBot', 'SiteBot'),
array('SiteTech-Rover', 'SiteTech-Rover'),
array('SiteUptime.com', 'SiteUptime'),
array('SiteVibeBot', 'SiteVibeBot'),
array('+SitiDi.net/SitiDiBot/', 'SitiDi'),
array('SkimBot', 'SkimBot'),
array('aWapClient', 'Skymob'),
array('SLCrawler', 'SLCrawler'),
array('Sleek Spider', 'Sleek'),
array('Snapbot', 'Snap Shots'),
array('Snapbot/1.0', 'Snapbot'),
array('SnapPreviewBot', 'SnapPreviewBot'),
array('Snooper', 'Snooper'),
array('socbot', 'SocBot'),
array('Sogou web spider', 'Sogou'),
array('sohu-search', 'Sohu Search'),
array('Solbot', 'Solbot'),
array('Sosospider', 'Soso'),
array('www.entireweb.com/speedy.html', 'Speedy'),
array('Speedy', 'Speedy'),
array('Sphere Scout', 'Sphere'),
array('Sphider2', 'Sphider'),
array('mouse.house', 'Spider Monkey'),
array('SpiderBot', 'SpiderBot'),
array('spiderline', 'Spiderline Crawler'),
array('SpiderMan', 'SpiderMan'),
array('SpiderPig', 'SpiderPig'),
array('SpiderView', 'SpiderView'),
array('Spinn3r', 'Spinn3r'),
array('squadbot', 'SQuADbot'),
array('suke', 'Suke'),
array('suntek', 'Suntek Search Engine'),
array('superbot.com', 'Super.info Search Bot'),
array('Superfeedr', 'Superfeedr'),
array('Synthesio', 'Synthesio'),
array('Szukacz', 'Szukacz'),
array('Black Widow', 'TACH Black Widow'),
array('Tagoobot', 'Tagoo.ru'),
array('tailsweepblogcrawler', 'Tailsweep'),
array('Tarantula', 'Tarantula'),
array('tarspider', 'TarSpider'),
array('dlw3robot', 'Tcl W3 Robot'),
array('TechBOT', 'TechBOT'),
array('Technoratibot', 'Technorati'),
array('Templeton', 'Templeton'),
array('teoma', 'Teoma/Ask Jeeves'),
array('JubiiRobot', 'The Jubii'),
array('NorthStar', 'The NorthStar Robot'),
array('w3index', 'The NWI Robot'),
array('Peregrinator-Mathematics', 'The Peregrinator'),
array('thumbshots-de-Bot', 'Thumbshots'),
array('T-H-U-N-D-E-R-S-T-O-N-E', 'Thunderstone'),
array('TinEye', 'TinEye'),
array('TITAN', 'Titan'),
array('TitIn', 'TitIn'),
array('TLSpider', 'TLSpider'),
array('turnitinbot', 'Turn it in'),
array('slysearch', 'Turn it in slysearch'),
array('TurtleScanner', 'Turtle'),
array('Tweetmeme', 'Tweetmeme.com'),
array('Twiceler', 'Twiceler (Cuill.com)'),
array('Twingbot', 'Twingbot'),
array('Twingly', 'Twingly'),
array('Twitterbot', 'Twitterbot'),
array('Twitturls', 'Twitturls.com'),
array('Python-urllib', 'Twitturls.com (Python-urllib)'),
array('UCSD-Crawler', 'UCSD Crawl'),
array('UMBC-memeta-Bot', 'UMBC RSS'),
array('Unpartisan', 'Unpartisan RSS'),
array('urlck', 'URL Check'),
array('URL Spider Pro', 'URL Spider Pro'),
array('Valkyrie', 'Valkyrie'),
array('ClickSense', 'ValueClick LM'),
array('Mozilla/4.0 (vBSEO; http://www.vbseo.com)', 'vBSEO'),
array('Verticrawl', 'Verticrawl'),
array('Victoria', 'Victoria'),
array('vision-search', 'Vision Search'),
array('Visions Search', 'Visions'),
array('voyager/1.0', 'Voyager'),
array('VWbot_K', 'VWbot'),
array('W3C-checklink', 'W3C'),
array('W3C_CSS_Validator', 'W3C CSS Validator'),
array('W3C_Validator', 'W3C Validator'),
array('Unicorn', 'W3C Unified Validator'),
array('W3M2', 'W3M2'),
array('w3mir', 'W3mir'),
array('w@pspider', 'w@p'),
array('appie', 'Walhello Appie'),
array('CrawlPaper', 'WallPaper'),
array('root', 'Web Core / Roots'),
array('WebMoose', 'Web Moose'),
array('WebAlta', 'WebAlta'),
array('WebAlta Crawler', 'WebAlta'),
array('WebBandit', 'WebBandit'),
array('WebCatcher', 'WebCatcher'),
array('Webclipping', 'Webclipping'),
array('WebCopy', 'WebCopy'),
array('WebFetcher', 'WebFetcher'),
array('weblayers', 'WebLayers'),
array('WebLinker', 'WebLinker'),
array('wlm', 'Weblog Monitor'),
array('WebQuest', 'WebQuest'),
array('WebReaper', 'WebReaper'),
array('[email protected]', 'Webs'),
array('websearchbench', 'WebSearchBench'),
array('WOLP', 'WebStolperer'),
array('webvac', 'WebVac'),
array('webwalk', 'WebWalk'),
array('WebWalker', 'WebWalker'),
array('WebWatch', 'WebWatch'),
array('www.WebWombat.com.au', 'WebWombat'),
array('Wget', 'Wget'),
array('whatUseek_winona', 'What U Seek Winona'),
array('Whitevector Crawler', 'Whitevector Crawler'),
array('www.whoisde.de', 'Whois DE'),
array('SurveyBot', 'Whois Source'),
array('wikiwix', 'Wikiwix'),
array('Hazels Ferret Web hopper', 'Wild Ferret Web Hopper'),
array('Willow Internet Crawler by Twotrees', 'Willow'),
array('Windows-Live-Social-Object-Extractor-Engine', 'Windows Live SOEE'),
array('Windows-RSS-Platform/1.0', 'Windows RSS Platform 1.0'),
array('Windows-RSS-Platform/2.0', 'Windows RSS Platform 2.0'),
array('WinHTTP', 'WinHTTP'),
array('wired-digital-newsbot', 'Wired Digital'),
array('Bilbo', 'Wise-Guys'),
array('Vagabondo', 'Wise-Guys'),
array('zyborg', 'WiseNut'),
array('WordPress', 'WordPress'),
array('woriobot', 'Worio'),
array('OmniExplorer_Bot', 'WorldIndexer'),
array('Project Kolinka Forum Search', 'www.kolinka.com'),
array('WWWC', 'WWWC'),
array('WWWeasel Robot', 'WWWeasel'),
array('wwwster', 'WWWSter'),
array('WWWWanderer', 'WWWWanderer'),
array('TECOMAC-Crawler', 'X-Crawler'),
array('Xenu', 'Xenu Link Sleuth'),
array('XGET', 'XGET'),
array('cosmos', 'XYLEME Robot'),
array('yacybot', 'YaCy'),
array('YahooYSMcm', 'Yahoo Publisher Network'),
array('Yahoo-Blogs', 'Yahoo! Blogs'),
array('YahooFeedSeeker', 'Yahoo! FeedSeeker'),
array('Yahoo-MMCrawler', 'Yahoo! Image Search'),
array('YahooSeeker/M1A1-R2D2', 'Yahoo! Mobile'),
array('Yahoo! Slurp', 'Yahoo! Slurp'),
array('Yahoo-VerticalCrawler', 'Yahoo! Vertical Crawler'),
array('YandexAntivirus', 'Yandex Antivirus'),
array('YandexBlog', 'Yandex Blog'),
array('YandexBot', 'Yandex Bot'),
array('YandexCatalog', 'Yandex Catalog'),
array('YandexDirect', 'Yandex Direct'),
array('YandexFavicon', 'Yandex Favicon'),
array('YandexImageResizer', 'Yandex ImageResizer'),
array('YandexImages', 'Yandex Images'),
array('YandexMedia', 'Yandex Media'),
array('YandexMetrika', 'Yandex Metrika'),
array('YandexNews', 'Yandex News'),
array('YandexPagechecker', 'Yandex Pagechecker'),
array('YandexVideo', 'Yandex Video'),
array('YandexWebmaster', 'Yandex Webmaster'),
array('YandexZakladki', 'Yandex Zakladki'),
array('Yanga WorldSearch Bot', 'Yanga'),
array('Yanga WorldSearch Bot', 'Yanga WorldSearch Bot'),
array('YebolBot', 'YebolBot'),
array('yeti', 'Yeti'),
array('Yeti', 'Yeti'),
array('Yeti/1.0', 'Yeti/1.0'),
array('YodaoBot', 'Yodao'),
array('YoudaoBot', 'Youdao'),
array('YRSpider', 'YunRang'),
array('zeus', 'Zeus Internet Marketing'),
array('http://www.zorkk.com', 'Zork RSS'),
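A few user agents in the list above appear more than once (for example R6_FeedFetcher, and yeti / Yeti). Before adding the rows, it may help to filter the list. The following is only a minimal sketch: `dedupe_spiders` is a hypothetical helper, not part of SMF or of the mod, and it assumes user agents are compared case-insensitively, so entries differing only in case are treated as duplicates.

```php
<?php
// Hypothetical helper (not part of SMF): keep only the first entry
// seen for each user-agent string, compared case-insensitively.
function dedupe_spiders(array $spiders): array
{
    $seen = array();
    $unique = array();

    foreach ($spiders as $spider) {
        list($user_agent, $name) = $spider;
        $key = strtolower($user_agent);

        // Skip any user agent that was already kept.
        if (isset($seen[$key])) {
            continue;
        }

        $seen[$key] = true;
        $unique[] = array($user_agent, $name);
    }

    return $unique;
}

// A small sample taken from the list above.
$sample = array(
    array('R6_FeedFetcher', 'Radian6'),
    array('R6_CommentReader', 'Radian6'),
    array('R6_CommentReader', 'Radian6 Comment Reader'),
    array('R6_FeedFetcher', 'Radian6 FeedFetcher'),
    array('yeti', 'Yeti'),
    array('Yeti', 'Yeti'),
);

$unique = dedupe_spiders($sample);
echo count($unique), " unique entries\n"; // 3 unique entries
```

Keeping the first occurrence is an arbitrary choice; if a later entry has the better display name, reorder the list before filtering.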
Is there anywhere in the admin section to view Robot/crawler settings?
And is there any mod to display them in the online list?
Did you activate Spider tracking?
Go to Administration Center -> Core Features -> Search Engine Tracking, activate it, and save.
Yes, that is active.
"There are currently no spider log entries."
Is there any way I can attract them?
Thanks
Did you set the log level to at least Standard?
You can create an account at https://www.google.com/webmasters/tools/ and submit your URL / sitemap there.
It was set to the lowest option.
I have enabled showing "spiders" in the online list. If a spider is on the site, will its name (e.g. "googlebot") be displayed?
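For illustration, here is a rough sketch of how a visitor's user agent could be resolved to one of the names in a spider list. It assumes a case-insensitive substring match where the first hit wins; `find_spider_name` is a hypothetical function written for this example and is not SMF's actual code. The second user agent below is made up to show that very short patterns such as 'root' can match visitors that are not spiders.

```php
<?php
// Hypothetical lookup (not SMF's own implementation): return the display
// name of the first spider whose pattern occurs in the user agent,
// ignoring case, or null when nothing matches.
function find_spider_name(string $user_agent, array $spiders): ?string
{
    foreach ($spiders as $spider) {
        list($pattern, $name) = $spider;

        if ($pattern !== '' && stripos($user_agent, $pattern) !== false) {
            return $name;
        }
    }

    return null;
}

// Three entries taken from the lists in this topic.
$spiders = array(
    array('Googlebot', 'Google'),
    array('bingbot', 'Bing'),
    array('root', 'Web Core / Roots'),
);

$google_ua = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
echo find_spider_name($google_ua, $spiders), "\n"; // Google

// Made-up user agent: it only happens to contain the letters "root".
$loose_ua = 'Mozilla/5.0 (X11; Linux x86_64) GrassrootsBrowser/1.0';
echo find_spider_name($loose_ua, $spiders), "\n"; // Web Core / Roots
```

If matching works roughly like this, short or generic patterns are good candidates for removal when trimming a long list.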
Quote from: kokett on November 12, 2012, 10:36:14 AM
Hello,
I inserted an up-to-date list of 631 more spiders. After some time it is a good idea to remove any spiders you do not need.
//631 more bots...
array('VoilaBot', 'Voila'),
array('searchdnabot', 'SearchDNA'),
array('kalooga/KaloogaBot', 'KaloogaBot'),
array('iajaBot', 'IajaBot'),
array('Girafabot', 'Girafa'),
array('Gigabot', 'Giga Blast'),
array('ExaBot', 'ExaLead Beta'),
array('Exabot', 'Exabot'),
array('EnaBot', 'EnaBall'),
array('DNAbot/1.0', 'DNAbot'),
array('facebookexternalhit', 'Facebook'),
array('Googlebot-Image', 'Google Images'),
array('ichiro', 'Ichiro Mobile'),
array('008/0.83', '80legs'),
array('urlfan-bot', '://URLFAN'),
array('abot', 'A Bot'),
array('ABACHOBot', 'ABACHOBot'),
array('ABCdatos', 'ABCdatos BotLink'),
array('Aboundex', 'Aboundex'),
array('AboutUsBot', 'AboutUs:Bot'),
array('Accelatech RSSCrawler', 'Accelatech RSS'),
array('Accoona-AI-Agent', 'Accoona'),
array('aconon Index', 'Aconon Index'),
array('AcoonBot', 'Acoon'),
array('AddThis', 'AddThis'),
array('Ahoy!', 'Ahoy!'),
array('AhrefsBot', 'AhrefsBot'),
array('AideRSS', 'AideRSS (PostRank.com)'),
array('ia_archiver', 'Alexa'),
array('bitlybot', 'Alexa Bitlybot'),
array('AlkalineBOT', 'Alkaline'),
array('scooter', 'AltaVista'),
array('crawler/3.0.0', 'Amazon AWS Cloud Based'),
array('JS-Kit URL Resolver', 'Amazon AWS Cloud Based'),
array('EMC Spider', 'Ananzi'),
array('Anthill', 'Anthill'),
array('Synapse', 'Apache Synapse ESB'),
array('Robot/v1.34', 'Apnoti Search Robot'),
array('Aport', 'Aport'),
array('AppleSyndication', 'Apple RSS'),
array('Arachnophilia', 'Arachnophilia'),
array('Araneo', 'Araneo'),
array('ArchitextSpider', 'Architext'),
array('arks/1.0', 'Arks'),
array('T312461', 'Artabus'),
array('Ask Jeeves', 'Ask.com'),
array('ASpider', 'ASpider'),
array('ATN_Worldwide', 'ATN Worldwide'),
array('Atomz', 'Atomz'),
array('atraxbot', 'Atrax Solutions'),
array('Attentio', 'Attentio'),
array('attributor/1.13.2', 'Attributor'),
array('AURESYS', 'Auresys'),
array('bbot', 'B Bot'),
array('BabalooSpider', 'Babaloo'),
array('BackRub', 'BackRub'),
array('BaiduMobaider', 'Baidu'),
array('BaiduImagespider', 'Baidu'),
array('Baiduspider', 'Baidu'),
array('BecomeBot', 'Become'),
array('Begun', 'Begun Robot Crawler'),
array('BejiBot', 'BejiBot'),
array('Big Brother', 'Big Brother'),
array('BigBoardsUpdater', 'Big-Boards.com'),
array('BigmirSpider', 'Bigmir'),
array('bingbot', 'Bing'),
array('Birubot', 'Birubot'),
array('Bitacle bot', 'Bitacle'),
array('Biz360 Spider', 'Biz 360'),
array('Bjaaland', 'Bjaaland'),
array('BlackWidow', 'Black Widow'),
array('BlogCrawler by Xango', 'BlogCrawler'),
array('blogdb', 'BlogDB RSS'),
array('blog search engine by BlogFan.ORG', 'BlogFan'),
array('Bloglines', 'Bloglines RSS'),
array('BlogPulse (ISSpider-3.0)', 'BlogPulse RSS'),
array('BlogSearch', 'BlogSearch RSS'),
array('BlogsNowBot', 'BlogsNow'),
array('BlogStreetBot', 'BlogStreet RSS'),
array('BoardTracker', 'Board Tracker'),
array('BoardPulse', 'BoardPulse'),
array('BoardReader', 'BoardReader'),
array('BoardViewer', 'BoardViewer'),
array('boitho.com-robot', 'Boitho'),
array('boitho.com-dc', 'Boitho Web Crawler'),
array('borg-bot', 'Borg'),
array('BotOnParade', 'BotOnParade'),
array('BOTW Spider', 'BotW'),
array('BDFetch', 'BrandProtect'),
array('BSpider', 'BSpider'),
array('Bulkfeeds', 'Bulkfeeds'),
array('Butterfly', 'Butterfly Topsy Crawler'),
array('CACTVS Chemistry Spider', 'CACTVS Chemistry'),
array('Calif', 'Calif'),
array('CaRP/3.6Evolution', 'CaRP RSS'),
array('CatchBot', 'Catch'),
array('mxbot', 'Chainn'),
array('ChangeDetection', 'ChangeDetection'),
array('Charlotte', 'Charlotte'),
array('Checkbot', 'Checkbot'),
array('ChitikaBot', 'ChitikaBot'),
array('ChristCrawler.com', 'Christ Crawler'),
array('www.cienciaficcion.net', 'cIeNcIaFiCcIoN'),
array('CipinetBot', 'Cipinet'),
array('CJNetworkQuality', 'CJ'),
array('YesupBot', 'Clicksor.com'),
array('CligooRobot', 'Cligoo'),
array('CLX.ru Bot', 'CLX.ru'),
array('CMC/0.01', 'CMC/0.01'),
array('ColdFusion', 'ColdFusion'),
array('combine', 'Combine System'),
array('Crawler ([email protected])', 'Comet Systems'),
array('CCBot', 'CommonCrawl'),
array('COMODOspider', 'COMODO'),
array('ComputingSite Robi', 'ComputingSite Robi'),
array('conceptbot', 'ConceptBot'),
array('Cooby.de Crawler', 'Cooby'),
array('CoolBot', 'CoolBot'),
array('Covario', 'Covario'),
array('twiceler.', 'Cuil'),
array('Twiceler-0.9', 'Cuil'),
array('Cusco', 'Cusco'),
array('CyberSpyder', 'CyberSpyder'),
array('daumoa', 'Daum'),
array('daypopbot', 'Daypop RSS'),
array('DeadLinkCheck', 'Dead Link Check'),
array('Deepnet Explorer', 'Deepnet Explorer'),
array('DesertRealm.com', 'Desert Realm'),
array('Deweb', 'Deweb'),
array('Die Blinde Kuh', 'Die Blinde Kuh'),
array('dienstspider', 'Dienst'),
array('Diffbot', 'Diffbot'),
array('Digger/1.0 JDK/1.3.0', 'Digger'),
array('Digimarc WebReader', 'Digimarc MarcSpider'),
array('Digimarc CGIReader', 'Digimarc Marcspider/CGI'),
array('DIIbot', 'Digital Integrity Robot'),
array('grabber', 'Direct Hit Grabber'),
array('discobot', 'Discobot'),
array('DLE_Spider', 'DLE'),
array('Dolphin', 'Dolphin'),
array('Domnutch-Bot', 'Domnutch-Bot'),
array('DotBot', 'DotBot'),
array('DotBot/1.1', 'dotnetdotcom.org'),
array('DotTK SiteCheck', 'DotTK SiteCheck'),
array('DragonBot/1.0 libwww/5.0', 'DragonBot'),
array('Drupal', 'Drupal'),
array('DWCP/2.0', 'DWCP (Dridus Web Cataloging Project)'),
array('e-SocietyRobot', 'e-Society'),
array('EbiNess/0.01a', 'EbiNess'),
array('dragonfly([email protected])', 'eBingBong'),
array('edgeio-retriever', 'Edgeio'),
array('EIT-Link-Verifier-Robot/0.2', 'EIT Link Verifier Robot'),
array('elfinbot', 'Elfin Bot'),
array('abby', 'Ellerdale Project'),
array('Emacs-w3/v[0-9\.]+', 'Emacs-w3 Search Engine'),
array('Embedly', 'Embedly'),
array('envolk', 'envolk'),
array('ESISmartSpider', 'ESI Smart'),
array('esther', 'Esther'),
array('eSyndiCat Bot', 'eSyndiCat Bot'),
array('EuripBot', 'EuripBot'),
array('Eurobot/1.1', 'Eurobot'),
array('EventGuruBot', 'EventGuruBot'),
array('Evliya Celebi', 'Evliya Celebi'),
array('exactseek-pagereaper', 'Exact Seek'),
array('ExactSeek_Spider', 'Exact Seek'),
array('NG/2.0', 'ExaLead'),
array('Ezooms', 'Ezooms'),
array('Facebook', 'Facebook share follower'),
array('factbot', 'FactBites'),
array('fast-webcrawler', 'FAST / AlltheWeb'),
array('FastCrawler', 'FastCrawler'),
array('Feed24.com', 'Feed 24 RSS'),
array('FeedBlitz', 'Feed Blitz'),
array('UniversalFeedParser', 'Feed Parser'),
array('FeedBurner', 'FeedBurner RSS'),
array('FeedHub MetaDataFetcher', 'FeedHub'),
array('Feedster Crawler', 'Feedster Inc.'),
array('Feedtrace-bot', 'Feedtrace'),
array('FeedValidator/', 'FeedValidator'),
array('FEHLSTART Superspider', 'Fehlstart'),
array('FelixIDE', 'Felix IDE'),
array('ESIRover', 'FetchRover'),
array('fido', 'Fido'),
array('findlinks', 'FindLinks'),
array('FindoryBot', 'Findory'),
array('Firebat', 'Firebat'),
array('Fish-Search-Robot', 'Fish Search'),
array('Mozilla/4.0 (compatible: FDSE robot)', 'Fluid Dynamics'),
array('FollowSite Bot', 'FollowSite Bot'),
array('libwww-perl/5.810', 'Forexitalia'),
array('fouineur.9bit.qc.ca', 'Fouineur'),
array('Freecrawl', 'Freecrawl'),
array('FreeWebMonitoring', 'FreeWebMonitoring'),
array('FunnelWeb', 'FunnelWeb'),
array('GaisBot', 'Gais'),
array('gamekitbot', 'Gamekit'),
array('gamma', 'gammaSpider'),
array('GarlikCrawler', 'Garlik Crawler'),
array('gazz', 'Gazz'),
array('gcreep', 'GCreep'),
array('genieBot', 'Genie Bot'),
array('GeoHasher', 'GeoHasher'),
array('geourl', 'GeoURL'),
array('GetterroboPlus', 'GetterroboPlus Puu'),
array('GetURL.rexx', 'GetURL'),
array('GingerCrawler', 'GingerCrawler'),
array('UnwindFetchor', 'GNIP'),
array('Goku', 'Goku'),
array('Golem', 'Golem'),
array('gonzo', 'Gonzo'),
array('gooblogsearch', 'Goo Blog Search'),
array('Googlebot', 'Google'),
array('Mediapartners-Google', 'Google AdSense'),
array('Adsbot-Google', 'Google Adwords'),
array('AppEngine-Google', 'Google AppEngine'),
array('FeedFetcher-Google', 'Google FeedFetcher'),
array('kw-lp-suggest', 'Google LP Keyword Checker Bot'),
array('Googlebot-Mobile', 'Google Mobile'),
array('PageFetcher-Google-CoOp', 'Google PageFetcher CoOp'),
array('Google-Sitemaps/1.0', 'Google Sitemaps'),
array('Googlebot-Video', 'Google Video'),
array('Google Web Preview', 'Google Web Preview'),
array('Google Wireless Transcoder', 'Google Wireless Transcoder'),
array('GosoSpider', 'Goso'),
array('Gpostbot', 'Gpost'),
array('griffon', 'Griffon'),
array('Gromit', 'Gromit'),
array('http://grub.org', 'Grub Client'),
array('Gulper Web Bot', 'Gulper'),
array('GurujiBot', 'Guruji'),
array('Hämähäkki', 'Hämähäkki'),
array('havIndex', 'HavIndex'),
array('HeinrichderMiragoRobot', 'Heinrichder Mirago'),
array('HenryTheMiragoRobot', 'Henry The Mirago Robot'),
array('archive.org_bot', 'Heritrix'),
array('HKU WWW Robot', 'HKU WWW Octopus'),
array('HolyCowDude', 'HolyCowDude RSS'),
array('Hometown', 'Hometown'),
array('HostTracker.com/1.0', 'HostTracker'),
array('htdig', 'ht://Dig'),
array('HTMLgobble', 'HTML Gobble'),
array('AITCSRobot', 'HTML Index'),
array('HuaweiSymantecSpider', 'Huawei Symantec'),
array('I Robot', 'I, Robot'),
array('http://www.almaden.ibm.com/cs/crawler', 'IBM Almaden'),
array('IBM_Planetwide', 'IBM Planetwide'),
array('+http://www.icerocket.com/', 'IceRocket'),
array('ichiro', 'Ichiro'),
array('igde', 'igdeSpyder'),
array('IlTrovatore-Setaccio', 'IlTrovatore Setaccio'),
array('Mozilla 3.01 PBWF (Win95)', 'Imagelock'),
array('IncyWincy', 'IncyWincy'),
array('Indy Library', 'Indy Library'),
array('Informant', 'Informant'),
array('InfoSeek Robot', 'InfoSeek Robot 1.0'),
array('Infoseek Sidewinder', 'Infoseek Sidewinder'),
array('infoSpider', 'infoSpider'),
array('INGRID', 'Ingrid'),
array('slurp@inktomi', 'Inktomi'),
array('Insitor', 'Insitor'),
array('inspectorwww', 'Inspector Web'),
array('IAGENT', 'IntelliAgent'),
array('Intelliseek', 'Intelliseek'),
array('Internet Cruiser Robot', 'Internet Cruiser'),
array('SCHOOLCARE; SV1; InfoPath.1', 'Internet for learning'),
array('InternetLinkAgent', 'Internet Link Agent'),
array('3GSE bot', 'Internet Research Institute UK'),
array('internetseer', 'Internet Seer'),
array('sharp-info-agent', 'Internet Shinchakubin'),
array('Pogodak', 'Interseek'),
array('Iron33', 'Iron33'),
array('IsraeliSearch', 'Israeli Search'),
array('itchBot', 'Itch'),
array('JavaBee', 'JavaBee'),
array('JBot', 'JBot'),
array('JCrawler', 'JCrawler'),
array('JetBot', 'JetEye'),
array('JoBo', 'JoBo'),
array('Jobot', 'Jobot'),
array('JoeBot', 'JoeBot'),
array('JSpider', 'JSpider'),
array('jumpstation', 'JumpStation'),
array('Jyxobot', 'Jyxo'),
array('image.kapsi.net', 'Kapsi Images'),
array('Katipo', 'Katipo'),
array('KDD-Explorer', 'KDD Explorer'),
array('KIT-Fireball', 'KIT Fireball'),
array('kmbot', 'knowmore'),
array('KO_Yappo_Robot', 'KO Yappo'),
array('LabelGrab', 'LabelGrabber'),
array('larbin', 'Larbin'),
array('LeapTag', 'LeapTag News Reader'),
array('LexxeBot', 'Lexxe'),
array('lwp-trivial', 'libwww-perl'),
array('linkalarm', 'Link Alarm'),
array('Linkidator', 'Link Validator'),
array('LinkedInBot', 'LinkedIn'),
array('LinkScan Server', 'LinkScan'),
array('LinkWalker', 'LinkWalker'),
array('livedoorCheckers/', 'Livedoor Checkers'),
array('Lockon', 'Lockon'),
array('logo.gif crawler', 'logo.gif'),
array('LuminateBot', 'LuminateBot'),
array('Lycos', 'Lycos'),
array('Apple-PubSub/59', 'Mac OS X RSS'),
array('Magpie', 'Magpie'),
array('magpie-crawler', 'Magpie Crawler'),
array('Mail.Ru', 'Mail.Ru'),
array('MJ12bot', 'Majestics MJ12bot'),
array('Mammoth', 'Mammoth'),
array('Marvin', 'Marvin'),
array('marvin/infoseek', 'marvin/infoseek'),
array('M/3.8', 'Mattie'),
array('MediaFox', 'MediaFox'),
array('mercator', 'Mercator'),
array('MerzScope', 'MerzScope'),
array('METASpider', 'Meta'),
array('Metaeuro', 'MetaEuro'),
array('MetaGer-LinkChecker', 'MetaGer'),
array('MetaURI', 'MetaURI'),
array('MSR-ISRCCrawler', 'Microsoft Research'),
array('MindCrawler', 'MindCrawler'),
array('Miva', 'Miva'),
array('MLBot', 'MLBot'),
array('UdmSearch', 'MNO GoSearch'),
array('mnoGoSearch', 'mnoGoSearch'),
array('moget', 'Moget'),
array('MOMspider', 'MOM'),
array('Monster', 'Monster'),
array('Moreoverbot', 'Moreover'),
array('Mp3Bot', 'Mp3Realm'),
array('msnbot', 'MSNBot'),
array('msnbot-media', 'MSNBot (Media Search)'),
array('msnbot-mobile', 'MSNBot (Mobile)'),
array('msnbot-newsblogs', 'MSNBot (News Search)'),
array('msnbot-products', 'MSNBot (Product Search)'),
array('MSRBOT', 'MSRBot'),
array('MuscatFerret', 'Muscat Ferret'),
array('MwdSearch', 'Mwd.Search'),
array('Najdi.si', 'Najdi.si'),
array('NPBot', 'NameProtect'),
array('NaverBot', 'NaverBot'),
array('NEC-MeshExplorer', 'NEC MeshExplorer'),
array('Nederland.zoek', 'Nederland.zoek'),
array('NerdByNature.Bot', 'NerdByNature'),
array('NetCarta CyberPilot Pro', 'NetCarta WebMap'),
array('Netcraft', 'Netcraft Web Server Survey'),
array('NetMechanic', 'NetMechanic'),
array('NetNewsWire', 'NetNewsWire RSS'),
array('NetScoop', 'NetScoop'),
array('NIF', 'News is Free RSS'),
array('newscan-online', 'Newscan Online'),
array('(X11; compatible; [email protected]; HTTPClient 3.1', 'Newstin'),
array('NextGenSearchBot 1', 'NextGen Search Bot'),
array('NHSEWalker', 'NHSE Web Forager'),
array('NimbleCrawler', 'NimbleCrawler'),
array('NjuiceBot', 'NjuiceBot'),
array('Nomad', 'Nomad'),
array('Norbert the Spider', 'Norbert'),
array('Gulliver', 'Northern Light'),
array('Nutch', 'Nutch'),
array('explorersearch', 'NZ Explorer'),
array('Occam', 'Occam'),
array('Ocelli', 'Ocelli'),
array('omgilibot', 'Omgili'),
array('omgilibot/0.3 +http://www.omgili.com/Crawler.html', 'omgilibot'),
array('OneRiot', 'OneRiot'),
array('Me.dium', 'OneRiot.com'),
array('LargeSmall', 'OneSpot'),
array('Online24-Bot', 'Online24-Bot'),
array('OOZBOT', 'OOZBOT'),
array('Openbot', 'Openfind'),
array('Openfind', 'Openfind data gatherer'),
array('OpenISearch', 'OpenISearch'),
array('Orbsearch', 'Orb Search'),
array('OWPBot', 'OWPBot'),
array('PackRat', 'Pack Rat'),
array('PageBoy', 'PageBoy'),
array('panscient.com', 'Panscient'),
array('ParaSite', 'ParaSite'),
array('ParchBot', 'ParchmentHill'),
array('Patric', 'Patric'),
array('PEGASUS', 'Pegasus'),
array('PerlCrawler/1.0 Xavatoria/2.0', 'PerlCrawler 1.0'),
array('PGP-KA', 'PGP Key Agent'),
array('Duppies', 'Phantom'),
array('phpdig', 'PhpDig'),
array('psbot/0.1 (+http://www.picsearch.com/bot.html) (51dc65875976ac434c09274f7e46dec6)', 'Picsearch'),
array('PiltdownMan', 'Piltdown Man'),
array('Pimptrains robot', 'Pimptrain'),
array('pingalink', 'Ping A Link'),
array('pingdom.com_bot', 'Pingdom.com Bot'),
array('Pioneer', 'Pioneer'),
array('PluckFeedCrawler', 'Pluck'),
array('Plukkie', 'Plukkie'),
array('PlumtreeWebAccessor', 'Plumtree Web Accessor'),
array('PodNova', 'Pod Nova'),
array('Pompos', 'Pompos'),
array('Poppi', 'Poppi'),
array('gestaltIconoclast', 'Popular Iconoclast'),
array('PortalBSpider', 'Portal B'),
array('PortalJuice.com', 'Portal Juice'),
array('PostRank', 'PostRank'),
array('ProCogBot', 'ProCog Bot'),
array('psbot', 'PSBot'),
array('PycURL', 'PycURL'),
array('Qango.com Web Directory', 'Qango'),
array('R6_FeedFetcher', 'Radian6'),
array('R6_CommentReader', 'Radian6'),
array('R6_CommentReader', 'Radian6 Comment Reader'),
array('R6_FeedFetcher', 'Radian6 FeedFetcher'),
array('StackRambler', 'Rambler'),
array('Raven', 'Raven Search'),
array('RixBot', 'REBOL IndeXer'),
array('rdfbot', 'Rediff'),
array('Resume Robot', 'Resume Robot'),
array('Road Runner: ImageScape Robot', 'Road Runner: ImageScape Robot'),
array('RHCS', 'RoadHouse Crawling System'),
array('Robbie', 'Robbie'),
array('RoboCrawl', 'RoboCrawl'),
array('Robofox', 'RoboFox'),
array('Robot du CRIM 1.0a', 'Robot Francoroute'),
array('Robozilla', 'Robozilla'),
array('Roverbot', 'Roverbot'),
array('RSS-SPIDER', 'RSS Feed Seeker'),
array('RuLeS', 'RuLeS'),
array('SafetyNet Robot', 'SafetyNet'),
array('SBIder', 'SBIder RSS'),
array('Scarlett', 'Scarlett'),
array('Scharia', 'Scharia'),
array('Science-Index', 'Science Index'),
array('ScooperBot', 'ScooperBot'),
array('ScoutJet', 'ScoutJet'),
array('Scrubby/3.0', 'Scrubby'),
array('SearchNZ', 'Search NZ'),
array('search17', 'Search17'),
array('searchprocess', 'SearchProcess'),
array('SBSearch', 'Secret Search Engine Labs'),
array('Seekbot', 'Seekbot'),
array('SemrushBot', 'SemrushBot'),
array('SemtoBot', 'SemtoBot'),
array('Senrigan', 'Senrigan'),
array('Sensis Web Crawler', 'Sensis'),
array('spbot', 'SEOprofiler'),
array('ServiceUptime.robot', 'ServiceUptime'),
array('SeznamBot', 'Seznam Fulltext Blog'),
array('SG-Scout', 'SG Scout'),
array('Shagseeker', 'ShagSeeker'),
array('ShaiHulud', 'ShaiHulud'),
array('SheenBot', 'SheenBot'),
array('ShopWiki', 'Shopwiki [Bot]'),
array('SimilarPages/Nutch', 'SimilarPages/Nutch [Crawler]'),
array('SimBot/1.0', 'Simmany Robot Ver 1.0'),
array('ssearcher100', 'Site Searcher'),
array('Site Valet', 'Site Valet'),
array('http://www.site-list.net', 'Site-List RSS'),
array('SiteBot', 'SiteBot'),
array('SiteTech-Rover', 'SiteTech-Rover'),
array('SiteUptime.com', 'SiteUptime'),
array('SiteVibeBot', 'SiteVibeBot'),
array('+SitiDi.net/SitiDiBot/', 'SitiDi'),
array('SkimBot', 'SkimBot'),
array('aWapClient', 'Skymob'),
array('SLCrawler', 'SLCrawler'),
array('Sleek Spider', 'Sleek'),
array('Snapbot', 'Snap Shots'),
array('Snapbot/1.0', 'Snapbot'),
array('SnapPreviewBot', 'SnapPreviewBot'),
array('Snooper', 'Snooper'),
array('socbot', 'SocBot'),
array('Sogou web spider', 'Sogou'),
array('sohu-search', 'Sohu Search'),
array('Solbot', 'Solbot'),
array('Sosospider', 'Soso'),
array('www.entireweb.com/speedy.html', 'Speedy'),
array('Speedy', 'Speedy'),
array('Sphere Scout', 'Sphere'),
array('Sphider2', 'Sphider'),
array('mouse.house', 'Spider Monkey'),
array('SpiderBot', 'SpiderBot'),
array('spiderline', 'Spiderline Crawler'),
array('SpiderMan', 'SpiderMan'),
array('SpiderPig', 'SpiderPig'),
array('SpiderView', 'SpiderView'),
array('Spinn3r', 'Spinn3r'),
array('squadbot', 'SQuADbot'),
array('suke', 'Suke'),
array('suntek', 'Suntek Search Engine'),
array('superbot.com', 'Super.info Search Bot'),
array('Superfeedr', 'Superfeedr'),
array('Synthesio', 'Synthesio'),
array('Szukacz', 'Szukacz'),
array('Black Widow', 'TACH Black Widow'),
array('Tagoobot', 'Tagoo.ru'),
array('tailsweepblogcrawler', 'Tailsweep'),
array('Tarantula', 'Tarantula'),
array('tarspider', 'TarSpider'),
array('dlw3robot', 'Tcl W3 Robot'),
array('TechBOT', 'TechBOT'),
array('Technoratibot', 'Technorati'),
array('Templeton', 'Templeton'),
array('teoma', 'Teoma/Ask Jeeves'),
array('JubiiRobot', 'The Jubii'),
array('NorthStar', 'The NorthStar Robot'),
array('w3index', 'The NWI Robot'),
array('Peregrinator-Mathematics', 'The Peregrinator'),
array('thumbshots-de-Bot', 'Thumbshots'),
array('T-H-U-N-D-E-R-S-T-O-N-E', 'Thunderstone'),
array('TinEye', 'TinEye'),
array('TITAN', 'Titan'),
array('TitIn', 'TitIn'),
array('TLSpider', 'TLSpider'),
array('turnitinbot', 'Turn it in'),
array('slysearch', 'Turn it in slysearch'),
array('TurtleScanner', 'Turtle'),
array('Tweetmeme', 'Tweetmeme.com'),
array('Twiceler', 'Twiceler (Cuill.com)'),
array('Twingbot', 'Twingbot'),
array('Twingly', 'Twingly'),
array('Twitterbot', 'Twitterbot'),
array('Twitturls', 'Twitturls.com'),
array('Python-urllib', 'Twitturls.com (Python-urllib)'),
array('UCSD-Crawler', 'UCSD Crawl'),
array('UMBC-memeta-Bot', 'UMBC RSS'),
array('Unpartisan', 'Unpartisan RSS'),
array('urlck', 'URL Check'),
array('URL Spider Pro', 'URL Spider Pro'),
array('Valkyrie', 'Valkyrie'),
array('ClickSense', 'ValueClick LM'),
array('Mozilla/4.0 (vBSEO; http://www.vbseo.com)', 'vBSEO'),
array('Verticrawl', 'Verticrawl'),
array('Victoria', 'Victoria'),
array('vision-search', 'Vision Search'),
array('Visions Search', 'Visions'),
array('voyager/1.0', 'Voyager'),
array('VWbot_K', 'VWbot'),
array('W3C-checklink', 'W3C'),
array('W3C_CSS_Validator', 'W3C CSS Validator'),
array('W3C_Validator', 'W3C Validator'),
array('Unicorn', 'W3C Unified Validator'),
array('W3M2', 'W3M2'),
array('w3mir', 'W3mir'),
array('w@pspider', 'w@p'),
array('appie', 'Walhello Appie'),
array('CrawlPaper', 'WallPaper'),
array('root', 'Web Core / Roots'),
array('WebMoose', 'Web Moose'),
array('WebAlta', 'WebAlta'),
array('WebAlta Crawler', 'WebAlta'),
array('WebBandit', 'WebBandit'),
array('WebCatcher', 'WebCatcher'),
array('Webclipping', 'Webclipping'),
array('WebCopy', 'WebCopy'),
array('WebFetcher', 'WebFetcher'),
array('weblayers', 'WebLayers'),
array('WebLinker', 'WebLinker'),
array('wlm', 'Weblog Monitor'),
array('WebQuest', 'WebQuest'),
array('WebReaper', 'WebReaper'),
array('[email protected]', 'Webs'),
array('websearchbench', 'WebSearchBench'),
array('WOLP', 'WebStolperer'),
array('webvac', 'WebVac'),
array('webwalk', 'WebWalk'),
array('WebWalker', 'WebWalker'),
array('WebWatch', 'WebWatch'),
array('www.WebWombat.com.au', 'WebWombat'),
array('Wget', 'Wget'),
array('whatUseek_winona', 'What U Seek Winona'),
array('Whitevector Crawler', 'Whitevector Crawler'),
array('www.whoisde.de', 'Whois DE'),
array('SurveyBot', 'Whois Source'),
array('wikiwix', 'Wikiwix'),
array('Hazels Ferret Web hopper', 'Wild Ferret Web Hopper'),
array('Willow Internet Crawler by Twotrees', 'Willow'),
array('Windows-Live-Social-Object-Extractor-Engine', 'Windows Live SOEE'),
array('Windows-RSS-Platform/1.0', 'Windows RSS Platform 1.0'),
array('Windows-RSS-Platform/2.0', 'Windows RSS Platform 2.0'),
array('WinHTTP', 'WinHTTP'),
array('wired-digital-newsbot', 'Wired Digital'),
array('Bilbo', 'Wise-Guys'),
array('Vagabondo', 'Wise-Guys'),
array('zyborg', 'WiseNut'),
array('WordPress', 'WordPress'),
array('woriobot', 'Worio'),
array('OmniExplorer_Bot', 'WorldIndexer'),
array('Project Kolinka Forum Search', 'www.kolinka.com'),
array('WWWC', 'WWWC'),
array('WWWeasel Robot', 'WWWeasel'),
array('wwwster', 'WWWSter'),
array('WWWWanderer', 'WWWWanderer'),
array('TECOMAC-Crawler', 'X-Crawler'),
array('Xenu', 'Xenu Link Sleuth'),
array('XGET', 'XGET'),
array('cosmos', 'XYLEME Robot'),
array('yacybot', 'YaCy'),
array('YahooYSMcm', 'Yahoo Publisher Network'),
array('Yahoo-Blogs', 'Yahoo! Blogs'),
array('YahooFeedSeeker', 'Yahoo! FeedSeeker'),
array('Yahoo-MMCrawler', 'Yahoo! Image Search'),
array('YahooSeeker/M1A1-R2D2', 'Yahoo! Mobile'),
array('Yahoo! Slurp', 'Yahoo! Slurp'),
array('Yahoo-VerticalCrawler', 'Yahoo! Vertical Crawler'),
array('YandexAntivirus', 'Yandex Antivirus'),
array('YandexBlog', 'Yandex Blog'),
array('YandexBot', 'Yandex Bot'),
array('YandexCatalog', 'Yandex Catalog'),
array('YandexDirect', 'Yandex Direct'),
array('YandexFavicon', 'Yandex Favicon'),
array('YandexImageResizer', 'Yandex ImageResizer'),
array('YandexImages', 'Yandex Images'),
array('YandexMedia', 'Yandex Media'),
array('YandexMetrika', 'Yandex Metrika'),
array('YandexNews', 'Yandex News'),
array('YandexPagechecker', 'Yandex Pagechecker'),
array('YandexVideo', 'Yandex Video'),
array('YandexWebmaster', 'Yandex Webmaster'),
array('YandexZakladki', 'Yandex Zakladki'),
array('Yanga WorldSearch Bot', 'Yanga'),
array('Yanga WorldSearch Bot', 'Yanga WorldSearch Bot'),
array('YebolBot', 'YebolBot'),
array('yeti', 'Yeti'),
array('Yeti', 'Yeti'),
array('Yeti/1.0', 'Yeti/1.0'),
array('YodaoBot', 'Yodao'),
array('YoudaoBot', 'Youdao'),
array('YRSpider', 'YunRang'),
array('zeus', 'Zeus Internet Marketing'),
array('http://www.zorkk.com', 'Zork RSS'),
Sorry, noob here. How exactly do I make use of the list?
Quote from: SleePy on April 10, 2008, 11:01:45 PM
For manual installation, just upload AddMoreSpiders.php to your SMF directory and run it in your browser (then delete the file).
As you can see, it's very easy to do. In your case you have to upload the attached file, not the one in the original mod.
This mod is great, but it doesn't work on SMF 2.0.7. Could the author, or anyone else, update it for the latest versions of SMF? I hope so :) Thanks ;)
Thanks & Regards 8)
Er, this mod does work on 2.0.7.
Read the Mod Emulate link in my signature.
I know this is old, and I saw that kokette made an update.
Any chance of someone making another update? :P
Thank you.
Today I had 26 guests online at the same time. Yeah, I know, it was probably about 24 spiders and maybe 2 guests :p
I would like it to figure out which crawlers they were.
Did you turn on the spider tracking? It's not on by default even with this mod installed.
In the search engine settings, yes. I have it set to high, have applied restrictions to a bot group I created, and have it showing spider names.
Then the bots you get, if they are bots, are not ones on this list.
*giggles* Which is why I would like someone to update this :)
I'm guessing the bots were from Yandex, Bing, or AddThis, since I had been submitting my site to be indexed and I have AddThis widgets on my site. I could be wrong, but that's my guess as to where the bots came from.
You know you could add the spiders you get to your own site, right?
Quote from: Arantor on September 27, 2019, 02:32:43 AM
You know you could add the spiders you get to your own site, right?
I'm not sure how, that's the problem.
I mean, I can add IPs to the list of spiders in the admin panel when I find them, after looking up the IP, but other than that I don't know how ;) Which files do I need to look at?
In the "Search Engines" area you can go to Add Spider and add a spider based on its IP or user agent.
User Agent is way more reliable and you get that from your logs.
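One way to find the user agents worth adding is to pull them out of the web server's access log and list the ones that none of your existing spider entries match. The sketch below is only an illustration under a few assumptions: the log is in the Apache/Nginx "combined" format (user agent is the last quoted field), a case-insensitive substring match stands in for however the forum actually compares user agents, and the names `extractUserAgent`, `findUnknownAgents`, `$knownSpiders` and `$sampleLog` are made up for this example and are not part of SMF or this mod.

```php
<?php
// Hypothetical helpers for illustration only; not part of SMF or this mod.

// A few (user agent substring, name) pairs in the same shape as the mod's list.
$knownSpiders = array(
    array('YandexBot', 'Yandex Bot'),
    array('Twitterbot', 'Twitterbot'),
    array('SemrushBot', 'SemrushBot'),
);

// Return the user agent from one "combined" format log line, or null.
// Assumes the user agent is the last double-quoted field on the line.
function extractUserAgent($line)
{
    if (preg_match('/"([^"]*)"\s*$/', $line, $m) === 1)
        return $m[1];
    return null;
}

// Count the user agents in $lines that match none of the known substrings.
function findUnknownAgents(array $lines, array $knownSpiders)
{
    $unknown = array();
    foreach ($lines as $line)
    {
        $agent = extractUserAgent($line);
        if ($agent === null || $agent === '' || $agent === '-')
            continue;

        $matched = false;
        foreach ($knownSpiders as $spider)
        {
            // Case-insensitive substring match (an assumption of this sketch).
            if (stripos($agent, $spider[0]) !== false)
            {
                $matched = true;
                break;
            }
        }

        if (!$matched)
            $unknown[$agent] = isset($unknown[$agent]) ? $unknown[$agent] + 1 : 1;
    }

    arsort($unknown); // most frequent first
    return $unknown;
}

// Made-up sample log lines (documentation IP range, invented crawler name).
$sampleLog = array(
    '203.0.113.5 - - [27/Sep/2019:02:30:01 +0000] "GET /index.php HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"',
    '203.0.113.9 - - [27/Sep/2019:02:30:05 +0000] "GET /index.php HTTP/1.1" 200 5120 "-" "ExampleCrawler/1.0"',
    '203.0.113.9 - - [27/Sep/2019:02:31:10 +0000] "GET /index.php?topic=1.0 HTTP/1.1" 200 7300 "-" "ExampleCrawler/1.0"',
);

foreach (findUnknownAgents($sampleLog, $knownSpiders) as $agent => $hits)
    echo $hits, "\t", $agent, "\n";
```

On a real log the result will also contain ordinary browsers, so it still needs a manual look: pick the entries that are clearly bots and enter a short, distinctive part of the user agent (for instance `ExampleCrawler` in the sample above) in the Add Spider form.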