Bot User Agents

Started by movierchives, May 14, 2012, 12:07:49 PM

movierchives

I've noticed a few Microsoft IP addresses visiting my site but have no idea what they are. I already have the Bing etc. agents, so I wonder if anyone knows what this one is:

157.55.16.86

I'm also getting Facebook bots, but again I have no idea what the user agent is:

69.171.230.248

vbgamer45

Community Suite for SMF - Take your forum to the next level built for SMF, Gallery,Store,Classifieds,Downloads,more!

SMFHacks.com -  Paid Modifications for SMF

Mods:
EzPortal - Portal System for SMF
SMF Gallery Pro
SMF Store SMF Classifieds Ad Seller Pro

movierchives

Quote from: vbgamer45 on May 14, 2012, 12:30:22 PM
http://ip2location.com/157.55.16.86

That is a Microsoft IP, so maybe Bing?

No, I don't think it is related to Bing. They have their own agents, but these are clearly doing something, and the Facebook one is hanging around a lot too.

emanuele

Maybe a Microsoft employee? :P

Or maybe a bing bot:
http://www.projecthoneypot.org/ip_157.55.16.86
Quote: 157.55.16.86's User Agent Strings
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
msnbot/2.0b (+http://search.msn.com/msnbot.htm)


Take a peek at what I'm doing! ;D




Do you need support in Italian?

Help us help you: explain your problem clearly. No, "it doesn't work" is not an explanation!!
1) What you do,
2) what you expect,
3) what you get.

ApplianceJunk

Quote: Maybe a Microsoft employee?

Maybe Bill Gates? :P

Got a link to your site?

We have a few members who have registered over the years with @whirlpool.com email addresses.
When I check their IPs, they show Whirlpool Corporation.

movierchives

I know this is an oldish thread, but I've found the answer and it may come in handy to others.

The bot is bingbot, user agent bingbot.

Why Microsoft need two different bots to index is beyond me, but that's it.

MrPhil

Quote from: movierchives on July 10, 2012, 11:46:16 AM
Why Microsoft need two different bots to index is beyond me, but that's it.

Probably two rival groups at MS fighting it out to get atop the "stack rankings" and live, while the enemy group is killed off.

Arantor

Define 'two different bots'. MSNbot used to be called MSNbot, now it's called Bingbot because they renamed their search engine to be called Bing instead of MSN Live Search.
No good deed goes unpunished / All helpful urges should be circumvented

I have something to say: it's better to burn out than to fade away. There can be only one.

movierchives

Quote from: Arantor on July 10, 2012, 12:33:57 PM
Define 'two different bots'. MSNbot used to be called MSNbot, now it's called Bingbot because they renamed their search engine to be called Bing instead of MSN Live Search.

Simple really: I get indexed by the msnbot, the msn agents and now the bingbot too.

Arantor

Genuine Bing bots all use bingbot; most of the 'MSN agents' aren't really MSN but spammer bots. Even some of the Bing bot requests I see aren't really Bing.
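Since a user agent string can be faked, one way to test whether a "bingbot" hit is genuine is a forward-confirmed reverse DNS lookup: resolve the IP to a host name, check the host name's domain, then resolve the host name back and confirm it returns the same IP. The following is a minimal sketch, not a definitive implementation. The function name is_verified_bingbot is made up for illustration, and the .search.msn.com suffix is an assumption based on Bing's published guidance that you should confirm before relying on it. It handles IPv4 only.

```php
<?php
/**
 * Forward-confirmed reverse DNS check for a visitor that claims to be bingbot.
 * $reverse and $forward are injectable so the logic can be exercised offline;
 * by default they are PHP's own gethostbyaddr() and gethostbynamel().
 */
function is_verified_bingbot($ip, $reverse = 'gethostbyaddr', $forward = 'gethostbynamel')
{
	// Step 1: reverse lookup. gethostbyaddr() returns the IP unchanged when no PTR record exists.
	$host = call_user_func($reverse, $ip);
	if (!is_string($host) || $host === $ip)
		return false;

	// Step 2: the host name must end in .search.msn.com (assumed suffix, see the note above).
	if (!preg_match('/\.search\.msn\.com$/i', $host))
		return false;

	// Step 3: forward lookup. The host name must resolve back to the same IP.
	$addresses = call_user_func($forward, $host);
	return is_array($addresses) && in_array($ip, $addresses, true);
}
```

In live use you would call is_verified_bingbot($_SERVER['REMOTE_ADDR']) only for requests whose user agent contains "bingbot", and cache the result, because each check costs two DNS lookups.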

movierchives

The Whois says they're Microsoft. I always check, because I ban the ones that are no good to me, like the Baidu ones.

Arantor

Hmm, it's been at least a year since I saw a legitimate MSN bot that identifies itself as MSN bot...

movierchives

Quote from: Arantor on July 10, 2012, 05:20:27 PM
Hmm, it's been at least a year since I saw a legitimate MSN bot that identifies itself as MSN bot...

Nope, I still get them too, so MS are duplicating the indexing.

I've also noticed two new ones: Brandwatch (magpie-crawler) and Majestic-12 (MJ12bot), both of which are hammering away.

EDIT: Oh, and Facebook and Amazon IPs!
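To keep track of which crawler is which, the user agent tokens mentioned so far can be matched with a small lookup. This is only a sketch: the function name identify_bot and the labels are invented for illustration, and the 'baiduspider' token is an assumption about Baidu's user agent rather than something stated in this thread. A match says what the visitor claims to be, not that the claim is true.

```php
<?php
/**
 * Return a short label for a known crawler, or null for anything else.
 * Matching is a case-insensitive substring test on the User-Agent header.
 */
function identify_bot($user_agent)
{
	// Token => label. Tokens come from this thread, except 'baiduspider' (assumed).
	$known = array(
		'bingbot' => 'Bing',
		'msnbot' => 'MSN',
		'magpie-crawler' => 'Brandwatch',
		'mj12bot' => 'Majestic-12',
		'baiduspider' => 'Baidu',
	);

	foreach ($known as $token => $label)
	{
		if (stripos($user_agent, $token) !== false)
			return $label;
	}

	return null;
}
```

For example, passing the bingbot string quoted earlier in the thread returns 'Bing', while an ordinary browser user agent returns null.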

Arantor

Magpie and MJ12 are both blocked by several blocking tools for being abusive engines.

But all the hits I've seen from 'MSN' on my sites for months have been bots pretending to be MSN.

butchs

It is easy to spoof a bot IP address. If you want to confirm it is legit, then one method (that is not perfect) is to check the "X-Forwarded-For" list of IP addresses.

Quote per Wiki:
The general format of the field is: X-Forwarded-For: client, proxy1, proxy2

where the value is a comma+space separated list of IP addresses, the left-most being the original client, and each successive proxy that passed the request adding the IP address where it received the request from. In this example, the request passed through proxy1, proxy2 and proxy3 (proxy3 is not listed in the header; it appears as the remote address of the request).

Any IP address in the list can access the information seen by another IP address in the list.

For example, you may get "X-Forwarded-For: 68.35.128.190, 157.55.16.86". SMF sees "157.55.16.86", but "68.35.128.190" has a Project Honey Pot threat rating of 17 or worse...

Therefore, based on the information you provided, one can only assume the IP address visiting your site is correct.
::)
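To look at the X-Forwarded-For chain described above, the header can be split into its individual addresses. This is a rough sketch with a made-up function name, parse_forwarded_for. Bear in mind that the header is supplied by the client or by intermediate proxies, so it can be forged and should be treated as a hint only.

```php
<?php
/**
 * Split an X-Forwarded-For value into its addresses, left-most (claimed client) first.
 * Entries that are not valid IP addresses are dropped.
 */
function parse_forwarded_for($header)
{
	$ips = array();
	foreach (explode(',', $header) as $part)
	{
		$part = trim($part);
		if (filter_var($part, FILTER_VALIDATE_IP) !== false)
			$ips[] = $part;
	}
	return $ips;
}

// In a live request PHP exposes the header, when present, as HTTP_X_FORWARDED_FOR.
$header = isset($_SERVER['HTTP_X_FORWARDED_FOR']) ? $_SERVER['HTTP_X_FORWARDED_FOR'] : '';
$chain = parse_forwarded_for($header);
```

Each address in $chain can then be looked up separately, for example on Project Honey Pot, alongside $_SERVER['REMOTE_ADDR'].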

I have been truly inspired by the SUGGESTIONS as I sit on my throne and contemplate the wisdom imposed upon me.

Arantor

Something else I've noticed is that legitimate Google and MSN bots also send a 'From' header too, in case that's useful for tracking.
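If you want to record that 'From' header next to the IP and user agent, PHP exposes it as HTTP_FROM when the client sends it. A minimal sketch follows; the function name request_from_header is hypothetical, and like any request header the value can be faked, so it is a clue and not proof.

```php
<?php
/**
 * Return the From request header if the client sent one, else null.
 * $server is normally $_SERVER; it is a parameter so the function can be tested.
 */
function request_from_header(array $server)
{
	if (isset($server['HTTP_FROM']) && $server['HTTP_FROM'] !== '')
		return $server['HTTP_FROM'];

	return null;
}
```

Calling request_from_header($_SERVER) during a request gives the value to log, or null when the header is absent.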

movierchives

Does anyone know what the Facebook user agents are?

I keep getting their bots. Two examples:
69.63.190.250
69.63.190.245

MrPhil

Well, whois says those two IP addresses belong to Facebook, but it doesn't say anything about what they are. If there aren't any tools to look them up, you might temporarily add some code to index.php that logs $_SERVER['HTTP_USER_AGENT'] and $_SERVER['REMOTE_ADDR'] whenever REMOTE_ADDR matches either of the two IP addresses. It could probably go into the error log, but you should check frequently so as not to fill up the log!
// Temporary: log the user agent when either Facebook IP visits.
include_once('Sources/Errors.php');
$watch = array('69.63.190.250', '69.63.190.245');
if (in_array($_SERVER['REMOTE_ADDR'], $watch))
  log_error('IP: ' . $_SERVER['REMOTE_ADDR'] . ', Agent: ' . (isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '(none)'));


This is not tested, so you may have to make some tweaks to it. Remove the code as soon as you get a hit on both addresses.
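If the bots arrive from many neighbouring addresses, a variation is to match a whole range and write to a separate file, so the forum's error log is left alone. The sketch below is untested against a live SMF install. The function names ip_in_cidr and log_watched_visit are invented for illustration, the code is IPv4 only, and any range you pass in (such as 69.63.190.0/24, chosen here simply because it contains the two addresses above) is an assumption and not Facebook's published range.

```php
<?php
/**
 * True when the IPv4 address $ip falls inside $cidr (for example '69.63.190.0/24').
 * A bare address without a slash is treated as a /32.
 */
function ip_in_cidr($ip, $cidr)
{
	if (strpos($cidr, '/') === false)
		$cidr .= '/32';

	list($subnet, $bits) = explode('/', $cidr, 2);
	$ip_long = ip2long($ip);
	$subnet_long = ip2long($subnet);
	$bits = (int) $bits;
	if ($ip_long === false || $subnet_long === false || $bits < 0 || $bits > 32)
		return false;

	// Build the network mask; a /0 mask matches every address.
	$mask = $bits === 0 ? 0 : (-1 << (32 - $bits));
	return ($ip_long & $mask) === ($subnet_long & $mask);
}

/**
 * Append one line to $logfile when the visitor's IP is inside any of $ranges.
 * Returns true when a line was written. $server is normally $_SERVER.
 */
function log_watched_visit(array $server, array $ranges, $logfile)
{
	$ip = isset($server['REMOTE_ADDR']) ? $server['REMOTE_ADDR'] : '';
	foreach ($ranges as $cidr)
	{
		if (ip_in_cidr($ip, $cidr))
		{
			$agent = isset($server['HTTP_USER_AGENT']) ? $server['HTTP_USER_AGENT'] : '(none)';
			$line = gmdate('Y-m-d H:i:s') . ' IP: ' . $ip . ', Agent: ' . $agent . "\n";
			return file_put_contents($logfile, $line, FILE_APPEND | LOCK_EX) !== false;
		}
	}
	return false;
}
```

A call such as log_watched_visit($_SERVER, array('69.63.190.0/24'), '/path/to/bot-visits.log') near the top of index.php would do the logging; the log path is a placeholder and should point somewhere outside the web root.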

butchs

Quote from: Arantor on July 10, 2012, 10:29:38 PM
Something else I've noticed is that legitimate Google and MSN bots also send a 'From' header too, in case that's useful for tracking.

Interesting, I have not seen that in the US.

Igal-Incapsula

Hi
You can verify Microsoft, FB and other user agents via botopedia.org.
Once there, use Search to find bot details, including all user-agent data.
Also, on the Bot Profile page you can cross-check the IP to find out whether it is a legitimate one for that specific bot.

Here is a direct link to the FB bot profile:
www.botopedia.org/index.php?option=com_k2&view=item&id=313:facebook-external-hit

Hope this helps.
