Simple Machines Community Forum

SMF Development => Bug Reports => Fixed or Bogus Bugs => Topic started by: woolly bugger on December 12, 2019, 09:36:25 AM

Title: MSN spiders cause massive numbers of errors
Post by: woolly bugger on December 12, 2019, 09:36:25 AM
When I see hundreds of MSN spiders crawling my board, I get a huge number of errors.
A lot of them are looking at the grep

How should I put a stop to this?
Title: Re: MSN spiders cause massive numbers of errors
Post by: Illori on December 12, 2019, 09:40:01 AM
I don't see that you have attached a screenshot of your error log, just that you have errors and the spiders on the Who's Online page. Without knowing the errors, we cannot help fix them.
Title: Re: MSN spiders cause massive numbers of errors
Post by: woolly bugger on December 12, 2019, 12:44:09 PM
With 229,374 errors that vary in degree, I didn't bother... but I will the next time they occur, as I deleted them this time... my bad.
Title: Re: MSN spiders cause massive numbers of errors
Post by: woolly bugger on December 12, 2019, 02:57:16 PM
Here you go
Title: Re: MSN spiders cause massive numbers of errors
Post by: Illori on December 12, 2019, 03:00:37 PM
which version of SMF 2.1 are you using?
Title: Re: MSN spiders cause massive numbers of errors
Post by: mantu2 on December 13, 2019, 01:33:22 AM
Hi,

I have had the same problem for a little while now. At the moment I have around 30,000 error hits. The errors occur daily at the same time; the amount just varies a bit. The error message is the same on my forum. Version is 2.1.RC2. I hope there is a solution.
Title: Re: MSN spiders cause massive numbers of errors
Post by: Illori on December 13, 2019, 05:03:57 AM
You should upgrade to the latest version on GitHub if you are not using it already. If you still get the error, let us know.
Title: Re: MSN spiders cause massive numbers of errors
Post by: woolly bugger on December 13, 2019, 01:55:46 PM
This is what I was using from 11/22:

https://github.com/SimpleMachines/SMF2.1/compare/96865d4...release-2.1
Title: Re: MSN spiders cause massive numbers of errors
Post by: shawnb61 on December 13, 2019, 08:16:19 PM
Confirming:  You're saying you uploaded a whole new set of files as of 11/22?  (Not just that one PR, correct?)

If so... 

How many rows do you have in log_spider_stats?

Do you see entries in log_spider_stats across multiple days for MSN?

What spider logging level do you have?  (Standard, moderate, aggressive?  Under Admin | Forum | Search Engines | Settings)

If you're on current code, this appears to be two separate issues:
1) Too many pagehits; the field has a 65K max that is being exceeded
2) A bunch of undefined entries; it's possible that they're not defined for guests/bots

I am wondering if the date isn't being incremented somehow...  OTOH, if Bing is hitting you >65K times in a day, well, we have another problem.
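
For illustration only (this is not SMF code): a minimal Python sketch of why a per-day hit counter stored in a 16-bit unsigned column tops out at 65,535, which is presumably the "65K max" mentioned above. It assumes page_hits is an unsigned SMALLINT in MySQL; the constant and function names are made up for this example.

```python
# Assumption: page_hits is an unsigned 16-bit integer (MySQL SMALLINT UNSIGNED).
SMALLINT_UNSIGNED_MAX = 2**16 - 1  # 65535


def fits_in_smallint_unsigned(page_hits: int) -> bool:
    """Return True if the value is within the unsigned 16-bit range."""
    return 0 <= page_hits <= SMALLINT_UNSIGNED_MAX


def hits_before_overflow(current_hits: int) -> int:
    """Return how many more hits can be counted before the range is exceeded."""
    return max(SMALLINT_UNSIGNED_MAX - current_hits, 0)
```

If a single spider row for one day is incremented past that limit, the database would reject or clamp the update depending on its SQL mode, which would be consistent with a flood of logged errors.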

Title: Re: MSN spiders cause massive numbers of errors
Post by: woolly bugger on December 14, 2019, 12:51:12 PM
I was getting ready to upgrade to the latest GitHub release before I read your reply, so I put the forum in maintenance mode, emptied out unimportant logs and exported the database.

Then I checked this forum and read your reply.

I did the upgrade of all files on 11/22.

The Search Engine Tracking level is Standard

What is up with smf_log_search_words? See attached.
Title: Re: MSN spiders cause massive numbers of errors
Post by: shawnb61 on December 14, 2019, 12:58:41 PM
Quote from: shawnb61 on December 13, 2019, 08:16:19 PM
Do you see entries in log_spider_stats across multiple days for MSN?

Could you dump some of that content?  It would help to see what that looks like...


(log_search_words is your search index when using a custom index - that's normal...)
Title: Re: MSN spiders cause massive numbers of errors
Post by: woolly bugger on December 14, 2019, 01:03:07 PM
See attached.

Also showing my mods...
Title: Re: MSN spiders cause massive numbers of errors
Post by: shawnb61 on December 14, 2019, 02:30:17 PM
Perfect. 

Yep, it looks like during a recrawl the page_hits value can legitimately exceed the column's maximum when tracking stats.
Logged:  https://github.com/SimpleMachines/SMF2.1/issues/5890

The 'undefined' issues are possibly a byproduct, not sure.  We should try a fix for that & see if they go away.

I am going to move this to the Bug Reports board.
Title: Re: MSN spiders cause massive numbers of errors
Post by: woolly bugger on December 14, 2019, 11:58:33 PM
Maybe this will help
Title: Re: MSN spiders cause massive numbers of errors
Post by: shawnb61 on December 15, 2019, 10:20:31 PM
Quote from: woolly bugger on December 12, 2019, 09:36:25 AM
How should I put a stop to this?

To eliminate these errors going forward, I would suggest changing the page_hits column in the smf_log_spider_stats table from smallint to int. 

I cannot replicate the undefined errors you have (my first suspicion is Tapatalk...).  But I'd suggest changing page_hits to INT as a first step to cleaning up your logs.

PR submitted:
https://github.com/SimpleMachines/SMF2.1/pull/5896
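
A minimal sketch of the column change suggested above, written as a small Python helper that only builds the SQL text and does not connect to anything. It assumes MySQL/MariaDB syntax, the smf_ table prefix, and an unsigned column with a default of 0; none of this is taken from the PR itself, and the function name is hypothetical. Check the current definition (for example with SHOW CREATE TABLE) and back up the database before running a statement like this.

```python
def build_widen_page_hits_sql(table_prefix: str = "smf_") -> str:
    """Build an ALTER TABLE statement that widens page_hits from SMALLINT to INT.

    Assumptions (not confirmed by the thread): MySQL/MariaDB syntax, and an
    UNSIGNED NOT NULL column with DEFAULT 0.
    """
    table = f"{table_prefix}log_spider_stats"
    return (
        f"ALTER TABLE `{table}` "
        "MODIFY `page_hits` INT UNSIGNED NOT NULL DEFAULT 0;"
    )


if __name__ == "__main__":
    # Print the statement so it can be reviewed before being run by hand.
    print(build_widen_page_hits_sql())
```

Pass a different prefix if your tables do not start with smf_.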
Title: Re: MSN spiders cause massive numbers of errors
Post by: shawnb61 on February 13, 2020, 02:06:07 AM
The fix for this issue has been merged, so this will be closed.

The fix is available on the latest version of 2.1 over on GitHub.