Simple Machines Community Forum

SMF Support => SMF 2.0.x Support => Topic started by: ~DS~ on February 12, 2010, 07:28:27 PM

Title: Content Scrapers...spammer?
Post by: ~DS~ on February 12, 2010, 07:28:27 PM
Today I got many hits from 174.133.177.66 in the error logs.
I banned the IP, but it did no good: it still indexes the forum and accesses every URL and page. I was told it was a content scraper. I unbanned the IP because I thought it would help the traffic.
Title: Re: Content Scrapers...spammer?
Post by: Garou on February 12, 2010, 08:28:34 PM
That IP, if not spoofed, is static and belongs to ThePlanet.com. You may try contacting them about the scraper.

I don't know why banning it wouldn't solve the issue for SMF. Perhaps try blocking the IP through your ISP's control panel.
Title: Re: Content Scrapers...spammer?
Post by: busterone on February 12, 2010, 10:33:06 PM
I banned it in .htaccess quite a while back. It was attempting all sorts of access, and totally ignoring my rules set up in robots.txt. 
Title: Re: Content Scrapers...spammer?
Post by: ~DS~ on February 12, 2010, 10:40:36 PM
Quote from: busterone on February 12, 2010, 10:33:06 PM
I banned it in .htaccess quite a while back. It was attempting all sorts of access, and totally ignoring my rules set up in robots.txt.
Rules in robots.txt? Sorry, I am a newbie.
Title: Re: Content Scrapers...spammer?
Post by: busterone on February 12, 2010, 11:04:18 PM
That's ok. I was too at one time. Take a look here and it will explain it better than I can.
There's a lot of stuff here on simplemachines.org about it too.  :)
http://www.robotstxt.org/robotstxt.html
Title: Re: Content Scrapers...spammer?
Post by: vbgamer45 on February 12, 2010, 11:09:29 PM
Scrapers don't normally respect robots.txt, though.
Title: Re: Content Scrapers...spammer?
Post by: busterone on February 12, 2010, 11:11:41 PM
Very true. That is why I just blocked it from all my sites.  :)
Title: Re: Content Scrapers...spammer?
Post by: ~DS~ on February 12, 2010, 11:19:18 PM
Quote from: busterone on February 12, 2010, 11:11:41 PM
Very true. That is why I just blocked it from all my sites.  :)
Blocked? You mean banned? If so, where and how do you block it?
Title: Re: Content Scrapers...spammer?
Post by: busterone on February 12, 2010, 11:39:22 PM
You can ban the IP in your host's cPanel, or manually in your .htaccess file.
Here is an example from my .htaccess file. I x'd out the other addresses,
and did not include the entire file but you can get an idea of what it looks like.

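## USER IP BANNING
# Applies to GET and POST requests only.
<Limit GET POST>
# With "allow,deny" the allow rules are checked first, then the deny rules;
# any address that matches a deny line is refused.
order allow,deny
deny from 174.133.177.66
deny from XXX.XXX
deny from XXX
deny from XXX.XXX
allow from all
</Limit>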
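
On Apache 2.4 and later, the order/allow/deny directives are deprecated and only work through mod_access_compat. A minimal sketch of the same ban in the newer syntax might look like the block below. It assumes your host runs Apache 2.4 with mod_authz_core and lets .htaccess override AuthConfig; neither is confirmed anywhere in this thread, so check with your host first.

```apache
## USER IP BANNING (Apache 2.4 style; assumes mod_authz_core is loaded)
# Unlike the <Limit GET POST> version, this applies to every request method.
<RequireAll>
    # Let everyone in by default...
    Require all granted
    # ...except the scraper's address.
    Require not ip 174.133.177.66
</RequireAll>
```

Further addresses would each get their own "Require not ip" line inside the same block. Partial addresses and CIDR ranges are accepted as well.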
Title: Re: Content Scrapers...spammer?
Post by: ~DS~ on February 12, 2010, 11:42:58 PM
Perfect thanks. I am learning.  :)