Simple Machines Community Forum

SMF Support => Server Performance and Configuration => Topic started by: w0lfman on September 16, 2023, 04:11:02 AM

Title: Guest visitors
Post by: w0lfman on September 16, 2023, 04:11:02 AM
I'm seeing a dramatic change in the number of guest visitors over the past 4-5 weeks. I consistently see guest numbers four to five times larger than what has historically been normal, at all hours of the day and night. I've made no changes to robots.txt, so I'm not certain whether it's bots crawling my site or something else. How can I tell whether a guest is a real visitor or some sort of bot? How can I block them if they are not real? Thanks!
Title: Re: Guest visitors
Post by: Aleksi "Lex" Kilpinen on September 16, 2023, 07:43:44 AM
Unless they cause you issues (like increased bandwidth costs or performance problems), it is generally best to try to ignore things like this. The internet is full of bots and crawlers, most of which will not cause you any issues, and trying to identify them all is a never-ending task.
Title: Re: Guest visitors
Post by: w0lfman on September 16, 2023, 08:22:38 AM
So far so good on cost, but HostGator has shut me down in the past when I was flooded with guests, which I believe ended up being spider bots. I did see that I could make changes to .htaccess, but I'm really uncertain what I should change.
Title: Re: Guest visitors
Post by: mickjav on September 16, 2023, 09:12:26 AM
I would move to a better host if they did that to me.

Look at https://clients.hostit.host/index.php

He understands SMF lol, and I've been with him for years without issue.
Title: Re: Guest visitors
Post by: w0lfman on September 16, 2023, 02:22:36 PM
Quote from: mickjav on September 16, 2023, 09:12:26 AM
I would move to a better host if they did that to me.

Look at https://clients.hostit.host/index.php

He understands SMF lol, and I've been with him for years without issue.
I'll check that out
Title: Re: Guest visitors
Post by: durangod on December 21, 2023, 02:03:37 AM
Create a robots.txt file and a no_agents.php file. 

Inside the robots.txt file put this:


User-agent: *
Disallow: /
Disallow: /no_agents.php



Then inside the no_agents.php file put this:


<?php

//nothing

?>


<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
</head>

<body>
   
<h2>Private Directory</h2>

<div>
<p>You have reached a private directory for customers only. If you are a spider or web directory agent please remove and block this url from your list.</p>
</div>

</body>
</html>



This will block crawling of your entire site, if that is what you want. That means all crawlers, not just particular ones.
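
If you want to check how a well-behaved crawler would read a robots.txt before uploading it, here is a minimal sketch using Python's standard urllib.robotparser module. The bot name "ExampleBot" and the forum.example.com URL are placeholders, not anything from this thread. It also shows that a line starting with # is treated as a comment, so the rules only take effect when they are not commented out.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if a robots.txt-compliant crawler may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Every rule is commented out, so the file imposes no restrictions.
COMMENTED = """\
#User-agent: *
#Disallow: /
"""

# Active rules: all compliant crawlers are asked to stay out of the whole site.
ACTIVE = """\
User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    url = "https://forum.example.com/index.php?topic=1.0"  # placeholder URL
    print("commented rules allow crawling:", allowed(COMMENTED, url))
    print("active rules allow crawling:", allowed(ACTIVE, url))
```

This only tells you what a crawler that follows the rules would do; it says nothing about crawlers that ignore robots.txt.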
Title: Re: Guest visitors
Post by: Aleksi "Lex" Kilpinen on December 21, 2023, 04:44:45 AM
Robots.txt is a good tool, but sadly only the good actors follow robots.txt. Robots.txt is only a request you present to the crawlers, there's no obligation for them to actually comply, and the problematic ones seldom do.
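
To get a rough idea of who the guests actually are, the server access log is usually more telling than robots.txt. Below is a minimal sketch, assuming the host writes logs in the Apache "combined" format (that is an assumption, check what your host provides). The BOT_HINTS substrings and the sample log lines are made up for illustration. Crawlers that misbehave often fake a browser user agent, so the sketch also tallies requests per IP address, since unusually high counts from one address are another hint.

```python
import re
from collections import Counter

# Apache "combined" log format; an assumption about the server's log layout.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Heuristic substrings that self-identified crawlers often carry (illustrative).
BOT_HINTS = ("bot", "crawl", "spider", "slurp")


def summarize(lines):
    """Count requests per likely-bot agent, per other agent, and per IP."""
    bots, others, ips = Counter(), Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent")
        ips[match.group("ip")] += 1
        is_bot = any(hint in agent.lower() for hint in BOT_HINTS)
        (bots if is_bot else others)[agent] += 1
    return bots, others, ips


# Made-up sample lines using documentation-reserved IP addresses.
SAMPLE = [
    '203.0.113.5 - - [16/Sep/2023:04:11:02 +0000] "GET /index.php?topic=1.0 HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; ExampleBot/1.0; +http://example.com/bot)"',
    '203.0.113.5 - - [16/Sep/2023:04:11:03 +0000] "GET /index.php?topic=2.0 HTTP/1.1" '
    '200 4096 "-" "Mozilla/5.0 (compatible; ExampleBot/1.0; +http://example.com/bot)"',
    '198.51.100.7 - - [16/Sep/2023:04:12:10 +0000] "GET /index.php HTTP/1.1" '
    '200 7300 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/117.0"',
]

if __name__ == "__main__":
    bots, others, ips = summarize(SAMPLE)
    print("likely bots:", bots.most_common(5))
    print("other agents:", others.most_common(5))
    print("busiest IPs:", ips.most_common(5))
```

To run it against a real log, pass an open file instead of SAMPLE, for example summarize(open("access.log", encoding="utf-8", errors="replace")), where the file name is whatever your host uses.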
Title: Re: Guest visitors
Post by: durangod on December 21, 2023, 10:13:12 AM
Quote from: Aleksi "Lex" Kilpinen on December 21, 2023, 04:44:45 AM
Robots.txt is a good tool, but sadly only the good actors follow robots.txt. Robots.txt is only a request you present to the crawlers, there's no obligation for them to actually comply, and the problematic ones seldom do.

Quite correct, it only works if they comply. I should have been clear on that part :)