Simple Machines Community Forum

SMF Support => SMF 2.1.x Support => Topic started by: Saiy on July 20, 2024, 05:32:43 PM

Title: [Google Search Console] A lot of error 404 due to cron.php
Post by: Saiy on July 20, 2024, 05:32:43 PM
Hello guys,


My website is www.db-z.com, but my forum is at www.db-z.com/forum (that's very important for the following problem).

In Google Search Console, I have more than 40k 404 errors caused by the cron.php call, like this one:

The URL returning the 404: https://www.db-z.com/cron.php?ts=1720894020

Found by Google on this page: https://www.db-z.com/forum/index.php/topic,4019.6285.html
And indeed, the source code of that page contains this:

window.addEventListener("DOMContentLoaded", function() {
	function triggerCron()
	{
		$.get('https://www.db-z.com/forum' + "/cron.php?ts=1721510745");

The funny thing is, my forum "home" is https://www.db-z.com/forum, so why does Google report a 404 for this URL: https://www.db-z.com/cron.php?ts=1720894020 (without the "/forum" in the URL)?

Thank you
Title: Re: [Google Search Console] A lot of error 404 due to cron.php
Post by: Arantor on July 20, 2024, 06:40:43 PM
Because Google is misreading the URL. It sees the "/cron.php" string on its own, assumes it means your domain + /cron.php, and visits db-z.com/cron.php without realising that the script joins it onto the forum URL first.
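
To illustrate the point above, here is a minimal sketch of how a string beginning with "/" resolves when it is read on its own against the page it was found on. The variable names are made up for illustration, and this only mimics standard URL resolution; it is not a claim about how Google's crawler is actually implemented.

```javascript
// Page on which Google found the inline script (URL taken from the first post).
const pageUrl = "https://www.db-z.com/forum/index.php/topic,4019.6285.html";

// The string literal that can be picked out of the script in isolation.
const literal = "/cron.php?ts=1720894020";

// A leading "/" makes the reference root-relative: it replaces the whole
// path of the base URL, so the "/forum" part is lost.
const guessed = new URL(literal, pageUrl).href;

// What the browser really requests once the script has joined the two strings.
const actual = "https://www.db-z.com/forum" + literal;

console.log(guessed); // https://www.db-z.com/cron.php?ts=1720894020
console.log(actual);  // https://www.db-z.com/forum/cron.php?ts=1720894020
```

The first result matches the 404 URL reported in Search Console, and the second is the request the forum actually intends.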

I wouldn't normally encourage this, but the quickest fix is to change this line in Sources/Load.php:

$.get(' . JavaScriptEscape($boardurl) . ' + "/cron.php?ts=' . $ts . '");
to
$.get("https://www.db-z.com/forum/cron.php?ts=' . $ts . '");
The problem is that writing the logic to correctly decompose $boardurl like this is not the easiest thing in the world, and I'm not sure it wouldn't cause weird bugs of its own. But hardcoding the URL this way will be consistent for you, if nothing else.
Title: Re: [Google Search Console] A lot of error 404 due to cron.php
Post by: shawnb61 on July 20, 2024, 07:33:56 PM
I just updated robots.txt & now Google leaves it alone.
Title: Re: [Google Search Console] A lot of error 404 due to cron.php
Post by: Arantor on July 20, 2024, 07:36:05 PM
Hopefully you have enough active visitors to keep triggering scheduled tasks in the meantime?
Title: Re: [Google Search Console] A lot of error 404 due to cron.php
Post by: Saiy on July 21, 2024, 04:06:58 AM
Quote from: shawnb61 on July 20, 2024, 07:33:56 PM
I just updated robots.txt & now Google leaves it alone.

Thank you so much, @Arantor! I wanted to avoid manually modifying the source files, but yes, Google is really spamming me. ;D

@shawnb61: I also tried that yesterday, with this rule: Disallow: /cron.php
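
For reference, robots.txt rules are matched from the start of the URL path, so a rule like the one above would cover the mistaken root-level URL but not the forum's own /forum/cron.php. The sketch below shows that prefix matching in simplified form. The isDisallowed helper is hypothetical, and it assumes no wildcards ("*", "$") and no Allow rules, both of which real robots.txt parsers handle.

```javascript
// Simplified prefix matching for robots.txt Disallow rules.
// Assumption: no "*" or "$" wildcards and no Allow rules are involved.
function isDisallowed(url, disallowRules) {
  const { pathname, search } = new URL(url);
  // Rules are compared against the path plus query string, from the first character.
  const target = pathname + search;
  return disallowRules.some((rule) => rule !== "" && target.startsWith(rule));
}

const rules = ["/cron.php"];

// The root-level URL that Google reports as a 404 is covered by the rule.
console.log(isDisallowed("https://www.db-z.com/cron.php?ts=1720894020", rules)); // true

// The real cron endpoint under /forum is not covered.
console.log(isDisallowed("https://www.db-z.com/forum/cron.php?ts=1721510745", rules)); // false
```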