SEO Indexing Tools: SMF Sitemaps (SMF 1.0.5-1.1RC2)

Started by Davilac, December 07, 2005, 02:28:36 PM

Davilac

Mod edit: This post is outdated. You can now find SEO mods on the Mod Site

Current version: 0.5
Built for 1.1 RC; I'm not sure whether it works with lower versions.

Spanish Support

Since many people asked for my help with something as important as sitemaps, I have done some work on it. SMF is already well optimized for SEO, but these scripts will help search engines index all your topics, so you should gain visits.

You can download these files as a zip or rar archive.

Please read first, and then ask.

Usage:

You must uncompress the archive and upload the folder to your forum's root, so if www.domain.com/forum is your forum, you should now have www.domain.com/forum/sitemaps.

Spanish, English and German language files are included in this release. Please post your translations here.

How to use your language:
Just copy a language file in /lang and name it as you want. Translate the text after "=>". Then go to line 7 of index.php and edit require("lang/en.php"); to match your language file name.

Google Sitemap (sitemaps/sitemaps.php)

The file named sitemaps.php is a Google Sitemap for SMF. Google Sitemaps for SMF was written by SMF and modified by Davilac (http://www.davilac.net). The script builds a sitemap containing the forum's own URLs (given the highest priority), all topics in your forum sorted by popularity (up to 20,000 topics; Google Sitemaps accepts up to 50,000 URLs per sitemap), and up to 200 of the most active users. The mod also detects whether you have short URLs enabled. If you do, the script uses them in the sitemap; if not, it uses the standard PHP URLs.
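To make the description above concrete, here is a minimal sketch of how such a sitemap could be assembled. It is not the code from sitemaps.php: the function names, the array keys, the 0.5 priority for topics, and the use of the view count as "popularity" are all assumptions for illustration. The namespace shown is the current sitemaps.org one, which may differ from what the 2005 script emits.

```php
<?php
// Sketch only: function and array key names are hypothetical, not taken from the mod.

// Order entries as the post describes: forum/board URLs first, then topics by popularity.
function collectSitemapEntries(array $boardUrls, array $topics, $maxTopics = 20000)
{
    $entries = array();
    foreach ($boardUrls as $url) {
        $entries[] = array('loc' => $url, 'priority' => 1.0);
    }
    // Assumption: "popularity" is approximated by the view count.
    usort($topics, function ($a, $b) {
        return $b['views'] - $a['views'];
    });
    // Keep only the most popular topics, up to the cap.
    foreach (array_slice($topics, 0, $maxTopics) as $topic) {
        $entries[] = array('loc' => $topic['url'], 'priority' => 0.5);
    }
    return $entries;
}

// Render the entries as a sitemap document.
function buildSitemapXml(array $entries)
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($entries as $entry) {
        // Escape special characters so the URL is valid inside an XML element.
        $loc = htmlspecialchars($entry['loc'], ENT_QUOTES, 'UTF-8');
        $priority = number_format($entry['priority'], 1, '.', '');
        $xml .= "  <url><loc>{$loc}</loc><priority>{$priority}</priority></url>\n";
    }
    $xml .= '</urlset>' . "\n";
    return $xml;
}
```

Capping the topic list well below the per-sitemap limit leaves room for the board and member URLs in the same file.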

Google Sitemaps is a new Google service. In testing it seems to work very well and gets websites indexed very quickly.

No installation is needed for this script. Just go to http://www.google.com/webmasters/sitemaps, log in, add a site, and add the sitemap by giving its URL (http://www.domain.com/forum/sitemaps/sitemaps.php). In a few weeks Google will spider and index your forum. You can check this by searching Google for site:www.domain.com. You can submit as many sitemaps as you want to Google, for as many domains as you want.

Yahoo! Urllist (sitemaps/urllist.php)


Last September Yahoo! launched Yahoo! Site Explorer to help webmasters and added features to its free submission service, including a kind of Yahoo! sitemap. With a urllist you can help Yahoo! index your forum. As Yahoo! is not very good at indexing, a urllist should pay off for you after about three weeks. A urllist is:
Quote
A text file containing a list of URLs, each URL at the start of a new line

urllist.php generates a urllist, but since my English is not good enough, I'm not sure whether the file must have a .txt name or not. Please confirm that.
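As a rough illustration of the format quoted above (one URL at the start of each line), a urllist could be produced like this. The function name is hypothetical and this is not the code in urllist.php:

```php
<?php
// Hypothetical helper: turn a list of URLs into urllist text, one URL per line.
function buildUrlList(array $urls)
{
    $lines = array();
    foreach ($urls as $url) {
        $url = trim($url);
        // Skip blanks and duplicates so every line holds exactly one distinct URL.
        if ($url !== '' && !in_array($url, $lines, true)) {
            $lines[] = $url;
        }
    }
    // End the file with a newline; return an empty string when there is nothing to list.
    return $lines ? implode("\n", $lines) . "\n" : '';
}
```

When the list is served by a script rather than a static file, sending header('Content-Type: text/plain; charset=UTF-8'); before the output should make it look like a text file to the crawler. Whether Yahoo! also insists on a .txt file name is not settled here.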

HTML Sitemap

index.php generates an HTML sitemap for your site, which is very useful for all search engines. In the folder you will find a file named .htaccess, which gives this file nice URLs. The page displays links to your forum.
You can access the file in the following ways:

If you go to home.html, you get all the boards' URLs, plus some internal sitemap links as described below.
If you go to (number).html, you get 100 topic URLs, sorted by date, so search engines will pick up all new topics every time they visit 1.html.
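The (number).html scheme amounts to simple paging. A minimal sketch of the mapping from file name to a 100-topic window follows; the function name is hypothetical, and zero-based page numbering is an assumption, since this post mentions both 0.html and 1.html as the page holding the newest topics:

```php
<?php
// Hypothetical helper: map a requested file name such as "3.html" to a query window.
function topicWindowForPage($fileName, $perPage = 100)
{
    // Only plain "<digits>.html" names are treated as topic pages.
    if (!preg_match('/^(\d+)\.html$/', $fileName, $m)) {
        return null;
    }
    $page = (int) $m[1];
    // Assumption: pages are zero-based, so 0.html holds the newest topics.
    return array('offset' => $page * $perPage, 'limit' => $perPage);
}
```

The returned offset and limit would then feed the LIMIT clause of a query that orders topics by date, newest first.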

Recommended use:


The recommended use is to get important links pointing to 0.html, so that search engines are always crawling your latest topics. So you should add a link in your template to www.yourforum.com/sitemaps/0.html, and also get links to 1.html if possible.

You can preview this file at www.davilac.net/foro/sitemaps/home.html or www.davilac.net/foro/sitemaps/1.html

Please give me suggestions, check my English  :D, ask questions and ask for support.

xcooling

FANTASTIC !!

Please include these files with the SMF package, it really does help !!

Thanks !

Kindred

hmmm....  we'll see how this works.
I run an SMF board wrapped with Mambo...

So, I made some minor changes to the top of each of the php files to include the mambo-forum url rather than the plain smf url...

I will let you know how google and yahoo respond.
Слaва
Украинi

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

xcooling

ok google and yahoo are loving my site, I've got 7 spiders (4 google, 3 yahoo) atm.

dschwab9

Great job!   One thing you might want to check though - it appears that the index.php is showing guests links to topics they don't have access to.  Also, the data in the Views and Replies columns is backwards.

Davilac

Quote from: dschwab9 on December 08, 2005, 01:36:32 AM
Great job!   One thing you might want to check though - it appears that the index.php is showing guests links to topics they don't have access to.  Also, the data in the Views and Replies columns is backwards.

Views and replies are fixed in version 0.2. About guests seeing data they shouldn't, you are right. I have fixed home.html so it no longer lists boards they shouldn't see. I will try to fix the same for topics.

This bug is not too serious, because spiders and guests won't be able to open the topic anyway, but it is better if I fix it.

Davilac

I'm also looking into making a robots.txt file for SMF, because SMF has some problems that the developers wrongly don't consider important. But I will test it first, because it can be dangerous for an established forum. Any comments on this? Does anyone know advanced robots.txt files very well?

Kindred

Ok.... comments.
as dschwab noted, the sitemap does not distinguish between public and not-public posts.

I have a management only section on my site where we discuss the site and sometimes discuss the behaviour of particular members...  These threads should **NOT** be visible in the sitemap (and hence visible to the bots and anyone else)
I know that a guest user who does not have access to the actual message once inside SMF will get an error when they try to click on the message, but the fact that the message appears at all in the list is not acceptable, since some of these message titles are not "politically correct"

What are your questions on robots.txt?   I have a fairly comprehensive one being used on my site...
and, I am curious, what problems do you think exist in SMF that the developers don't think are important?

Davilac

Quote from: Kindred on December 08, 2005, 10:17:11 AM
Ok.... comments.
as dschwab noted, the sitemap does not distinguish between public and not-public posts.

I have a management only section on my site where we discuss the site and sometimes discuss the behaviour of particular members...  These threads should **NOT** be visible in the sitemap (and hence visible to the bots and anyone else)
I know that a guest user who does not have access to the actual message once inside SMF will get an error when they try to click on the message, but the fact that the message appears at all in the list is not acceptable, since some of these message titles are not "politically correct"

What are your questions on robots.txt?   I have a fairly comprehensive one being used on my site...
and, I am curious, what problems do you think exist in SMF that the developers don't think are important?


You are right, Kindred, private topics shouldn't be visible. I'll try to fix this. Help would be appreciated.
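One way the fix could look, as a sketch rather than the mod's actual code: drop every topic whose board is not visible to guests before the list is printed. The data shapes, the function name, and the use of -1 as the guest group ID are assumptions here. In a real install the same check would more likely live in the WHERE clause of the topic query.

```php
<?php
// Hypothetical data shapes: $boards maps a board ID to the member group IDs allowed
// to see it, and each topic records the ID of the board it belongs to.
function filterGuestVisibleTopics(array $topics, array $boards, $guestGroup = -1)
{
    // First collect the boards that the guest group may see.
    $visibleBoards = array();
    foreach ($boards as $boardId => $allowedGroups) {
        if (in_array($guestGroup, $allowedGroups, true)) {
            $visibleBoards[$boardId] = true;
        }
    }
    // Then keep only the topics that live in one of those boards.
    $visible = array();
    foreach ($topics as $topic) {
        if (isset($visibleBoards[$topic['board']])) {
            $visible[] = $topic;
        }
    }
    return $visible;
}
```

Filtering before output means a private subject line never reaches the page at all, which is the concern raised about the moderator boards.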

As for the indexing problems, I will use my forum as the example. Open this URL: http://www.google.es/search?q=site%3Awww.davilac.net%2Fforo and take a look at page 3, for example. You will see this:
davilac.net/foro/index.php?topic=869.msg3831
davilac.net/foro/index.php?topic=869.new
davilac.net/foro/index.php?topic=181;prev_next=next
davilac.net/foro/index.php?topic=870.msg3872;topicseen

These URLs make Google think I have duplicate pages, so lots of pages will be treated as Supplemental Results.
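To show what "duplicate" means for the four URLs listed, here is a small sketch that reduces each variant to a single canonical topic URL. The function name is hypothetical, and the ".0" suffix as the canonical form is an assumption about how SMF addresses the first page of a topic:

```php
<?php
// Hypothetical helper: reduce topic URL variants to one canonical form.
function canonicalTopicUrl($url)
{
    // Capture everything up to the topic ID; suffixes like .msg3831, .new
    // and ;topicseen are dropped.
    if (!preg_match('/^(.*[?;&]topic=)(\d+)/', $url, $m)) {
        return $url; // not a topic URL, leave it alone
    }
    // prev_next links lead to a different topic, so they get no canonical form here.
    if (strpos($url, 'prev_next=') !== false) {
        return null;
    }
    return $m[1] . $m[2] . '.0';
}
```

A sitemap or HTML link list that only ever emits the canonical form gives the crawler one address per topic.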

I have an easy solution, but since the SMF developers told me they didn't care about this, I'll try to get my forum indexed better myself.

One solution would be to use HTML sitemaps and nofollow, and the other would be to use robots.txt. I don't know for sure whether I can stop Google from indexing the bad URLs with robots.txt, because I would need advanced robots.txt syntax, which only Googlebot understands.
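The advanced syntax in question is Googlebot's pattern matching, where "*" matches any run of characters and a trailing "$" anchors the end of the URL. Before putting a rule such as Disallow: /foro/index.php?topic=*.msg on a live forum, candidate patterns can be checked offline against known URLs. The following is a sketch of that matching logic with a hypothetical function name. It is not Google's implementation, and the example rule is only an illustration.

```php
<?php
// Hypothetical helper: test a URL path (including its query string) against a
// Googlebot-style robots.txt pattern. "*" matches any run of characters and a
// trailing "$" anchors the end of the URL.
function robotsPatternMatches($pattern, $path)
{
    $anchored = substr($pattern, -1) === '$';
    if ($anchored) {
        $pattern = substr($pattern, 0, -1);
    }
    // Quote every regex metacharacter, then turn the escaped "*" back into ".*".
    $regex = str_replace('\*', '.*', preg_quote($pattern, '#'));
    return preg_match('#^' . $regex . ($anchored ? '$' : '') . '#', $path) === 1;
}
```

Running the forum's good and bad URLs through a check like this shows whether a pattern blocks only the duplicates, which matters on an established forum where a mistake could remove indexed pages.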


Kindred

Ok...  When I try to run sitemaps.php, I get an error:
(and google reports a similar error....)

Quote
The XML page cannot be displayed
Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later.
--------------------------------------------------------------------------------

Only one top level element is allowed in an XML document. Error processing resource 'http://site/sitemaps...

<b>Warning</b>:  Cannot modify header information - headers already sent by (output started at /site/location...
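The "headers already sent" warning in the error above usually means something was printed before the script called header(): a PHP warning, or whitespace in an included file. That stray output would also explain the "only one top level element" complaint, since it lands in front of the XML root. A minimal sketch of one way around it, assuming the sitemap is built as a string; renderXmlResponse and the generator callback are hypothetical names, not part of the mod:

```php
<?php
// Hypothetical wrapper: buffer everything printed while the sitemap is generated,
// so stray output from included files cannot reach the browser ahead of the
// Content-Type header.
function renderXmlResponse($generate)
{
    ob_start();
    $xml = $generate();       // the generator returns the XML document as a string
    $stray = ob_get_clean();  // anything echoed by includes or warnings ends up here
    if (!headers_sent()) {
        header('Content-Type: text/xml; charset=UTF-8');
    }
    return array('xml' => $xml, 'stray' => $stray);
}
```

The caller would echo the 'xml' part and log the 'stray' part, which also shows what was being printed early in the first place.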

Davilac

I would need more information, but it seems your sitemap script is not connecting to your database. Maybe you didn't place it in the correct folder, or there's something I don't know about.

What SMF version do you use?
Are you sure you access the file by going to www.yourdomain.net/forum/sitemaps/sitemaps.php?
Do you use something like Mambo?
Do the other scripts (like urllist.php or index.php) work?

dschwab9

I don't think the private ones are a big deal in the URL list where the subject isn't shown, but, in home.php, I wouldn't want "Should we ban Ted?" being shown from my moderator board.


Kindred

Yes, I use mambo...

However, I have changed my files to redirect to the proper locations:

I added the following:


// Load the Mambo configuration and the SMF bridge settings, then SMF's SSI.php.
require_once ("../configuration.php");
global $mosConfig_absolute_path;
require_once ($mosConfig_absolute_path."/administrator/components/com_smf/config.smf.php");
global $smf_path;
require_once ($smf_path. "/SSI.php");

global $scripturl, $mosConfig_dbprefix, $mosConfig_live_site, $mosConfig_db, $db_name;
// Switch to the Mambo database to look up the menu item that points at the forum.
mysql_select_db($mosConfig_db);
$sql = "SELECT id FROM ".$mosConfig_dbprefix."menu WHERE link='index.php?option=com_smf'";
$result = mysql_query ($sql);
$row = mysql_fetch_array($result);
// Point $scripturl at the wrapped forum URL instead of the plain SMF URL.
$myurl = $mosConfig_live_site . "/index.php?option=com_smf&Itemid=" . $row[0];
$scripturl = $myurl;
// Switch back to the SMF database for the rest of the script.
mysql_select_db($db_name);


and I replaced all calls for  --   $scripturl. '? --  with   $scripturl. '&
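An alternative to editing every call by hand would be a small helper that picks the separator from the base URL, so the same script works with both the plain SMF URL and the wrapped Mambo URL. This is only a sketch and appendQuery is a hypothetical name:

```php
<?php
// Hypothetical helper: append a query string to a base URL with the right separator.
function appendQuery($baseUrl, $query)
{
    // Use "?" when the base has no query string yet, "&" when it already has one.
    $separator = (strpos($baseUrl, '?') === false) ? '?' : '&';
    return $baseUrl . $separator . $query;
}
```

For example, appendQuery($scripturl, 'topic=1.0') would produce index.php?topic=1.0 on a plain install and ...&Itemid=62&topic=1.0 behind the wrapper.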

index.php and urllist.php work just fine...


I fixed the header error by commenting out the call for additional header information in sitemaps.php

but now, I am getting this one:
The XML page cannot be displayed
Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later.
--------------------------------------------------------------------------------
A semi colon character was expected. Error processing resource 'http://askawitchcommunity.org/sitemaps/sitemaps.php';. Line...

  <loc>http://askawitchcommunity.org/index.php?option=com_smf&Itemid=62</loc>
-----------------------------...
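The "semi colon character was expected" message is what an XML parser typically reports when it meets a bare "&": it reads "&Itemid" as the start of an entity reference and then looks for the closing ";". The quoted <loc> line contains exactly that, so the likely fix is to escape each URL before writing it into the XML. A minimal sketch, with xmlLoc as a hypothetical helper name:

```php
<?php
// Hypothetical helper: escape a URL before placing it inside a <loc> element.
function xmlLoc($url)
{
    // A bare "&" starts an entity reference in XML, so "&Itemid=62" is read as an
    // unterminated entity. htmlspecialchars() turns it into "&amp;".
    return '<loc>' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '</loc>';
}
```

With this, the wrapped forum URL comes out as index.php?option=com_smf&amp;Itemid=62 inside the element, which parsers read back as the original "&".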


Kindred

Mine?


http://www.google.com/search?q=site:askawitchcommunity.org
brings up only the domain itself... no forum posts

I still can not get Google to do two things.

1- It gives me a parse error with the sitemaps.php
2- it won't let me verify my site (that's not related to your utility, but related to issues with my 404 page)

either way, I don't think sitemaps.php is working with my site.

(note: I have a googlesitemap.xml for the MAMBO side of things, which does not include the forum posts, but that also has  a parse error, according to google.)

xcooling

post it here, and while you do that, maybe read up on how to code ? :P

Kindred

I beg your pardon?   XCooling, Was THAT directed at me?

Both my XMLs *SHOULD* be working, based on the php code.

BOTH of them give me the same error when I try to view them...
A semi colon character was expected. Error processing resource

Google tells me that both of them have parse errors...   yet they are BOTH properly constructed.

So, why can I not view them?   Why does google give me a parse error?

those are the questions.

And, FYI, I know how to code, both php and XML.

xcooling

it's a joke, hence the smiley, post the script and we can all help. (a few heads are better than one)
