SEO Indexing Tools: SMF Sitemaps (SMF 1.0.5-1.1RC2)

Started by Davilac, December 07, 2005, 02:28:36 PM


Prasad007

Quote from: statornic on March 07, 2006, 07:38:20 PM
Prasad007,
I can offer you a suggestion, though I don't know yet whether it works. When you submit the sitemap in your Google account, try a URL that ends with this file, like:

http://forum.crestini.com/sitemaps/sitemaps.php

Do not submit with .../sitemaps/home.php or .../sitemaps/index.php
Hope this will work.

I did submit http://forum.crestini.com/sitemaps/sitemaps.php

Enc0der

Quote from: Kindred on March 07, 2006, 12:50:26 PM
Avatar... I don't think Google cares (or will even notice), as long as the sitemap points to valid forum objects.

Well, it's your (his) choice...
Just don't be too surprised when your site gets banned from Google.

You should REALLY learn at least the *basics* of SEO before posting such irresponsible stuff.

Kindred

Enc0der,

please don't be so dramatic or make assumptions....   I know quite a bit about SEO (including the fact that, despite "common wisdom" to the contrary, the supposed SEF links don't make very much difference at all. Bots still crawl the sites with no problem and while your rank may be slightly improved, it doesn't matter as much as everyone seems to think. Heck, look at SMF, here...)
Слaва
Украинi

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

Enc0der

hmm... is that so? ::)
Quote
Don't use "&id=" as a parameter in your URLs, as we don't include these pages in our index.
http://www.google.com/webmasters/guidelines.html

Quote
Quality Guidelines - Specific recommendations:

    * Avoid hidden text or hidden links.
http://www.google.com/webmasters/guidelines.html

And that's from Google itself.
Did I remember to mention "basic"?

Kindred

Hmmm... I don't see &id= anywhere on my site, except for profile information...

That is in the "recommendation" section, not the rules section... and Google won't ban you for it.
And the very concept of a sitemap is actually contrary to what Google "recommends", since it is a page specifically designed for robots to catalogue your site.

jonks

I can back that up. I have been in SEO for 5 years and know several people who are involved in developing algorithms for search engines, and it is a fact that Google will penalise your website for hiding text. I know because it has happened to me on several occasions.

Hiding text is one of the biggest mistakes any website can make, and a Google penalty can last for months or even be permanent.

On a brighter note though, this is a great mod for SMF. The Google sitemap works very well.

The HTML sitemap, on the other hand, is not crawlable by most search engines because it still contains the session IDs in all of the links. At best you will end up with a few supplemental listings.

The change this mod really needs is a version that converts the sitemap into plain HTML with no session IDs at all. That would be very good. I really don't see why a sitemap needs to be in the style of the site either; surely that just makes the page larger for the search engines.
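
As a rough sketch of what such a plain, unstyled sitemap could look like (Python here purely for illustration; the mod itself is PHP, and the function name `build_plain_sitemap` and the www.forum.com URLs are made-up placeholders, not anything from the mod):

```python
from html import escape


def build_plain_sitemap(entries, page_title="Sitemap"):
    """Return a bare, unstyled HTML page listing (label, url) pairs as links."""
    items = []
    for label, url in entries:
        # Refuse URLs that still carry a PHP session ID: the whole point of
        # the plain sitemap is to give crawlers stable addresses.
        if "PHPSESSID=" in url:
            raise ValueError(f"session ID found in {url!r}")
        items.append(
            f'<li><a href="{escape(url, quote=True)}">{escape(label)}</a></li>'
        )
    return (
        "<html><head><title>" + escape(page_title) + "</title></head>\n"
        "<body>\n<ul>\n" + "\n".join(items) + "\n</ul>\n</body></html>\n"
    )


if __name__ == "__main__":
    # Hypothetical board and topic links, for illustration only.
    print(build_plain_sitemap([
        ("General Discussion", "http://www.forum.com/index.php?board=2.0"),
        ("Topic 1", "http://www.forum.com/index.php?topic=1.0"),
    ]))
```

No theme, no header or footer, just a list of links, so the page stays small for the search engines.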

croniksoft

Hello my friend, I uploaded this file last month. I have tried everything to get Google to index my forum, but nothing works. It indexes my main page, but that's about it; it doesn't index the forum.

I have SMF RC2 with MKPortal installed.

My website: www.cis-elite.org
My sitemap: http://www.cis-elite.org/forum/sitemaps/sitemaps.php

Thanks for any help.

P.S. I speak Spanish, Dav.
i have a dream and you could be part of it


statornic

Your links don't work at all.

http://www.cis-elite.org/forum/sitemaps/home.html <- broken link
http://www.cis-elite.org/forum/sitemaps/0.html <- also broken
http://www.cis-elite.org/forum/sitemaps/1.html <- also broken
http://www.cis-elite.org/forum/sitemaps/2.html <- etc.

I don't know why. Are you somehow including this sitemap in an index page with "include" or "require"?
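
For anyone who wants to check their own sitemap pages the same way, here is a minimal sketch in Python. It assumes the page naming seen above (home.html, then 0.html, 1.html, ...); the helper names are made up for illustration and nothing here comes from the mod itself.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrives."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a missing page
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


def sitemap_page_urls(base, last_page):
    """List home.html plus 0.html..last_page.html under the sitemaps folder."""
    base = base.rstrip("/")
    return [f"{base}/home.html"] + [f"{base}/{n}.html" for n in range(last_page + 1)]


def broken_links(urls, fetch=http_status):
    """Return the URLs that do not answer with a plain 200."""
    return [url for url in urls if fetch(url) != 200]


if __name__ == "__main__":
    pages = sitemap_page_urls("http://www.cis-elite.org/forum/sitemaps", 2)
    for url in broken_links(pages):
        print("broken:", url)
```

The `fetch` argument is only there so the check can be tried without touching the network; some servers also reject HEAD requests, in which case a normal GET would be needed.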

Prasad007

I'm having a problem.
When I used this mod earlier, it worked:
the URLs used to be something.html for all topics and so on.
Now they are just the normal topicno.0, as usual.
The .htaccess hasn't worked,
and I have waited over a month now.
Earlier it happened within 4 or 5 days.
This is what I get in Google:


please help!

croniksoft

I don't know, I put everything in the right spot.

What should I do? How do I make 0.html?

:P This is crazy.


jonks

Surely this could all be made a whole lot less complicated.

The Google sitemap works fine.

Your Yahoo URL list does not work because Yahoo will only accept it if the file extension is .txt. However, Yahoo will read the Google sitemap, so submit that instead.
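
If you do want a separate list for Yahoo, a plain text file with one URL per line is all it takes. A minimal Python sketch, assuming that format; the file name `urllist.txt` and the function name are my own choices for illustration, not something the mod produces:

```python
from pathlib import Path


def write_url_list(urls, path="urllist.txt"):
    """Write one URL per line to a .txt file and return the file's path."""
    path = Path(path)
    # The point made above: the list is only accepted with a .txt extension.
    if path.suffix != ".txt":
        raise ValueError("the URL list must be a .txt file")
    lines = [url.strip() for url in urls if url.strip()]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical forum URLs, for illustration only.
    write_url_list([
        "http://www.forum.com/index.php?board=2.0",
        "http://www.forum.com/index.php?topic=1.0",
    ])
```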

The HTML sitemap doesn't work because of the session IDs in the URLs. I don't care how many people think that Google can crawl those links; it can't. The static URLs that SMF creates using the .htaccess file are not crawlable because of the session IDs. This is how Google and other search engines see the URLs: www.forum.com/index.php?PHPSESSID=9527fe761b85cf3b43376d0d18154073&board=117.0 and because the sitemap created by this mod produces that kind of URL, it is not worth having, since it is no easier to crawl than the site itself.

What's needed is a way to turn off session IDs for non-members, or a sitemap with HTML links and NO session IDs. I'm no PHP expert so I couldn't do it, but if someone did, it would solve so many people's problems.

All of the other forum scripts have mods to remove session IDs for non-members; I so wish SMF did too.

If you doubt what I say about Google not being able to crawl the links in SMF, then consider this...

The SMF forum's main URL, http://www.simplemachines.org/community/, is a PR5 page. Google will most likely visit this page every day, yet none of the first-level boards are in the Google index. Take the board "General Discussion and Feedback", for instance: Google has no cache of it at all... http://72.14.203.104/search?sourceid=navclient&ie=UTF-8&rls=GGLG,GGLG:2006-09,GGLG:en&q=cache:http%3A%2F%2Fwww.simplemachines.org%2Fcommunity%2Findex.php%3Fboard%3D2.0 If it didn't have the session ID in it and Google actually read this URL as http://www.simplemachines.org/community/index.php?board=2.0, it would be snapped up within a day.

The reason the SMF website gets loads of visitors is that we all have SMF forums with links pointing to the SMF website. This gives SMF high PageRank with some nice relevant anchor text in the links, so the SMF website features well for relevant searches.

We, however, are not in this position, and without some changes to SEO on SMF it is very doubtful we will do well on Google or any other search engine, for that matter.
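
To make the "NO session IDs" idea concrete, here is a small sketch of stripping the PHPSESSID parameter from a link before it goes into a sitemap. It is Python and only an illustration under my own assumptions (the function name is made up, and it only handles "&"-separated query strings), not how SMF or the mod does it:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_session_id(url, param="PHPSESSID"):
    """Return url with the session parameter removed from its query string."""
    parts = urlsplit(url)
    # Keep every query parameter except the session ID, in original order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    dirty = ("http://www.forum.com/index.php"
             "?PHPSESSID=9527fe761b85cf3b43376d0d18154073&board=117.0")
    print(strip_session_id(dirty))
    # prints http://www.forum.com/index.php?board=117.0
```

Every visitor, bot or not, then sees the same address for the same board, which is the whole point.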

Kindred

Jonks...

I'm not really sure what you are talking about since I don't actually have SessionIDs in my urls here at SMF nor at my site...   and they are not in my Sitemap either...

Prasad007

Quote from: jonks on March 10, 2006, 11:51:16 AM
Surely this could all be made a whole lot less complicated.

The Google sitemap works fine.

Your Yahoo URL list does not work because Yahoo will only accept it if the file extension is .txt. However, Yahoo will read the Google sitemap, so submit that instead.

The HTML sitemap doesn't work because of the session IDs in the URLs. I don't care how many people think that Google can crawl those links; it can't. The static URLs that SMF creates using the .htaccess file are not crawlable because of the session IDs. This is how Google and other search engines see the URLs: www.forum.com/index.php?PHPSESSID=9527fe761b85cf3b43376d0d18154073&board=117.0 and because the sitemap created by this mod produces that kind of URL, it is not worth having, since it is no easier to crawl than the site itself.

What's needed is a way to turn off session IDs for non-members, or a sitemap with HTML links and NO session IDs. I'm no PHP expert so I couldn't do it, but if someone did, it would solve so many people's problems.

All of the other forum scripts have mods to remove session IDs for non-members; I so wish SMF did too.

If you doubt what I say about Google not being able to crawl the links in SMF, then consider this...

The SMF forum's main URL, http://www.simplemachines.org/community/, is a PR5 page. Google will most likely visit this page every day, yet none of the first-level boards are in the Google index. Take the board "General Discussion and Feedback", for instance: Google has no cache of it at all... http://72.14.203.104/search?sourceid=navclient&ie=UTF-8&rls=GGLG,GGLG:2006-09,GGLG:en&q=cache:http%3A%2F%2Fwww.simplemachines.org%2Fcommunity%2Findex.php%3Fboard%3D2.0 If it didn't have the session ID in it and Google actually read this URL as http://www.simplemachines.org/community/index.php?board=2.0, it would be snapped up within a day.

The reason the SMF website gets loads of visitors is that we all have SMF forums with links pointing to the SMF website. This gives SMF high PageRank with some nice relevant anchor text in the links, so the SMF website features well for relevant searches.

We, however, are not in this position, and without some changes to SEO on SMF it is very doubtful we will do well on Google or any other search engine, for that matter.

thanks for the info! :)
btw, simplemachines.org is Page Rank 7 ;)

jonks

Quote from: Kindred on March 10, 2006, 12:52:46 PM
Jonks...

I'm not really sure what you are talking about since I don't actually have SessionIDs in my urls here at SMF nor at my site...   and they are not in my Sitemap either...

Erm... Yes, you do.

Put any of your pages into the spider simulator at Webconfs and you will see exactly how a spider views your links: http://www.webconfs.com/search-engine-spider-simulator.php Try putting in http://www.simplemachines.org/community/




jonks

Quote from: Prasad007 on March 10, 2006, 01:05:42 PM
thanks for the info! :)
btw, simplemachines.org is Page Rank 7 ;)


If you read my post, I was not talking about the main page at simplemachines.org, which, as you say, IS a PR7...

I was talking about the forum main page at http://www.simplemachines.org/community/ which IS a PR5

You don't have to take what I say seriously and you can brush me off thinking I'm talking bull all you like, but I'm just trying to help. I've been looking into it and I'm just showing you my findings.

Perhaps I shouldn't bother?

TwinsX2Dad

Quote from: jonks on March 10, 2006, 01:46:48 PM
Perhaps I shouldn't bother?

You should bother, oh preacher of the church of the painful truth.

This is a very real issue and needs to be addressed if it is ever to be fixed.

jonks

Quote from: Kindred on March 10, 2006, 12:52:46 PM
Jonks...

I'm not really sure what you are talking about since I don't actually have SessionIDs in my urls here at SMF nor at my site...   and they are not in my Sitemap either...


Here is a small example of how search engines see the URLs linked to from the main FORUM page at simplemachines.org...

http://www.simplemachines.org/community/index.php?PHPSESSID=61af2bbc7ef910951990a7f9a9644c04&board=37.0
http://www.simplemachines.org/community/index.php?PHPSESSID=61af2bbc7ef910951990a7f9a9644c04&board=72.0
http://www.simplemachines.org/community/index.php?PHPSESSID=61af2bbc7ef910951990a7f9a9644c04&board=59.0

The search engines see this and do not crawl it - PHPSESSID=61af2bbc7ef910951990a7f9a9644c04&

The reason Google will not index these is that there are actually hundreds of Googlebots all trawling the net at the same time. Each of these bots will get a different session ID when visiting an SMF forum, which would cause the Googlebots to index the page several times, thinking it is a different page each time. So Google ignores them.

From the words of Googleguy over at Webmasterworld...

"So what's the problem with a session id, and why doesn't Googlebot crawl them? Well, we don't just have one machine for crawling. Instead, there are lots of bot machines fetching pages in parallel. For a really large site, it's easily possible to have many different machines at Google fetch a page from that site. The problem is that the web server would serve up a different session-id to each machine! That means that you'd get the exact same page multiple times--only the url would be different. It's things like that which keep some search engines from crawling dynamic pages, and especially pages with session-ids.

Google can do some smart stuff looking for duplicates, and sometimes inferring about the url parameters, but in general it's best to play it safe and avoid session-ids whenever you can"


And from Google's own webmaster guidelines page...

"Allow search bots to crawl your sites without session IDs or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of your site, as bots may not be able to eliminate URLs that look different but actually point to the same page."

Some more information on the subject...

http://www.stephanspencer.com/archives/2004/06/25/spiders-like-googlebot-choke-on-session-ids/
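
The kind of rule the guidelines are asking for could be sketched like this. It is a Python illustration only: the function names and the short list of crawler tokens are my own assumptions, and a real forum would need its own logic in PHP.

```python
# Substrings assumed to identify a few well-known crawlers in the User-Agent
# header. This list is illustrative, not complete.
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")


def looks_like_crawler(user_agent):
    """Guess from the User-Agent string whether the visitor is a search bot."""
    agent = (user_agent or "").lower()
    return any(token in agent for token in CRAWLER_TOKENS)


def should_add_session_id(user_agent, is_member, has_session_cookie):
    """Decide whether links on the page should carry a session ID."""
    if has_session_cookie:
        return False  # the cookie already carries the session
    if looks_like_crawler(user_agent):
        return False  # bots should always see clean, stable URLs
    # Guests get clean URLs too; only logged-in members without cookies
    # fall back to a session ID in the link.
    return is_member


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(should_add_session_id(bot, is_member=False, has_session_cookie=False))
    # prints False
```

With a rule like this, every bot machine gets the same URL for the same page, so there is nothing for the search engine to de-duplicate.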



coolparth

Hello, I run SMF integrated with Mambo using hacks from Joomla Hacks. Is this mod available for SMF in wrapped mode?

Parth

Kindred

I know it's a long thread... but you really should read through it... especially since right in the middle there was a question regarding this very topic and a post with the modified sitemap attached...

http://www.simplemachines.org/community/index.php?topic=59676.msg457116#msg457116

statornic

In the recommendations I've read that, if possible, putting the links for 0.html and 1.html into the template is a very good idea. What happens if I put in all of the generated sitemap links (0-15)?

http://forum.crestini.com/index.php?action=search
