Simple Machines Community Forum

SMF Support => Server Performance and Configuration => Topic started by: Roph on February 13, 2008, 08:11:37 PM

Title: Search index creation woes
Post by: Roph on February 13, 2008, 08:11:37 PM
As my forum has continued to grow and search performance continued to degrade, today I thought I'd recreate the (custom) search index, but this time on the large setting instead of moderate.

Removing the custom index worked like a charm, and I went though the processes of starting to create a large index.

As those of you with larger forums may know, this takes absolutely ages and is very boring :P, so I left it going and went off for a while. When I came back, I was met with the admin login screen. Maybe my session timed out or something? No worries anyway, I know SMF lets you resume, so I went back to Admin -> Search -> Search Method and clicked resume.

All I get (the same result on many tries) is just an offer to download index.php though  ::).

From here, I thought I'd try removing how far it'd gone in creating it and starting over. Removing it seemed to work ok, but now I'm at my problem position:

(https://www.simplemachines.org/community/proxy.php?request=http%3A%2F%2Fimg161.imageshack.us%2Fimg161%2F4232%2Fborkedhn1.th.png&hash=63526e009a91b943dc737a0dbe499b25b4b79bae) (http://img161.imageshack.us/img161/4232/borkedhn1.png)

This is what I'm greeted with now. Clicking on [remove custom index] does nothing except basically refresh the page. The option still stays there. Chosing [resume] just offers me a download of index.php, every time without fail.

Um,  :'(
Title: Re: Search index creation woes
Post by: Bigguy on February 15, 2008, 03:16:26 AM
If your asked to download an index.php file  that is a server side issue and your host should probably be contacted to see what is going on.
Title: Re: Search index creation woes
Post by: Roph on February 22, 2008, 06:43:01 AM
I'm the host, pretty much. I run my own VPS.

Sources/ManageSearch.php is a little cryptic under examination  :(

There is no "smf_log_search_words" table in my database.


Pretty much, I just want to restart creating the index, but SMF refuses to think that there isn't one, and will only let me either try to delete or resume a non-existant index (both of which obviously fail). Any help?  :'(

Title: Re: Search index creation woes
Post by: Ben_S on February 22, 2008, 07:02:27 AM
Empty any log_search tables and see what happens.
Title: Re: Search index creation woes
Post by: Roph on February 22, 2008, 09:15:14 AM
Tried both emptying and then dropping the log_search tables, got the same result as is in the first post each time. :(
Title: Re: Search index creation woes
Post by: Sarge on February 22, 2008, 12:59:36 PM
Go to Admin > Forum Maintenance and run the first 5 operations.

Title: Re: Search index creation woes
Post by: Roph on February 22, 2008, 01:41:52 PM
Done, still exactly the same  :(

Incase needed or relevant, phpinfo (http://slyph.org/info.php) / status (http://rmrk.net/status.php).
Title: Re: Search index creation woes
Post by: Roph on February 25, 2008, 02:43:50 PM
Bump?  :'(

I'm currently running on no index at all, and searches not only consume enormous amounts of my VPS' resources, they take over a minute for some queries  :(
Title: Re: Search index creation woes
Post by: Rumbaar on February 28, 2008, 05:22:26 PM
Have you checked to see if the Table is actually stored as Type = MyISAM for those particular tables?
Title: Re: Search index creation woes
Post by: Roph on February 28, 2008, 11:44:53 PM
Every single table in the DB is in MyISAM :(
Title: Re: Search index creation woes
Post by: Rumbaar on February 28, 2008, 11:53:24 PM
I'm not sure what is refers too, but what is the body field in the smf_messages type set too?  (TEXT?)
Title: Re: Search index creation woes
Post by: Roph on February 29, 2008, 12:01:42 AM
In the structure of the messages table under body it's set to "MEDIUMTEXT" for type.

Should I try changing it to TEXT?

[edit] I backed up incase anything went wrong, set the field to TEXT and now SMF let me generate a fulltext index, which is working perfectly =)
Title: Re: Search index creation woes
Post by: Rumbaar on February 29, 2008, 02:00:19 AM
Yes, well you've solved it.

Just some additional info for you. TEXT type has a maximum character set of 65535 bytes, were as MEDIUMTEXT has something like 4 million bytes.  Hence the error you got.  It was either of the two, but as the standard for the BODY is TEXT I didn't think that first and most people don't change that.

The problem you might find is that if any posts were above 65535 bytes they will now have been truncated to that maximum.

EDIT: Correct incorrect information.
Title: Re: Search index creation woes
Post by: Sarge on February 29, 2008, 02:12:39 AM
Quote from: Rumbaar on February 29, 2008, 02:00:19 AM
Yes, well you've solved it.

Just some additional info for you. TEXT type has a maximum character set of 65535 characters, were as MEDIUMTEXT has something like 4 million characters.  Hence the error you got.  It was either of the two, but as the standard for the BODY is TEXT I didn't think that first and most people don't change that.

The problem you might find is that if any posts were above 65535 character they will now have been truncated to that maximum.

Actually, I think the limits are in bytes. Some character sets (UTF-8 for example) can use more than 1 byte per character, depending on the character, and that would further decrease the limit of characters in a post.

However, it should be pretty safe to say that English texts (excluding non-standard characters like "smart quotes", em—dash etc.) take up 1 byte per character.
Title: Re: Search index creation woes
Post by: Rumbaar on February 29, 2008, 03:04:44 AM
Sorry about that, yes it's bytes and not characters, and yes utf8 characters can take up to three bytes depending on the language.

Thx Sarge.
Title: Re: Search index creation woes
Post by: Roph on February 29, 2008, 06:59:24 AM
My maximum allowed post size is set at 40,000 characters, so I think I'm safe in that regard. Thanks for the help & Info =)
Title: Re: Search index creation woes
Post by: Roph on February 29, 2008, 06:28:14 PM
Ok, here's an update. I just had to use a backup from the 27th, due to a severe table crash of the messages table today(29th). Attempts at repairing failed and actually brought down my entire VPS.

I'm running fine now on the 2 days old backup, with the field set back on "MEDIUMTEXT", and I hope you'll forgive me for not being too keen on trying out TEXT again.

I'd still like to just use the custom index option. My old custom index (on the moderate setting) performed acceptably I suppose, I was just hoping to shoot for better performance with a large index, though the creation failed. Read my first post again I guess. I'd like to just re-start creating a custom index. :(
Title: Re: Search index creation woes
Post by: Rumbaar on February 29, 2008, 06:33:58 PM
Well the BODY field in a default install of SMF is set to TEXT, so I don't see how that was the cause of your issue.  Do you know how and why it was changed to MEDIUMTEXT in your install?

With the field set higher than 65535 (which is MEDIUMTEXT) I don't know how you're going to get past that error message and how you were able to create one in the past.
Title: Re: Search index creation woes
Post by: Roph on February 29, 2008, 06:47:05 PM
The database has quite a story to tell - Invision powerboard originally, then phpBB2, then SMF 1.0 and finally SMF 1.1. It's been around the block, so to speak :P

I've never edited anything in the db "raw" before, so I can only assume it's a side effect of one of the conversions.

Is a custom index limited by TEXT/MEDIUMTEXT also then? Up until I removed it to attempt to create the large one, my moderate sized custom index was functioning properly, just showing signs of slowness on large queries or at busy times.
Title: Re: Search index creation woes
Post by: Rumbaar on February 29, 2008, 07:00:06 PM
Ah, I'm not sure how those conversion would affect that field.  I'm sure at one point it was MEDIUMTEXT from a previous forum software and was just retained.

I can't say I'm too familiar with process behind standard/custom indexing, so can't comment on the affect of each type.
Title: Re: Search index creation woes
Post by: Roph on March 11, 2008, 05:14:25 AM
Going to have to bump, is there no way I can re-start creating a custom index?

I'm back to my problem as I explained it in the first post, so please read that.
Title: Re: Search index creation woes
Post by: Rumbaar on March 15, 2008, 08:40:23 PM
I'm still at a loss to how you were able to create an index in the first place, with the field set to MEDIUMTEXT.  But then I'm not sure of how it might have turned from just TEXT to MEDIUMTEXT.  From the error message it indicated it is in fact limited to TEXT property.
Title: Re: Search index creation woes
Post by: Roph on March 20, 2008, 05:27:01 PM
Hmm.

Is there a query I can run on the database to make sure that none of my messages are over the size limitation of TEXT?

As said, my post size limit is 40,000, though I know it hasn't always been that. If there are any over 65535, the likely culprits will be scripts (game creation scripting language boards), and if they were cut off it would render them useless.
Title: Re: Search index creation woes
Post by: H on March 29, 2008, 10:21:22 AM
Hi Silverline,

Did you manage to sort this out?

If not I don't know if you can achieve this with just SQL but I imagine you should write a php batch script to grab all the messages and then count each one.
Title: Re: Search index creation woes
Post by: NightHorse on March 30, 2008, 05:12:34 AM
Hi, i have the same problem !
the table "smf_messages is in "text" " so I think there would be something else..

I was successfully using custom index, but when server crashed down (after restart) I had the same problems like silverline also I couldn't create it anymore. Now i'm using fulltext index but forum is always crashing.. sometimes cannot be run via watchdog and so we must go to telehouse and start it manually..
Title: Re: Search index creation woes
Post by: H on March 30, 2008, 07:15:37 AM
Quote
I was successfully using custom index, but when server crashed down (after restart) I had the same problems like silverline also I couldn't create it anymore. Now i'm using fulltext index but forum is always crashing.. sometimes cannot be run via watchdog and so we must go to telehouse and start it manually..

So you can't even recover the server via SSH? SMF shouldn't cause the whole server to crash or reboot :)
Title: Re: Search index creation woes
Post by: orange on May 03, 2008, 03:35:18 PM
Quote from: Silverline on February 13, 2008, 08:11:37 PM
I went though the processes of starting to create a large index.

As those of you with larger forums may know, this takes absolutely ages and is very boring :P, so I left it going and went off for a while. When I came back, I was met with the admin login screen. Maybe my session timed out or something?

I'm trying to create a search index at the moment, and yes it's painfully slow. But the worst part is that I can't just leave it overnight, because once an hour it stops and prompts for my admin password, though I'm still logged in (I presume the admin session is timing out or something).

Is there a fix for this? For me to remember to come back to the thing and type in my password every hour, in an operation that looks like it's going to take about 24 hours, is a painful prospect...

Alternatively, is there a query script to generate the index that I could run manually in MySQL while taking the forum offline, so that I don't have to keep waiting for this page to refresh itself 5000 times and finally get to 100%?
Title: Re: Search index creation woes
Post by: Rumbaar on May 03, 2008, 08:30:40 PM
I'm not sure if it will allow for the whole process to continue, but you can temporarily disable 'admin security' while you run this process and it shouldn't prompt for the admin password after the default timeout period for admin security.
Title: Re: Search index creation woes
Post by: Sarge on May 04, 2008, 01:51:59 AM
Quote from: orange on May 03, 2008, 03:35:18 PM
I'm trying to create a search index at the moment, and yes it's painfully slow. But the worst part is that I can't just leave it overnight, because once an hour it stops and prompts for my admin password, though I'm still logged in (I presume the admin session is timing out or something).

Is there a fix for this?

Quote from: Rumbaar on May 03, 2008, 08:30:40 PM
I'm not sure if it will allow for the whole process to continue, but you can temporarily disable 'admin security' while you run this process and it shouldn't prompt for the admin password after the default timeout period for admin security.

Yes, creation of a search index should complete automatically if "Disable administration security" is checked from Admin > Features and Options > Basic Features tab.

Once I was creating an index in a large forum and SMF prompted me for the admin password every hour or so; I just disabled it and the index creation finished by itself after that.
Title: Re: Search index creation woes
Post by: orange on May 04, 2008, 06:14:39 PM
Thanks guys, that should do the trick!