News:

Wondering if this will always be free?  See why free is better.

Main Menu

Search index creation woes

Started by Roph, February 13, 2008, 08:11:37 PM

Previous topic - Next topic

Roph

As my forum has continued to grow and search performance continued to degrade, today I thought I'd recreate the (custom) search index, but this time on the large setting instead of moderate.

Removing the custom index worked like a charm, and I went though the processes of starting to create a large index.

As those of you with larger forums may know, this takes absolutely ages and is very boring :P, so I left it going and went off for a while. When I came back, I was met with the admin login screen. Maybe my session timed out or something? No worries anyway, I know SMF lets you resume, so I went back to Admin -> Search -> Search Method and clicked resume.

All I get (the same result on many tries) is just an offer to download index.php though  ::).

From here, I thought I'd try removing how far it'd gone in creating it and starting over. Removing it seemed to work ok, but now I'm at my problem position:



This is what I'm greeted with now. Clicking on [remove custom index] does nothing except basically refresh the page. The option still stays there. Chosing [resume] just offers me a download of index.php, every time without fail.

Um,  :'(

Bigguy

If your asked to download an index.php file  that is a server side issue and your host should probably be contacted to see what is going on.

Roph

#2
I'm the host, pretty much. I run my own VPS.

Sources/ManageSearch.php is a little cryptic under examination  :(

There is no "smf_log_search_words" table in my database.


Pretty much, I just want to restart creating the index, but SMF refuses to think that there isn't one, and will only let me either try to delete or resume a non-existant index (both of which obviously fail). Any help?  :'(


Ben_S

Empty any log_search tables and see what happens.
Liverpool FC Forum with 14 million+ posts.

Roph

Tried both emptying and then dropping the log_search tables, got the same result as is in the first post each time. :(

Sarge

Go to Admin > Forum Maintenance and run the first 5 operations.


    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Roph

Done, still exactly the same  :(

Incase needed or relevant, phpinfo / status.

Roph

Bump?  :'(

I'm currently running on no index at all, and searches not only consume enormous amounts of my VPS' resources, they take over a minute for some queries  :(

Rumbaar

Have you checked to see if the Table is actually stored as Type = MyISAM for those particular tables?
"An important reward for a job well done is a personal sense of worthwhile achievement."

[ Themes ]

Roph

Every single table in the DB is in MyISAM :(

Rumbaar

I'm not sure what is refers too, but what is the body field in the smf_messages type set too?  (TEXT?)
"An important reward for a job well done is a personal sense of worthwhile achievement."

[ Themes ]

Roph

#11
In the structure of the messages table under body it's set to "MEDIUMTEXT" for type.

Should I try changing it to TEXT?

[edit] I backed up incase anything went wrong, set the field to TEXT and now SMF let me generate a fulltext index, which is working perfectly =)

Rumbaar

#12
Yes, well you've solved it.

Just some additional info for you. TEXT type has a maximum character set of 65535 bytes, were as MEDIUMTEXT has something like 4 million bytes.  Hence the error you got.  It was either of the two, but as the standard for the BODY is TEXT I didn't think that first and most people don't change that.

The problem you might find is that if any posts were above 65535 bytes they will now have been truncated to that maximum.

EDIT: Correct incorrect information.
"An important reward for a job well done is a personal sense of worthwhile achievement."

[ Themes ]

Sarge

Quote from: Rumbaar on February 29, 2008, 02:00:19 AM
Yes, well you've solved it.

Just some additional info for you. TEXT type has a maximum character set of 65535 characters, were as MEDIUMTEXT has something like 4 million characters.  Hence the error you got.  It was either of the two, but as the standard for the BODY is TEXT I didn't think that first and most people don't change that.

The problem you might find is that if any posts were above 65535 character they will now have been truncated to that maximum.

Actually, I think the limits are in bytes. Some character sets (UTF-8 for example) can use more than 1 byte per character, depending on the character, and that would further decrease the limit of characters in a post.

However, it should be pretty safe to say that English texts (excluding non-standard characters like "smart quotes", em—dash etc.) take up 1 byte per character.

    Please do not PM me with support requests unless I invite you to.

http://www.zeriyt.com/   ~   http://www.galeriashqiptare.net/


Quote
<H> I had zero posts when I started posting

Rumbaar

Sorry about that, yes it's bytes and not characters, and yes utf8 characters can take up to three bytes depending on the language.

Thx Sarge.
"An important reward for a job well done is a personal sense of worthwhile achievement."

[ Themes ]

Roph

My maximum allowed post size is set at 40,000 characters, so I think I'm safe in that regard. Thanks for the help & Info =)

Roph

Ok, here's an update. I just had to use a backup from the 27th, due to a severe table crash of the messages table today(29th). Attempts at repairing failed and actually brought down my entire VPS.

I'm running fine now on the 2 days old backup, with the field set back on "MEDIUMTEXT", and I hope you'll forgive me for not being too keen on trying out TEXT again.

I'd still like to just use the custom index option. My old custom index (on the moderate setting) performed acceptably I suppose, I was just hoping to shoot for better performance with a large index, though the creation failed. Read my first post again I guess. I'd like to just re-start creating a custom index. :(

Rumbaar

Well the BODY field in a default install of SMF is set to TEXT, so I don't see how that was the cause of your issue.  Do you know how and why it was changed to MEDIUMTEXT in your install?

With the field set higher than 65535 (which is MEDIUMTEXT) I don't know how you're going to get past that error message and how you were able to create one in the past.
"An important reward for a job well done is a personal sense of worthwhile achievement."

[ Themes ]

Roph

The database has quite a story to tell - Invision powerboard originally, then phpBB2, then SMF 1.0 and finally SMF 1.1. It's been around the block, so to speak :P

I've never edited anything in the db "raw" before, so I can only assume it's a side effect of one of the conversions.

Is a custom index limited by TEXT/MEDIUMTEXT also then? Up until I removed it to attempt to create the large one, my moderate sized custom index was functioning properly, just showing signs of slowness on large queries or at busy times.

Rumbaar

Ah, I'm not sure how those conversion would affect that field.  I'm sure at one point it was MEDIUMTEXT from a previous forum software and was just retained.

I can't say I'm too familiar with process behind standard/custom indexing, so can't comment on the affect of each type.
"An important reward for a job well done is a personal sense of worthwhile achievement."

[ Themes ]

Advertisement: