Vbulletin2 to SMF conversion Slowness and large messages.MYI creation

Started by Valgar, March 20, 2006, 05:20:51 PM

Previous topic - Next topic

Valgar

We've been working on converting an aging vbulletin 2 system on mysql 3.23 to SMF running on mysql 4.X over the past few weeks.

At the moment, we've migrated the vbulletin2 data to the new database, but have been running into some interesting issues running the vbulletin_to_smf conversion script.

1) The conversion seems excessively slow, the database is housed on a rather robust multiprocessor 3 ghz system, with plenty of memory, and excellent IO. Exports and imports of raw data are blindingly fast, as are accesses via the old forum software, but running the conversion script between the two instances runs for 2-3 days (before failing, explained in 2 ). Has anyone else encountered a slow conversion such as this? The current DB contains about 2.3 million posts from the old vbulletin 2 software. Doing a raw export/import of all this data takes approximately 10 minutes.

2) During the conversion process, the message conversion results in generation of an amazingly large smf_messages.MYI file. If left unattended, this will eventually fill up all available space before the conversion completes, thus killing the conversion (after about 2 days of thumb twiddling). To combat this, I run the following sql statement: mysql --host=XXXXXX --user=XXXXXX--password=XXXXXX--execute="optimize table smf_messages" XXXXXX This runs rather quickly, and reduces the smf_messages.MYI back to a reasonable level, of course it begins to rapidly increase in size again, so I loop the statement to run every 10 minutes or so (in 10 minutes, the MYI file is usually around 2-3 gigs in size, not horribly large, but I prefer to catch it before I have to pair it down from a 65gig monstrosity).

Any ideas would be greatly appreciated, a contingency plan is to manually execute each step in the script, and modify the messages conversion slightly to "keep state" so that I can run it over a period of a week or so, and just have it slurp in the newest messages without clearing the tables. With this method I could grind away during off hours, and eventually have a short downtime whilst the post message steps do the final conversion.

I'll go resume beating my head on the keyboard now. :D

Oldiesmann

I've never heard of a conversion taking two to three days. I would expect it to take several hours for a forum of that size, but not several days. Which version of SMF are you trying to convert to?

I would assume that the excessive size of the messages table is due to all the overhead created by inserting so much information in such a short period of time. Adding a query to optimize the table after each batch of messages gets converted should fix that though.
Michael Eshom
Christian Metal Fans

Valgar

Quote from: Oldiesmann on March 21, 2006, 01:47:51 PM
I've never heard of a conversion taking two to three days. I would expect it to take several hours for a forum of that size, but not several days. Which version of SMF are you trying to convert to?

We are converting from Vbulletin 2.X (2.2.7 to be precise), to SMF 1.0.6.

Quote from: Oldiesmann on March 21, 2006, 01:47:51 PM
I would assume that the excessive size of the messages table is due to all the overhead created by inserting so much information in such a short period of time. Adding a query to optimize the table after each batch of messages gets converted should fix that though.

Basically that is what has been done at this point, it keeps the MYI size down, which is a good thing.

I've noticed the mysql process seems to have high utilization during this process (the database is basically idle otherwise).

I've looked into adding thread concurrency to take advantage of the multiple cpus + hyperthreading, this at least seems to have increased the conversion speed, but only enough to put it shy of 2 days by 8 hours or so.

Attached is the my.cnf, I've been tempted to focus my attentions there now that the database is no longer sitting elsewhere on a crufty 3.23 install, I'm fairly new to mysql tweaking though, since I've come primarily from an oracle background..


max_connections=700
max_allowed_packet=32M
long_query_time=2
query_cache_limit=2M
query_cache_size=48M
query_cache_type=1
table_cache=640
safe-show-database
sort_buffer=2M
myisam_sort_buffer_size=4M

interactive_timeout=300
skip-locking
wait_timeout=100
connect_timeout=7
thread_cache_size=80
key_buffer_size=24M
join_buffer_size=4M


# Data files must be able to hold your data and indexes.
# Make sure that you have enough free disk space.

#innodb_data_file_path=ibdata1:100M:autoextend:max:500M

#
# Set buffer pool size to 50-80% of your computer's memory

innodb_buffer_pool_size=384M
innodb_additional_mem_pool_size=20M



#
# Set the log file size to about 25% of the buffer pool size
innodb_log_group_home_dir=/var/log/iblogs
innodb_log_arch_dir=/var/log/iblogs
#innodb_log_file_size=250M
#innodb_log_buffer_size=10M
innodb_flush_log_at_trx_commit=1

# Added dslavik 3/15/06 for import testing
thread_concurrency=8
log_slow_queries=/var/log/mysqld.slow.log
long_query_time=2




[mysql.server]
user=mysql
basedir=/var/lib
[mysqld]
#log=/var/log/mysqld-query.log
log-error=/var/log/mysqld-error.log
[mysqld_safe]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid


Valgar

I think I might have found the problem that causes the slowdown.

Initial migration of messages is very rapid, and calculating based of how quickly it proceeds gives me an estimate of about 5 hours to convert 2.3 million posts.

As the migration continues, the migration slows down, to the point of it only converting about 1000 posts every  minute and 40 seconds.

What I noticed is that after EACH iteration, ALTER TABLE foo ENABLE KEYS is called. This runs quickly when the smf_messages table is small, but as it grows it takes an inordinate amount of time. By disabling this call (and assuming I will be doing a final ENABLE KEYS call at the end), the migration speeds up to about 20 seconds for each 1000 posts (down from the 1:40).

Assuming this completes properly (I'll post up if it does), users migrating several million posts may want to consider disabling the ENABLE KEYS sql until the end of the migration.

Valgar

Update:

Removing the ALTER TABLE foo ENABLE KEYS during message conversion resulted in a speed increase on the order of 5 times, and allowed the conversion to complete. Afterwards I manually ran ALTER TABLE smf_messages ENABLE KEYS and all was well.

Another problem came up during Poll migration, while the query that runs will eventually return data, it appears that the script does nothing with that data, and hangs. After truncating the source poll tables, it proceeded through the rest of the migration without fail.

I'll look into the Poll conversion and see if there is anything I can do with it. Has anyone else run into problems at this stage? From the comments in the script it appears "experimental" for converting polls and poll vote counts.

Aside from the poll problem, the conversion worked well, and so far everything looks good on the converted forum. I'll run the conversion several more times to ensure I'm not missing anything.


Advertisement: