SMF made mysql stuck

Started by b3dm4n, March 23, 2010, 06:00:06 AM

Previous topic - Next topic

b3dm4n

We need help tweak mysql settings to meets SMF requirements
We recently move our forum to new dedicated server, its an 8 cores Xeon with 4Gb Ram
We planned to use it for multiple sites, so we installed ISPconfig3 on Ubuntu 9.10.

Currently the server only have our forum in it, but when we bring it out of maintenance mode, the server became slow and keep getting slower and it took a long time to run command via ssh. when I ran mtop it looked like some queries never end. the highest time is 3000 secs and still counting.

so whenever we brought smf out of maintenance mode, we cant browse to the server, not even phpmyadmin or the ISPConfig control panel, or even just to show status.php.

but, the top and htop command is not showing any load on the processor, and the memory usage is only at 55%

we tried the steps on the checklist with no luck, even after we convert to innoDB, therefore we altered back to myISAM

What is wrong with our forum?
Is there mysql settings that we should tweak?
our current my.cnf settings:


connect_timeout         = 10
interactive_timeout     = 20
join_buffer_size        = 2M
read_buffer_size        = 2M
sort_buffer_size        = 3M
key_buffer_size         = 256M
max_allowed_packet      = 16M
thread_stack            = 128K
thread_cache_size       = 128
read_rnd_buffer_size    = 524288
bulk_insert_buffer_size = 8M
myisam-recover         = BACKUP
myisam_sort_buffer_size = 64M
max_connections        = 400
max_user_connections   = 300
table_cache            = 1800
tmp_table_size         = 64M
max_heap_table_size     = 64M
max_allowed_packet      = 16M
max_connect_errors      = 10M
thread_concurrency     = 4
open_files_limit       = 1600
wait_timeout           = 100
low_priority_updates   = 1
concurrent_insert      = 2
default-storage-engine         = MyISAM

#
# * Query Cache Configuration
query_cache_limit       = 3M
query_cache_size        = 192M
query_cache_type        = 1
query_prealloc_size     = 16384
query_alloc_block_size  = 16384
query_cache_min_res_unit = 2K
long_query_time         = 2


additional detail about our smf forum
version: 1.1.11
4519860 Posts in 22371 Topics by 140926 Members

here are result of status.php when smf in maintenance mode (I dont think it shows the real usage)

Connections per second: 3.6293
Kilobytes received per second: 14.9742
Kilobytes sent per second: 125.99
Queries per second: 51.9093
Percentage of slow queries: 0.0007
Opened vs. Open tables:
(table_cache) 3.6469 (should be <= 80)
Key buffer read hit rate:
(key_buffer_size) 0.0017 (should be <= 0.01)
Key buffer write hit rate:
(key_buffer_size) 0.0146 (should be <= 0.5)
Thread cache hit rate:
(thread_cache_size) 30.9318 (should be >= 30 )
Thread cache usage:
(thread_cache_size) 0.1172 (should be >= 0.7 and <= 0.9)
Temporary table disk usage:
(tmp_table_size) 0.1844 (should be <= 0.5)
Sort merge pass rate:
(sort_buffer) 0 (should be <= 0.001)
Query cache enabled:
(query_cache_type) 1 (should be >= 1 and <= 1)
Query cache miss rate:
(query_cache_limit) 0.4766 (should be <= 0.5)
Query cache prune rate:
(query_cache_size) 0 (should be <= 0.05)


please enlight us ... :)


PS. Sorry for my english, its not our native language

edited 1 : mark as solved

chep

So you already tried tuning with ?

http://blog.mysqltuner.com/

What does your slow queries log say ?

b3dm4n

Quote from: chep on March 23, 2010, 04:22:11 PM
So you already tried tuning with ?

http://blog.mysqltuner.com/

What does your slow queries log say ?

We tuned it based on mysqltuner and tuning-primer script recomendation, and all said nothing is missconfigure.

here is part of our slow-log queries, but I don't see the one that still running (600 secs now and still counting up), I guess it wont be in the log if its not finished yet

# User@Host: c1mykapal[c1mykapal] @ localhost []
# Query_time: 188.097176  Lock_time: 0.000161 Rows_sent: 253  Rows_examined: 253
SET timestamp=1269401443;
SELECT variable, value
                        FROM `c1lautan2010`.smf_settings;
# User@Host: c1mykapal[c1mykapal] @ localhost []
# Query_time: 181.704412  Lock_time: 0.000135 Rows_sent: 253  Rows_examined: 253
SET timestamp=1269401443;
SELECT variable, value
                        FROM `c1lautan2010`.smf_settings;
# User@Host: c1mykapal[c1mykapal] @ localhost []
# Query_time: 75.040554  Lock_time: 0.000146 Rows_sent: 253  Rows_examined: 253
SET timestamp=1269401443;
SELECT variable, value
                        FROM smf_settings;
# User@Host: c1mykapal[c1mykapal] @ localhost []
# Query_time: 183.286452  Lock_time: 0.000102 Rows_sent: 253  Rows_examined: 253
SET timestamp=1269401443;
SELECT variable, value
                        FROM smf_settings;
# User@Host: c1mykapal[c1mykapal] @ localhost []
# Query_time: 148.983798  Lock_time: 0.000179 Rows_sent: 253  Rows_examined: 253
SET timestamp=1269401443;
SELECT variable, value
                        FROM `c1lautan2010`.smf_settings;
# Time: 100324 10:30:46
# User@Host: c1mykapal[c1mykapal] @ localhost []
# Query_time: 27.401758  Lock_time: 0.000355 Rows_sent: 72  Rows_examined: 1224
SET timestamp=1269402014;
SELECT COUNT(*) AS 'POSTCOUNT', smf_members.ID_MEMBER, smf_members.memberName, smf_membergroups.onlineColor
            FROM (smf_messages INNER JOIN smf_members ON smf_messages.ID_MEMBER = smf_members.ID_MEMBER) LEFT JOIN smf_membergroups ON smf_members.ID_GROUP = smf_membergroups.ID_GROUP
            WHERE (((smf_messages.ID_TOPIC)=2172))
            GROUP BY smf_members.ID_MEMBER ORDER BY POSTCOUNT DESC;
# User@Host: c1mykapal[c1mykapal] @ localhost []
# Query_time: 112.653830  Lock_time: 0.000348 Rows_sent: 20  Rows_examined: 608
SET timestamp=1269402221;
SELECT
                                t.ID_TOPIC, t.numReplies, t.locked, t.numViews, t.isSticky, t.ID_POLL,
                                IFNULL(lt.ID_MSG, IFNULL(lmr.ID_MSG, -1)) + 1 AS new_from,
                                t.ID_LAST_MSG, ml.posterTime AS lastPosterTime, ml.ID_MSG_MODIFIED,
                                ml.subject AS lastSubject, ml.icon AS lastIcon, ml.posterName AS lastMemberName,
                                ml.ID_MEMBER AS lastID_MEMBER, IFNULL(meml.realName, ml.posterName) AS lastDisplayName,
                                t.ID_FIRST_MSG, mf.posterTime AS firstPosterTime,
                                mf.subject AS firstSubject, mf.icon AS firstIcon, mf.posterName AS firstMemberName,
                                mf.ID_MEMBER AS firstID_MEMBER, IFNULL(memf.realName, mf.posterName) AS firstDisplayName,
                                LEFT(ml.body, 384) AS lastBody, LEFT(mf.body, 384) AS firstBody, ml.smileysEnabled AS lastSmileys,
                                mf.smileysEnabled AS firstSmileys
                        FROM (smf_topics AS t, smf_messages AS ml, smf_messages AS mf)
                                LEFT JOIN smf_members AS meml ON (meml.ID_MEMBER = ml.ID_MEMBER)
                                LEFT JOIN smf_members AS memf ON (memf.ID_MEMBER = mf.ID_MEMBER)
                                LEFT JOIN smf_log_topics AS lt ON (lt.ID_TOPIC = t.ID_TOPIC AND lt.ID_MEMBER = 131454)
                                LEFT JOIN smf_log_mark_read AS lmr ON (lmr.ID_BOARD = 106 AND lmr.ID_MEMBER = 131454)
                        WHERE t.ID_BOARD = 106
                                AND ml.ID_MSG = t.ID_LAST_MSG
                                AND mf.ID_MSG = t.ID_FIRST_MSG
                        ORDER BY isSticky DESC, ID_LAST_MSG DESC
                        LIMIT 0, 20;



Btw, I altered back some of tables to innoDB as stated in http://www.simplemachines.org/community/index.php?topic=50217.0, and smf still dragging the server slow, htop and top shows no load on the server :(

chep

Strange. I run some of those same queries through phpMyadmin and they return just a few hundred rows in short time.

'SELECT variable, value
FROM smf_settings
LIMIT 0 , 30;"

Showing rows 0 - 29 (383 total, Query took 0.0004 sec)

What version of SMF do you have? Mods?

If you run something like that query above it should really be fast. My guess is something could be wrong with the database. I wouldn't rule out a bad script either. I would double check all your files (perhaps even Load.php and all others) are up to date. However if you disable SMF and you can run that query above and it returns in less than a second (which is a long time) then your database might be ok. Or least that table anyway... Do you regularly optimize and also repaire your database tables?


Good luck.

b3dm4n

strange?? exactly!
thats why I post this thread :(

I 've never repaired or optimized the database before, yesterday I did them all: repair & optimize, and yet still no change, the smf still make mysql slow. last night I leave the forum on, and this morning I run mtop, the top ten of the list have more than 3000 secs and still counting.
is there any way to kill those overtime queries?

I'm using SMF 1.1.11
and the mods installed :
1.   HacksPark shoutBox   1.2.1   
2.   Ad Managment   2.3   
3.   SMF Arcade   2.0.17   
4.   Enhanced Quick Reply Box 2.0.2   2.0.2   
5.   Aeva ~ Auto-Embed Video & Audio   6.0.71   
6.   Pretty URLs   1.0RC   
7.   PageDropdown   1.1   
8.   Active Members In Topic   1.8   
9.   SMF 1.0.17 / 1.1.9 / 2.0 RC1 Update   1.0   
10.   SMF 1.0.18 / 1.1.10 / 2.0 RC1-2 Update   1.1   
11.   SMF 1.0.19 / 1.1.11 Update   1.0

when I ran query like you did, but in maintenance mode:
'SELECT variable, value
FROM smf_settings
LIMIT 0 , 30;"
Showing rows 0 - 29 (~301 total, Query took 0.0001 sec)

but when I run it after 2 minutes I put 0 in maintenance mode, my browser status showing "waiting for server" for more than 2 minutes and return :
Showing rows 0 - 29 (~301 total, Query took 58.6538 sec)

why?? please enlight me .. what to check :(

青山 素子

Seriously, something is up with MySQL. Doing a simple select of the settings table, which the above says is 253 rows, should never take over 10 seconds, let alone over 70.

The only things I can think of is a lot of disk contention, high server load, or severe MySQL misconfiguration.
Motoko-chan
Director, Simple Machines

Note: Unless otherwise stated, my posts are not representative of any official position or opinion of Simple Machines.


b3dm4n

Quote from: Motoko-chan on March 24, 2010, 11:26:59 PM
The only things I can think of is a lot of disk contention, high server load, or severe MySQL misconfiguration.

how to examine the disk contention?
the server load is nothing, nothing using the processor when I run htop
I posted my mysql configuration at the top of this thread, can u point me to the right direction?

chep

What does status.php look like?

b3dm4n

Quote from: chep on March 25, 2010, 12:36:14 AM
What does status.php look like?


screenshot of my status.php result attached

Ensiferous

If you have root access then login and issue a "iostat 1" command and check the iowait percentage, this is the percentage of time your CPU spend waiting on the HDD. Also the output of "vmstat 1" will tell you actual disk usage.
My Latest Blog Post: Debugging Nginx Errors

chep

#10
I wasn't interested in that part of status.php. I was more interested in the section at the top where it gives swap and memory usage. Anyway I am going to have to side with Motoko. I believe there is either a disk issue or your MySQL is configured improperly. That tip above looks like a good start. I would also consider starting with a "new" mysql configuration file in case there is an entry in the existing one which is unexpectedly causing the problem. You can always back up the existing one..

Something like that

Quote from: b3dm4n on March 23, 2010, 06:00:06 AM
additional detail about our smf forum
version: 1.1.11
4519860 Posts in 22371 Topics by 140926 Members

Please apply for access to the Big Forum group. You can do that by going here.

b3dm4n

Quote from: Ensiferous on March 25, 2010, 08:06:23 AM
If you have root access then login and issue a "iostat 1" command and check the iowait percentage, this is the percentage of time your CPU spend waiting on the HDD. Also the output of "vmstat 1" will tell you actual disk usage.

here is result of iostat 1 when the smf running:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.19    0.00    0.19    9.32    0.00   90.30

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.00         0.00         0.00          0          0
sdb             405.00      8880.00         0.00       8880          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.43    0.00    0.43   21.96    0.00   77.17

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.00         0.00         0.00          0          0
sdb             388.00      6792.00        72.00       6792         72

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.28   13.73    0.00   85.99

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               2.00         0.00       256.00          0        256
sdb             381.00      7456.00       112.00       7456        112


and here is result of vmstat 1:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
2  9    212 1690196  15364 1174556    0    0   728   331   75  109  2  1 88 10
0 10    212 1689608  15360 1171700    0    0  4588     0  790 1100  2  0 83 15
1 10    212 1687816  15388 1171788    0    0  3924    56  791 1133  0  1 84 15
0 11    212 1687272  15400 1169796    0    0  4076    20  832 1204  1  1 88 10
1 12    212 1687048  15416 1169456    0    0  4332    20  712  758  0  1 84 14
0 12    212 1687692  15444 1169100    0    0  3624    72  777 1104  1  0 90  9
0 12    212 1687584  15460 1169512    0    0  3640     0  629  713  2  1 79 18
1 12    212 1686620  15492 1169448    0    0  4396    52  868 1174  0  0 89 11
6 13    212 1686984  15512 1167572    0    0  4908     0  667  876  1  0 89 10


what can you tell from the result? I'm new to these commands :)

b3dm4n

Quote from: chep on March 25, 2010, 11:04:59 AM
I wasn't interested in that part of status.php. I was more interested in the section at the top where it gives swap and memory usage. Anyway I am going to have to side with Motoko. I believe there is either a disk issue or your MySQL is configured improperly. That tip above looks like a good start. I would also consider starting with a "new" mysql configuration file in case there is an entry in the existing one which is unexpectedly causing the problem. You can always back up the existing one..

Sorry for misunderstood your request, here I attached the latest status.php I just capture few minutes ago.


I already tried to replace my.cnf with the original one, and still no effect :(

b3dm4n

my friend suspected that mysql is still indexing the tables on the background, therefore I tried to drop all indexes from all tables, and found 1 index in smf_messages table is corrupted with error index doesn't exist. I was kinda happy to found that, maybe this index is the source of our problem, after re-creating all index, the query still took a long time to finished, but I notice the longest is taking no more than 1500 secs, so it helped a bit.

still ... we want a fast forum
I run all command that phpmyadmin offer, analyzed, check, repair and optimize, but I still suspect that the database still contain corrupted index.
I want to drop the primary keys and recreate them just like the index tables, but phpmyadmin said it was not allowed. how do we drop and recreate primary keys? i suspect there are still corrupt database in those the primary keys index

I think we are getting somewhere



chep

#17
I would restore my database from a backup if possible.

Did you see anything suspicious in the mysql logs?

b3dm4n

our last backup was 2 weeks old, since then the data has been change a lot, and we don't want to dissapoint our members who has posts for the last 2 weeks, thats why we are trying to fix the current database

we disable bin log because we keep getting warning of statement logging or something, I forgot, I'll enable it on monday to see the exact warning.

mysql.err and mysql.log are empty.

chep

If you restore from a backup and it works "fine" then you know it's your db. It would pretty much rule out "code" bugs, sql tuning, etc.

You could always restore the known good backup to a second db and test it.

Advertisement: