News:

Bored?  Looking to kill some time?  Want to chat with other SMF users?  Join us in IRC chat or Discord

Main Menu

Unread posts and slow queries

Started by cookdandbombd, November 26, 2024, 07:00:20 PM

Previous topic - Next topic

cookdandbombd

I've been working a lot on optimising my forum and am finding the "unread topics" feature is showing up a lot in the mysql slow query log. I do have load balancing turned on to shut it down when needed.

QuoteRows_examined: 677359

Does the query have to do all that work? Is there any way to limit it?

SMF 2.1.4

shawnb61

#1
Not a lot of details to work on here, but I suspect the "Mark Boards as Read" scheduled task may help. 

It is not enabled by default, because it deletes a lot of old (likely unused) information.

It can be a LOT of rows...  SMF keeps track of what everyone has read & what they haven't read - all the way down to the board & topic level.  So, yeah, that can be a LOT of data...  Close to every user * every board...  The "Mark Boards as Read" scheduled task mimics folks going in & marking their boards as read, which cleans up a surprising number of records.

And further, for uses who haven't been around in a while, like a year or more, face it, everything is new to them.  So the "Mark Boards as Read" function just deletes all that read/unread history for those folks.  Why keep any of that detail around for them...  When they return, everything will appear unread.

Timeframes can be adjusted.  Look at the "Read Logs" section under Admin | Maintenance | Logs | Log Settings.

If you have a lot of cleanup to do, you may need to increase the # of users to process at a time & run it manually a bunch of times.  There can easily be millions of rows of data to clean.
A question worth asking is born in experience & driven by necessity. - Fripp

cookdandbombd

Thank you so much!  That's phenomenal, I'll get cracking on that now. 

While I'm here, thanks also for the brilliant info you've given about blocking bots in other threads, it's been an enormous help. 

cookdandbombd

#3
As I forgot to mark this solved (sorry), I will give an update if that's okay!

I'm still having some problems, but thankfully have some help now. So I ran mysqltuner and we checked for missing indexes. 

Quoteskip-name-resolve=ON
    join_buffer_size (> 256.0K, or always use indexes with JOINs)
    key_buffer_size=0
    innodb_redo_log_capacity should be (=32M) if possible, so InnoDB Redo log Capacity equals 25% of buffer pool size.
    innodb_log_buffer_size (> 16M)

is "skip-name-resolve" okay to set on?

We are doubling the buffer size for now.

I will likely convert to InnoDB, I've found a lot of old threads about that on here to use as a reference.

This is the sort of thing I'm getting in my slow queries log at the minute, what does it do please?

# Query_time: 13.839683  Lock_time: 0.000009 Rows_sent: 1  Rows_examined: 59053
# Time: 2025-02-05T23:02:55.413976Z
                                ORDER BY date DESC;

                                WHERE t.id_topic IN(94173,68296,68445,94191,94325,94142,96764,97117,97152,97826,99723,99688,101284,101743,102727,102950,103819,104911,104962,105023,105171,105793,106159,106155,106419,106444,106499,106917,107274,107288,107275,107848,107902,107912,107938,108248,108412,108466,108057,108790,108847,108909,108860,109073,109559,111629,112224,112235,112290)
                                        ON t.id_first_msg = m.id_msg
                INNER JOIN smf_messages AS m
                                        ON t.id_board = b.id_board
                INNER JOIN smf_boards AS b
                                FROM smf_topics AS t
SELECT t.id_first_msg AS id_first_msg , m.poster_time AS date

Seems like maybe it's grabbing a topic listing, but not all are from the same subforum. Any ideas why it would be slow?

Edit: for clarity, the main issues I have are that the forum goes through periods of slowness, and I'll get a white 503 error screen quite a few times during a day.  The hosts say it's in the software chain and nothing to do with them.  The guy helping me has identified the database as being the issue, it seems.

shawnb61

#4
I am not familiar with skip-name-resolve.  I just did a little reading, & based on my limited understanding, I don't think it would either hurt or help...  SMF is just going to connect thru your webserver - i.e., only one host for your db server to lookup. 

I do have a utility that will help confirm if your DB is complete from an SMF standpoint:
https://raw.githubusercontent.com/sbulen/sjrbTools/refs/heads/master/smf_db_compare.php

Just download & run the utility from your forum root.  In general, things highlighted in green are adds (usually mods, not a problem), things highlighted in yellow are changes (usually things like collation changes, & not a problem), things highlighted in red are removals (and are almost always a  problem...).  If you see items in red in the list of indexes, that probably requires attention.

The first two questions that come to mind are:
 - Mods???  I don't recognize that particular query at all...  A mod list might help.
 - Bots...  Can you correlate problem times with high levels of bot activity?

Also, the basics:  MySQL version, PHP version.  Some stats on forum size might help: total messages, members, topics.

Although I don't recognize that query, I don't think it should be expensive.  It looks similar to our alert fetch query, and it's working from a small, finite set of topics.  However, when your forum is swamped, even simple queries can take an unusually long time to execute... 

OTOH...  It makes no sense to me that that query looks at 59k rows....

I've also never seen the query upside-down like that... 
A question worth asking is born in experience & driven by necessity. - Fripp

cookdandbombd

Quote from: shawnb61 on February 05, 2025, 09:21:14 PMI've also never seen the query upside-down like that... 

Lmao, I've never been much good at Linux, I was using tac. 

Sorry for not providing more info, I'll do that now while I chase up your other recommendations.

The forum has been going for 21 years and was converted over from UBB and other forum software at some point.  I'm getting older and have learning problems that make this all a bigger struggle for me nowadays, so I really appreciate your patience and assistance.

QuoteTotal Posts: 5,829,196
Total Topics: 110,905
Online today: 758
Members: 18,265
Online: 586 Guests, 91 Users (0 Buddies, 5 Spiders)

I used shawn's posts to block about 700 bots before Xmas.  The guests seem quite high to me, but I'm not sure if bots are the problem are not. 

The forum is up-to-date at version 2.1.4.

The mods I'm using are: SMF Gallery Lite, Custom Greeting, Curve2 Color Changer, Remixed Breadcrumbs, Quick Spoiler, TinyPortal, Simple Audio Video Embedder, Avatars Display Integration (I have a feeling this one has caused me slowdown before, but haven't noticed much difference when I uninstalled it a while back), Tagging System (with bits of it disabled as it seems like it can be a real drain. I have over 250k tags now), Stop Forum Spam, SMF Center Banner, Forum Width Setting, FA Board Icons.

MySql version:  8.0.41-0ubuntu0.20.04.1

Php Version: PHP 7.4.3-4ubuntu2.28

cookdandbombd

I see a lot of "InnoDB migration requests" when I tun MySQLTuner, and chatGPT says this:

Quote📌 MySQLTuner InnoDB Migration Request
If MySQLTuner is recommending an InnoDB migration, it usually means that you have MyISAM tables, and it suggests converting them to InnoDB for better performance, reliability, and ACID compliance.

Here's the red results from your tool.

shawnb61

2 new requests... 

First, can you run an explain on that query & share the results?  E.g., run this guy:
EXPLAIN SELECT t.id_first_msg AS id_first_msg , m.poster_time AS date
    FROM smf_topics AS t
    INNER JOIN smf_boards AS b ON t.id_board = b.id_board
    INNER JOIN smf_messages AS m ON t.id_first_msg = m.id_msg
WHERE t.id_topic IN(94173,68296,68445,94191,94325,94142,96764,97117,97152,97826,99723,99688,101284,101743,102727,102950,103819,104911,104962,105023,105171,105793,106159,106155,106419,106444,106499,106917,107274,107288,107275,107848,107902,107912,107938,108248,108412,108466,108057,108790,108847,108909,108860,109073,109559,111629,112224,112235,112290)
ORDER BY date DESC;

Second, can you download a current, updated version of my smf_db_compare.php utility & rerun it?  I tweaked it...
- Two of those 4 items are clearly in error & pointed to a bug in my utility (the member_groups errors).  ::)
- One of those items, I am unsure; let's see if it's flagged this time with the new version (the real_name idx). 
- One of those items, sorry, I jumped the gun & added an item based on GitHub activity when I thought 2.1.5 was imminent (id_board)...  That one should remain upon rerun.

A question worth asking is born in experience & driven by necessity. - Fripp

cookdandbombd

Quote from: shawnb61 on February 06, 2025, 08:16:16 PM2 new requests... 

I'm sorry about the formatting here, any attempts to fix it only made it worse and harder to read:

+----+-------------+-------+------------+--------+------------------------------                                                                                -----------------------+---------+---------+----------------------+------+------                                                                                ----+--------------------------------------------------------+
| id | select_type | table | partitions | type   | possible_keys                                                                                                                       | key     | key_len | ref                  | rows | filte                                                                                red | Extra                                                  |
+----+-------------+-------+------------+--------+------------------------------                                                                                -----------------------+---------+---------+----------------------+------+------                                                                                ----+--------------------------------------------------------+
|  1 | SIMPLE      | t     | NULL       | range  | PRIMARY,firstMessage,last_mes                                                                                sage_sticky,board_news | PRIMARY | 3       | NULL                 |   49 |   100                                                                                .00 | Using index condition; Using temporary; Using filesort |
|  1 | SIMPLE      | b     | NULL       | eq_ref | PRIMARY                                                                                                                             | PRIMARY | 2       | cookd.t.id_board     |    1 |   100                                                                                .00 | Using index                                            |
|  1 | SIMPLE      | m     | NULL       | eq_ref | PRIMARY                                                                                                                             | PRIMARY | 4       | cookd.t.id_first_msg |    1 |   100                                                                                .00 | NULL                                                   |
+----+-------------+-------+------------+--------+------------------------------                                                                                -----------------------+---------+---------+----------------------+------+------                                                                                ----+--------------------------------------------------------+
3 rows in set, 1 warning (0.00 sec)


Included the id_board error again in the screenshots for completeness, I do note you say it's not relevant.



I will add that I see a great increase in stability already having increased the buffer size! We must have got linked somewhere yesterday and I can't find out where from the referral logs, for whateever reason, but the load balancing kicked in, and the site stayed up throughout all the increased traffic.

cookdandbombd

I figured I might have copied in the wrong query seeing as I was displaying things upside down! :-)  So I just looked through the slow query log but that data seems to be gone now.

I found two similar queries which were also looking at a ton of rows - I can do an EXPLAIN on those too if it's confirmed that it's safe to do so:

# Time: 2025-02-05T10:09:11.073780Z
# Query_time: 7.720256  Lock_time: 0.000009 Rows_sent: 1  Rows_examined: 39261
SET timestamp=1738750143;
SELECT COUNT(*)
FROM smf_messages AS m
INNER JOIN smf_topics AS t ON (t.id_topic = m.id_topic)
WHERE
EXISTS (
SELECT bpv.id_board
FROM smf_board_permissions_view AS bpv
WHERE bpv.id_group IN (0,124)
AND bpv.deny = 0
AND bpv.id_board = m.id_board
) AND m.id_member = 10253
AND m.approved = 1
AND t.approved = 1;


# Time: 2025-02-05T04:59:27.001434Z
# Query_time: 7.043998  Lock_time: 0.000005 Rows_sent: 1  Rows_examined: 36488
SET timestamp=1738731559;
SELECT COUNT(*)
FROM smf_messages AS m
INNER JOIN smf_topics AS t ON (t.id_topic = m.id_topic)
WHERE
EXISTS (
SELECT bpv.id_board
FROM smf_board_permissions_view AS bpv
WHERE bpv.id_group IN (118,124)
AND bpv.deny = 0
AND bpv.id_board = m.id_board
) AND m.id_member = 5356
AND m.approved = 1
AND t.approved = 1;

shawnb61

Yeah, that initial report of 59K made no sense at all.  49 records is correct. 

You will need to find a way to add that missing index.  That's an issue.  But it's not the issue with these queries.

Yes, you should be able to run an EXPLAIN on those as well.  Just add the word EXPLAIN at the beginning.  But those new queries really aren't doing a lot of work, certainly not enough work to warrant 7+ seconds of execution time. 

Run the explain a few times to see if the results are consistent.

A question worth asking is born in experience & driven by necessity. - Fripp

shawnb61

To add that index, it will look something like:

ALTER TABLE smf_members ADD INDEX idx_active_real_name (is_activated, real_name);
I'm not at my computer to test, but it will be something like that.  Rerun the db compare & see if that error goes away.  I think this will help... a little...


A question worth asking is born in experience & driven by necessity. - Fripp

shawnb61

Also...  How many membergroups do you have?
select count(*) from smf_membergroups;
A question worth asking is born in experience & driven by necessity. - Fripp

cookdandbombd

Quote from: shawnb61 on February 07, 2025, 03:04:00 PMAlso...  How many membergroups do you have?
select count(*) from smf_membergroups;

11

Have added the index now as per your instructions, thank you Shawn. 



Quote+----+--------------+-------------+------------+--------+-------------------------------------------------------------------------------------------+-----------+---------+----------------------------+------+----------+--------------------------+
| id | select_type  | table      | partitions | type  | possible_keys                                                                            | key      | key_len | ref                        | rows | filtered | Extra                    |
+----+--------------+-------------+------------+--------+-------------------------------------------------------------------------------------------+-----------+---------+----------------------------+------+----------+--------------------------+
|  1 | SIMPLE      | <subquery2> | NULL      | ALL    | NULL                                                                                      | NULL      | NULL    | NULL                      | NULL |  100.00 | NULL                    |
|  1 | SIMPLE      | m          | NULL      | ref    | ID_MEMBER,idx_id_board,participation,showPosts,id_member_msg,current_topic,idx_related_ip | showPosts | 5      | const,<subquery2>.id_board |  195 |    10.00 | Using where              |
|  1 | SIMPLE      | t          | NULL      | eq_ref | PRIMARY,approved                                                                          | PRIMARY  | 3      | cookd.m.id_topic          |    1 |  100.00 | Using where              |
|  2 | MATERIALIZED | bpv        | NULL      | range  | PRIMARY                                                                                  | PRIMARY  | 2      | NULL                      |  38 |    10.00 | Using where; Using index |
+----+--------------+-------------+------------+--------+-------------------------------------------------------------------------------------------+-----------+---------+----------------------------+------+----------+--------------------------+
4 rows in set, 2 warnings (0.00 sec)

Consistent after running it a bunch of times.  Same with this one:

Quotemysql> EXPLAIN SELECT COUNT(*) FROM smf_messages AS m INNER JOIN smf_topics AS t ON (t.id_topic = m.id_topic) WHERE EXISTS (SELECT bpv.id_board FROM smf_board_permissions_view AS bpv WHERE bpv.id_group IN (0,124) AND bpv.deny = 0 AND bpv.id_board = m.id_board) AND m.id_member = 10253 AND m.approved = 1 AND t.approved = 1;
+----+--------------+-------------+------------+--------+-------------------------------------------------------------------------------------------+-----------+---------+----------------------------+------+----------+--------------------------+
| id | select_type  | table      | partitions | type  | possible_keys                                                                            | key      | key_len | ref                        | rows | filtered | Extra                    |
+----+--------------+-------------+------------+--------+-------------------------------------------------------------------------------------------+-----------+---------+----------------------------+------+----------+--------------------------+
|  1 | SIMPLE      | <subquery2> | NULL      | ALL    | NULL                                                                                      | NULL      | NULL    | NULL                      | NULL |  100.00 | NULL                    |
|  1 | SIMPLE      | m          | NULL      | ref    | ID_MEMBER,idx_id_board,participation,showPosts,id_member_msg,current_topic,idx_related_ip | showPosts | 5      | const,<subquery2>.id_board |  195 |    10.00 | Using where              |
|  1 | SIMPLE      | t          | NULL      | eq_ref | PRIMARY,approved                                                                          | PRIMARY  | 3      | cookd.m.id_topic          |    1 |  100.00 | Using where              |
|  2 | MATERIALIZED | bpv        | NULL      | range  | PRIMARY                                                                                  | PRIMARY  | 2      | NULL                      |  33 |    10.00 | Using where; Using index |
+----+--------------+-------------+------------+--------+-------------------------------------------------------------------------------------------+-----------+---------+----------------------------+------+----------+--------------------------+
4 rows in set, 2 warnings (0.00 sec)


cookdandbombd

I found a few more examples. in the hope it will help us zero in on what's going on. Actually thought I had cracked it there when I got a response back saying smf_log_topics_unread doesn't exist, but from searching, I see that it's a temporary table and so that would explain it. 

# Time: 2025-02-08T00:29:11.316839Z
# Query_time: 8.531309  Lock_time: 0.000004 Rows_sent: 1  Rows_examined: 83966
SET timestamp=1738974542;
EXPLAIN FORMAT=JSON  SELECT COUNT(*)
                        FROM smf_messages AS m
                                INNER JOIN smf_topics AS t ON (t.id_topic = m.id_topic)
                        WHERE
                        EXISTS (
                                SELECT bpv.id_board
                                FROM smf_board_permissions_view AS bpv
                                WHERE bpv.id_group IN (116,124)
                                        AND bpv.deny = 0
                                        AND bpv.id_board = m.id_board
                        ) AND m.id_member = 192
                                AND m.approved = 1
                                AND t.approved = 1;

# Time: 2025-02-08T01:23:48.542814Z
# Query_time: 61.365994  Lock_time: 0.000014 Rows_sent: 40  Rows_examined: 683557
SET timestamp=1738977767;

EXPLAIN SELECT
                ms.subject AS first_subject, ms.poster_time AS first_poster_time, ms.id_topic, t.id_board, b.name AS bname,
                t.num_replies, t.num_views, ms.id_member AS id_first_member, ml.id_member AS id_last_member, meml.avatar, meml.email_address, mems.avatar AS first_poster_av>
                ml.poster_time AS last_poster_time, COALESCE(mems.real_name, ms.poster_name) AS first_poster_name,
                COALESCE(meml.real_name, ml.poster_name) AS last_poster_name, ml.subject AS last_subject,
                ml.icon AS last_icon, ms.icon AS first_icon, t.id_poll, t.is_sticky, t.locked, ml.modified_time AS last_modified_time,
                COALESCE(lt.id_msg, lmr.id_msg, -1) + 1 AS new_from, SUBSTRING(ml.body, 1, 385) AS last_body,
                SUBSTRING(ms.body, 1, 385) AS first_body, ml.smileys_enabled AS last_smileys, ms.smileys_enabled AS first_smileys, t.id_first_msg, t.id_last_msg
                        FROM smf_messages AS ms
                                INNER JOIN smf_topics AS t ON (t.id_topic = ms.id_topic AND t.id_first_msg = ms.id_msg)
                                INNER JOIN smf_messages AS ml ON (ml.id_msg = t.id_last_msg)
                                LEFT JOIN smf_boards AS b ON (b.id_board = ms.id_board)
                                LEFT JOIN smf_members AS mems ON (mems.id_member = ms.id_member)
                                LEFT JOIN smf_members AS meml ON (meml.id_member = ml.id_member)
                                LEFT JOIN smf_attachments AS af ON (af.id_member = mems.id_member)
                                LEFT JOIN smf_attachments AS al ON (al.id_member = meml.id_member)
                                LEFT JOIN smf_log_topics_unread AS lt ON (lt.id_topic = t.id_topic)
                                LEFT JOIN smf_log_mark_read AS lmr ON (lmr.id_board = t.id_board AND lmr.id_member = 12182)
                        WHERE b.id_board IN (1, 2, 4, 5, 6, 8, 11, 14, 16, 18, 19, 20, 21, 25, 26, 30, 33, 35, 29)
                                AND t.id_last_msg >= 359
                                AND COALESCE(lt.id_msg, lmr.id_msg, 0) < t.id_last_msg
                                AND ms.approved = 1
                                AND COALESCE(lt.unwatched, 0) != 1
                        ORDER BY t.id_last_msg DESC
                        LIMIT 280, 40;

shawnb61

#15
If you re-run any of these queries, are they consistently slow, or are they quick sometimes & slow sometimes?

If I read the explains above correctly, those particular executions took 0.0 sec.

If they are inconsistent, I suspect there are two factors here:
- The DB needs more tweaking, as in buffer sizes, etc., for a forum your size
- Your workload varies widely, due to bots...

If a DB is logjammed, even a simple query might take several seconds.  That's not a query thing, that's either DB config or workload causing resource issues. 

I can't help too much on DB config, buffer pool sizes, etc.  You need someone with better DBA skills than I have.  My degree of knowledge is at the "bigger buffers are better" level...

If your site is being saturated by bots, you can take a peek at my robots.txt or .htaccess up here:
https://github.com/sbulen/SMF-bot-hygiene

That might help.

There are two non-standard actions I've taken on SMF that have leveled out my MySQL CPU quite a bit.  Two activities that are extremely lightweight, but when taken in bulk add up, are alert popup checks & session writes.  Tiny queries, but sometimes they took a while, because my system was overloaded. 

On the alerts front, I was sometimes getting 150K/200K alerts popup checks a day.  I cut those in half by changing the alert ping time from 10sec to 20sec, by changing pingTime from 10000 to 20000 here: https://github.com/SimpleMachines/SMF/blob/4a23d610e887510906c3bb01eecb8b4c9e49dd23/Themes/default/scripts/alerts.js#L1.  The downside of this is that end-users may wait another 10 sec for an alert to show up.

On the sessions front, I no longer write sessions when cookies are not detected.  This is one part of this PR:
https://github.com/SimpleMachines/SMF/pull/8394

It was hard to tell how much cutting alert checks in half affected MySQL time, tbh.  I had taken several steps at once.  I kept it at 20 seconds, though, because nobody even noticed, and it drastically reduced the # of SMF actions.

OTOH, I have seen DRAMATIC improvement on my MySQL charts by applying PR #8394.  I do not get massive spikes in MySQL CPU anymore... And I don't see those weird massive spikes in 'who's online' either. 

I have a little modlet for each of these two actions.  If interested let me know.
A question worth asking is born in experience & driven by necessity. - Fripp

cookdandbombd

Quote from: shawnb61 on February 11, 2025, 04:48:44 PMI have a little modlet for each of these two actions.  If interested let me know.

Yes please Shawn!

shawnb61

Quote from: cookdandbombd on February 12, 2025, 03:17:06 AMYes please Shawn!

OK.  Please read the notes on PR#8394 very carefully here: https://github.com/SimpleMachines/SMF/pull/8394

Especially the part about loss of functionality.  Short version: anything dependent on passing PHPSESSID via URL won't work.  This includes some bot activity, & certain actions for guests who have cookies disabled on their browser.

Any feedback you have would be welcome, especially if you can provide CPU charts before & after.  You're basically helping test the PR.

These modlets should install clean on vanilla 2.1.4.  If you have any issues with installation, cancel out & just don't do it.

If/when PR8394 (or a superceding PR) is folded into a patch you will need to uninstall the modlet in order for the patch to install.

You cannot view this attachment.
A question worth asking is born in experience & driven by necessity. - Fripp

cookdandbombd

Thanks again!  It is all working a lot better now since I increased the join buffer and the swap file.  I installed these patches at 9 tonight.


cookdandbombd

@shawnb61 Just an update here Shawn, a day after installing those modlets my CPU and load went absolutely through the roof and stayed there for 8 hours or more. I couldn't see any reason why.  So I uninstalled these modlets and the glossary mod I had just reinstalled, and it eventually went back to normal.

I don't know if it was related or not.  There is some protection on my forum that someone put on for me which stops e.g. things like the settings file being able to be written to unless I disable the protection - I feel like perhaps it was something to do with that and the PHPSESSID modlet, but have no idea really. 

I'm tempted to try again but am happy with the performance now and would prefer not to risk rocking the boat :-) 

Going through my bot blocking config, I realised I hadn't made the detection for the bot agents case-insensitive!  Fixing that has brought my guests down by hundreds and hundreds :-)

I will therefore mark this as solved now, thank you again for your help and patience.

Advertisement: