News:

Bored?  Looking to kill some time?  Want to chat with other SMF users?  Join us in IRC chat or Discord

Main Menu

Un-answeredTopics for SMF 2.0?

Started by Phphelp, August 24, 2013, 08:11:54 PM

Previous topic - Next topic

margarett

#21
So, care to try again? :P

I rewritten completely Unanswered.php (just used the post query and array construction as they pretty much the same) in a way that I understand what's going on there.
I only add 3/4 sql queries, and only one of them significant (the 10 posts per page). In my test forum, impact in page load time is NONE. But, I have little boards and messages to test...

Probably something will be missing :P but it seems to work fine here.
So, please replace your previous files with the ones here. The days to include in your "recent" unread are now in the beggining of the function, don't forget to adapt that

// Topics with no replies...
function UnansweredTopics()
{
//global $txt, $scripturl, $user_info, $context, $modSettings, $sourcedir, $board, $smcFunc;
global $txt, $scripturl, $user_info, $context, $modSettings, $sourcedir, $smcFunc;
loadTemplate('Unanswered');
$context['page_title'] = $txt['unanswered_topics'];

//What's the limit for showing posts?
$daysToGet = 100; //This parameter should come from ACP. Maybe later :P
$timeLimit = time() - ($daysToGet * 24 * 60 * 60);


It's now over 3AM, so I'm going to bed :P Let me know the outcome ;) AND BACKUP!
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Phphelp

Thanks Margarett, It works perfectly..

Go test it out on my forum and answer a few unanswered topics  :P

/cheers!


margarett

You're welcome.

Can you check if there is some performance hit? Activate the option to show page load time and compare average page times with this one, please.

There are some things yet to improve. One I noticed is that if you require an invalid category or board you are presented with a database error. But it's pretty much irrelevant for now :P
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Phphelp

@margarett, I turned on the page load times, I really don't know what good or bad times are with SMF.

Take a look and let me know when you're done testing..

margarett

Yup, unfortunately it really has a performance hit :(

This is your page loading time for the index page:
QuotePage created in 0.11 seconds with 10 queries.

And this is unanswered page as a guest:
QuotePage created in 1.136 seconds with 10 queries.

ONE SECOND MORE is really a LOT. I find it a bit strange as you don't have that many boards and most the "foreach" cycles are just for 1-3 variables.

Oh well... :-\
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Phphelp

I made it a lot faster...


Changed:

//Here we go. Get me some IDs, please :)
$request = $smcFunc['db_query']('', '
SELECT COUNT(m.id_msg) as total
FROM {db_prefix}messages as m
INNER JOIN {db_prefix}boards AS b ON (b.id_board = m.id_board)
INNER JOIN {db_prefix}topics AS t ON (t.id_topic = m.id_topic)
WHERE b.id_board IN ({array_int:list_boards})
AND m.approved = 1
AND t.num_replies = 0
AND m.poster_time > {int:time_limit}',
array(
'list_boards' => $boards_to_access,
'time_limit' => $timeLimit,
)
);


TO:

//Here we go. Get me some IDs, please :)
$request = $smcFunc['db_query']('', '
SELECT COUNT(m.id_msg) as total
FROM {db_prefix}messages as m
INNER JOIN {db_prefix}topics AS t ON (t.id_topic = m.id_topic)
WHERE m.id_board IN ({array_int:list_boards})
AND m.approved = 1
AND t.num_replies = 0
AND m.poster_time > {int:time_limit}',
array(
'list_boards' => $boards_to_access,
'time_limit' => $timeLimit,
)
);



I didn't see the need to join the Board table, since it doesn't pull any columns back from it. So I just swapped the where clause to the message table to have the same effect. Now it's 3x or so faster.


margarett

#27
And you would be very right :) I have no idea why I made it like that but I think thank makes real sense.
Now who would tell a COUNT query could take that long?

Even so, if it still takes around 0,4s it's still relevant (or not, just checked again and your index took 0,5s, but immediately after took just 0,15s so there is some variation)

edit: and that comment before the query is completely out of any reason :P
edit2: updated the post above with your correction ;)
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Arantor

If you really want to examine it, turn on full DB logging (add $db_show_debug = true; to Settings.php), grab the query actually being used and put an EXPLAIN in front of it to find out what indexes it's using.

Mind you, were I doing this, I'd be doing it very differently by turning it around. Something like:

//Here we go. Get me some IDs, please :)
$request = $smcFunc['db_query']('', '
SELECT COUNT(t.id_topic) as total
FROM {db_prefix}topics as t
INNER JOIN {db_prefix}messages AS m ON (t.id_first_msg = m.id_msg)
WHERE t.id_board IN ({array_int:list_boards})
AND t.approved = 1
AND t.num_replies = 0
AND m.poster_time > {int:time_limit}',
array(
'list_boards' => $boards_to_access,
'time_limit' => $timeLimit,
)
);


You're dealing with topics not replied to, which means you're looking for a topic that's been approved and has no replies. (This may screw up if you're dealing with topics that have pending approved posts but no other replies. That's an edge case and from the described site it shouldn't make a difference.)

By going off the topics table, it should be faster since it's not hitting up every message and then matching the topics, it's going off the topics and only matching to the (much larger, much slower) messages table once you're at that point.

margarett

Actually I was hoping you would step in :P That's the kind of expertise this topic was missing.
@Phphelp, care to try?
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Phphelp

I changed it over to what @Arantor suggested.

It seems about the same as before, maybe a bit faster. It's live so you all can take a look.

Arantor

Or you could even enable the debugging stuff I mentioned so you could see how long the query takes on its own, or whether it's not really the problem, or even if the time is repeatable on a regular basis - and then also see the query parsing information to see if it can be improved.

margarett

In my test forum is pretty much irrelevant, I think... I should get myself a big forum :P
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Phphelp

@arantor I didn't ignore the part where you suggested that, I just have limited access on what I can do from my workplace.

I was able to get to a QA testing machine at lunch and apply this index.

KEY `unsanswered_topics` (`num_replies`,`id_board`,`approved`,`id_first_msg`)

Which seemed to make a marked improvement.

Arantor

Interesting, because it shouldn't make a difference at all, if anything it should make it slower in the long run because of how MySQL applies indexes in the query parser.

How do you know if it's faster if you have absolutely no way to measure it reliably?

Still, good luck with this.

Phphelp

@Arantor  Just a sampling average.

The best way to improve the speed would be to duplicate the posters time on the topic's table. But I'm really ok with a half second or less for speed for this option. I use it multiple times a day every day.

@margarett Thanks so much for coding this, you rock!

Arantor

Sampling average is no good to you in this situation, that's the point I was trying to make. There are many other factors at work in this situation, like queries that may or may not be running at the same time. I can see situations where you might have only 3 or 4 other queries on the page and some where you might have 10-12 due to other stuff going on, so using the overall time to judge whether it's any better is absolutely useless to you, which is why I pointed out the proper debug stuff which includes profiling.

Not being funny but this is basic optimisation stuff: isolate all factors that are irrelevant to be able to examine the thing you're trying to improve.

Phphelp

@arantor I totally understand what you're saying.

@Margarett  I found another query that was overly complex, so I simplified and removed two table joins. I'm sue @Arantor will improve on it further.

Now my sampling times are trending lower then before.

Change:

//FINALLY!!! Let's get ourselves some posts, shall we? :)
$request = $smcFunc['db_query']('', '
SELECT
m.id_msg, m.subject, m.smileys_enabled, m.poster_time, m.body, m.id_topic, t.id_board, b.id_cat,
b.name AS bname, c.name AS cname, t.num_replies, m.id_member, m2.id_member AS id_first_member,
IFNULL(mem2.real_name, m2.poster_name) AS first_poster_name, t.id_first_msg,
IFNULL(mem.real_name, m.poster_name) AS poster_name, t.id_last_msg
FROM {db_prefix}messages AS m
INNER JOIN {db_prefix}topics AS t ON (t.id_topic = m.id_topic)
INNER JOIN {db_prefix}boards AS b ON (b.id_board = t.id_board)
INNER JOIN {db_prefix}categories AS c ON (c.id_cat = b.id_cat)
INNER JOIN {db_prefix}messages AS m2 ON (m2.id_msg = t.id_first_msg)
LEFT JOIN {db_prefix}members AS mem ON (mem.id_member = m.id_member)
LEFT JOIN {db_prefix}members AS mem2 ON (mem2.id_member = m2.id_member)
WHERE b.id_board IN ({array_int:list_boards})
AND m.approved = 1
AND t.num_replies = 0
AND m.poster_time > {int:time_limit}
ORDER BY m.id_msg DESC
LIMIT {int:offset}, {int:limit}',

array(
'list_boards' => $boards_to_access,
'time_limit' => $timeLimit,
'offset' => $_REQUEST['start'],
'limit' => 10,
)
);


TO:

//FINALLY!!! Let's get ourselves some posts, shall we? :)
$request = $smcFunc['db_query']('', '
SELECT
m.id_msg, m.subject, m.smileys_enabled, m.poster_time, m.body, m.id_topic, t.id_board, b.id_cat,
b.name AS bname, c.name AS cname, t.num_replies, m.id_member, m.id_member AS id_first_member,
IFNULL(mem.real_name, m.poster_name) AS first_poster_name, t.id_first_msg,
IFNULL(mem.real_name, m.poster_name) AS poster_name, t.id_last_msg
FROM {db_prefix}messages AS m
INNER JOIN {db_prefix}topics AS t ON (t.id_topic = m.id_topic)
INNER JOIN {db_prefix}boards AS b ON (b.id_board = t.id_board)
INNER JOIN {db_prefix}categories AS c ON (c.id_cat = b.id_cat)
LEFT JOIN {db_prefix}members AS mem ON (mem.id_member = m.id_member)
WHERE b.id_board IN ({array_int:list_boards})
    AND m.id_msg = t.id_first_msg
AND m.approved = 1
AND t.num_replies = 0
AND m.poster_time > {int:time_limit}
ORDER BY m.id_msg DESC
LIMIT {int:offset}, {int:limit}',

array(
'list_boards' => $boards_to_access,
'time_limit' => $timeLimit,
'offset' => $_REQUEST['start'],
'limit' => 10,
)
);

Arantor

What would be the point of trying to improve upon it? You have no idea whether you're actually improving it or not. Seems to me that any further input on my part would largely be useless.

Especially as there are now two pieces of advice of mine that you're not implementing.

margarett

Actually that query was completely copied from the original "unread" function :P I just added the conditions to the "WHERE ... AND" clause

@Arantor, your advice is of course valuable. It's actually my fault: I'm the "developer" :P in this case and I have no real conditions to try it...
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Advertisement: