News:

Bored?  Looking to kill some time?  Want to chat with other SMF users?  Join us in IRC chat or Discord

Main Menu

Board Stats Possible?

Started by bobdole2281, August 24, 2013, 05:46:01 AM

Previous topic - Next topic

bobdole2281

Is it possible to get the stats from one of my boards specifically? Does smf have a function for that? And if so, where would I look to find it? Thanks in advance :)

Example:
Find the threads/posts in only General Discussion. (Or even a roundabout way to do this?)(More specifically, find posts/threads this month only)

Arantor

No function for that, you'll have to write your own query for it.

bobdole2281

Hmm, okay. How should I start. I need to get posts from all boards. => Then xplode it into categories (Board) => Then xplode it into month.

Any kind of functions for getting posts from all categories? I'm sorry for being so nosey. Where could I find the functions on this site?

Arantor

Make your mind up what you want... first you want stats for an individual board, then you want stats for everything subdivided by board... these are two very, very different things.

bobdole2281

I'm not sure what the difference is. I need a starting point to a program that can get stats forbposts in gernal discussion.

Specifically whoever posted the month each month.

Arantor

Wait... you don't understand the difference between getting stats for one board, versus getting the stats for all the boards at once?

Sounds to me like you don't have any real idea what you *do* want.

bobdole2281

#6
Quote from: Arantor on August 25, 2013, 11:32:35 AM
Wait... you don't understand the difference between getting stats for one board, versus getting the stats for all the boards at once?

Sounds to me like you don't have any real idea what you *do* want.

I just said I need stats for general discussion.


Do you think any of these are a good starting point:

$context['common_stats']

$context['common_stats']['latest_member']['link']

$context['common_stats']['total_topics']

$context['common_stats']['total_members']

$txt['posts_made']


bobdole2281

What about my own sql query. What about an inner join of  `smf_topics` and members? Would that be in the right direction?

Arantor

It would be helpful to know what stats you actually want. I'm absolutely certain you have no idea what those variables contain (and didn't look) because you'd know that $txt['posts_made'] is totally unsuitable.

So we're back to you not really knowing what you want and/or haven't thought it through (which means I don't want to waste time writing queries to help you because I'm damned if I'm doing something multiple times over)

We've established you want the stats for a board (and only one board). It would help to know what board id that has (since you *never* want to query the database looking for a board by its name for performance reasons), but more importantly what stats you actually want so that the proper query can be written.

A query, in case it wasn't abundantly clear, is essentially a question asked of the data. Asking the wrong question will get you the wrong answer.

bobdole2281

I know exactly what I want. It is all very much planned out and has been for weeks. I just have no clue how to describe it to your liking. Here is exactly what I am looking for:

Category 1 => General Discussion, Comic book movie news, & Memes
Category 2
Category 3

I want to find out the top poster of General Disucssion. I also want to find out the top poster in Comic book movie news. And also, in memes. Then as a 4th I want something a little different. I want who created the largest thread (in any board). Also, I need to exclude admins (or atleast just me) or the whole thing is pointless.

So I think I have to make a custom SQL query. Based on your replies to the variables I posted and the lack of luck of finding anything after browsing for hours.

Arantor

QuoteI just have no clue how to describe it to your liking. Here is exactly what I am looking for:

You start by assuming that I know nothing about what you're trying to do and explain it in terms that a child could understand. (This forces you to think through every aspect of it and omit nothing in explanation.)

QuoteI want to find out the top poster of General Disucssion.

So, for the board of General Discussion (which is which board id?), you want to know the most prevalent poster. Most prevalent of all time? Most prevalent of the past x days, weeks, months? Excluding admins? Excluding admins and moderators?

QuoteI also want to find out the top poster in Comic book movie news. And also, in memes.

And the same questions to all of the above.

QuoteThen as a 4th I want something a little different. I want who created the largest thread (in any board).

Even including boards that the user viewing the page might not be able to see? Hypothetical argument: in your private admin board (presuming you have one), there's a topic that covers some stuff and it ends up being the longest thread on the site. Unlikely but it does happen. Should this thread be excluded from calculations?

Do you want literally just the user who created it or a link to the thread too?

QuoteAlso, I need to exclude admins (or atleast just me) or the whole thing is pointless.

There's a huge difference, excluding all admins or just excluding you?

QuoteSo I think I have to make a custom SQL query. Based on your replies to the variables I posted and the lack of luck of finding anything after browsing for hours.

You'll need one query for each thing you want, by the looks of things, but that will depend on going through and answering every question I've asked. I'm not being a deliberate pain here - all the questions I'm asking do actually make a huge difference to the queries that need to be written and then the code that works with the data once queried (since I have the feeling I'll need to write that for you too)

Also note that what you're asking for in that last post is different to what you asked for originally - and had I written the code you originally asked for, you'd only be back saying it wasn't what you wanted, which would mean wasted time for both of us.

bobdole2281

The board IDs are:

General Discussion: 1
Comic Movie News: 3
Memes: 22

I'm not even using an admin board, so that's not a problem. Let's not worry about excluding that for now.  Also, I want to exclude admins if possible. Plan B, is just to exclude me

Arantor

I'm not being funny but I asked a lot more questions than you've answered there.

I'm sorry, I just don't have the patience for this. I don't like having to ask multiple times for the same information when it's relevant (because if it wasn't relevant I wouldn't be asking)

bobdole2281

You're right there were a lot more. My bad.

QuoteSo, for the board of General Discussion (which is which board id?), you want to know the most prevalent poster. Most prevalent of all time? Most prevalent of the past x days, weeks, months? Excluding admins? Excluding admins and moderators?
Most prevalent each month. Exluding admins. Not Moderators

QuoteDo you want literally just the user who created it or a link to the thread too?
The Link would be great, but is not neccesary

QuoteThere's a huge difference, excluding all admins or just excluding you?
All  admins

You don't have to write anything for me and I am not trying to be ignorant just to tick you off. I am just trying to find information on how to do something. It appears your the only one around here who ever answers my questions though.

I realize that their will be a different query for each section. I have been browsing my database for days and can't find out how to grab both BoardID and user.

Arantor

QuoteMost prevalent each month

What month? Current month? Previous month?

QuoteIt appears your the only one around here who ever answers my questions though.

That's because there are extremely few capable programmers around here any more :(

QuoteI have been browsing my database for days and can't find out how to grab both BoardID and user.

That's where queries come in. But I guarantee it will not be a fast query to fetch this, going through all the posts is a slow process.

bobdole2281

Quote from: Arantor on August 28, 2013, 04:19:59 PM
What month? Current month? Previous month?

Current Month. I'm going to do this every month.

QuoteThat's where queries come in. But I guarantee it will not be a fast query to fetch this, going through all the posts is a slow process.

And yes, that's what I was afraid of. I probably won't run this onload of the page then. I'm getting more and more familiar with queries every day. I need to figure out some queries I can start on and grow until I have it.

Arantor

Even when you're on the first of a month and thus only have that day's stats to consider? Or did you mean the last 30 days' worth of stats?

Both are possible, but require quite different queries.

As for 'getting started', SMF runs many many different queries to perform its tasks. In the scheme of SQL, they're very straightforward and simplistic though. You really don't need to master SQL fully to be able to use it competently. (The longest query I've ever written for MySQL/SMF was 23 lines. The longest query I've ever seen is a stored procedure a colleague of mine wrote which extended to almost 2000 lines of code.)

bobdole2281

Quote from: Arantor on August 29, 2013, 04:24:18 PM
Even when you're on the first of a month and thus only have that day's stats to consider? Or did you mean the last 30 days' worth of stats?

Both are possible, but require quite different queries.

As for 'getting started', SMF runs many many different queries to perform its tasks. In the scheme of SQL, they're very straightforward and simplistic though. You really don't need to master SQL fully to be able to use it competently. (The longest query I've ever written for MySQL/SMF was 23 lines. The longest query I've ever seen is a stored procedure a colleague of mine wrote which extended to almost 2000 lines of code.)

Wow, that would be an intense query. I want it to consider each month,Ex: June, July, August. Not the last 30 days. I will be making some other code that limits this stuff during the first of the month already. Although I guess, it's possible to include it here as well if it's not too complicated.

Arantor

It's all doable but it's extremely difficult when 1) not all the consequences have been talked about and 2) the requirements feel like they're changing every time I ask more deep questions.

bobdole2281

The requirements aren't changing, I promise you that. I am doing a poor job of communicating is all. I'll try to be more clear because I understand every little line matters.


Although I am flexible on a few parts, but that essentially means whichever is easiest. I appreciate you helping me narrow down the code like this. You've been extremely helpful.

bobdole2281

I'm working on a query now. I'm thinking using   `smf_boards` and `smf_members` . I feel like I'd need one more though. Any ideas?

butch2k

I'm a bit lazy and most of all tired, so there is probably a more efficient way to do it but here is something to start with:

SELECT id_board, GROUP_CONCAT(distinct concat('member ',c.member,': ',c.cnt,' posts') order by c.cnt desc)
FROM smf_messages
JOIN ( SELECT id_board as board, id_member as member, count(id_msg) as cnt
  FROM smf_messages
  WHERE
    poster_time >= UNIX_TIMESTAMP(DATE_SUB(CURDATE(), INTERVAL DAYOFMONTH(CURDATE())-1 DAY))
  GROUP BY id_board, id_member
) c
ON c.board = id_board
GROUP BY id_board
ORDER BY id_board desc


of course the concat is for debug purpose, alter it for production use. This query (if i did not make any mistake...) should give you by boards the posters of the current month ordered by the number of posts.


Arantor

And it won't work inside SMF due to subselects being disabled... ;)

butch2k

Quote from: Arantor on August 31, 2013, 04:41:06 PM
And it won't work inside SMF due to subselects being disabled... ;)
$modSettings['disableQueryCheck'] ...
Yes i know it's dirty...  :-X

The restriction on subselect is really painful when it comes to really complex queries.  :-[

bobdole2281

Awesome. Thanks for the replies man, I really appreciate it.

Here is what I got:

$modSettings['disableQueryCheck'];

       
// All my login info here

$conM = mysqli_connect($hostnameM, $usernameM, $passwordM, $dbnameM) or DIE ('Unable to connect to database');

$SQL = "

SELECT id_board, GROUP_CONCAT(distinct concat('member ',c.member,': ',c.cnt,' posts') order by c.cnt desc)
FROM smf_messages
JOIN ( SELECT id_board as board, id_member as member, count(id_msg) as cnt
  FROM smf_messages
  WHERE
    poster_time >= UNIX_TIMESTAMP(DATE_SUB(CURDATE(), INTERVAL DAYOFMONTH(CURDATE())-1 DAY))
  GROUP BY id_board, id_member
) c
ON c.board = id_board
GROUP BY id_board
ORDER BY id_board desc";

/************************************************
Select all members and ID's
************************************************/
$resultM = mysqli_query($conM, $SQL);

while ($row = mysqli_fetch_array($resultM))
{
echo $row['id_member'];
echo $row['id_board'];
}

mysqli_close($conM);


What do I need to echo here, all I'm getting is: 2322164321

margarett

You are echoing the user and board ids with nothing between then. You need at least a line break or a space between them.
And now that you have the ids, what do you want to do with them? :P
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

bobdole2281

I get that I am echoing them with nothing in between them. The echo member returns blank.

The echo board is returning a board number. I'm just confused at what butch2k was intending me to echo. Do I even need the board ID's? Perhaps they tie in somewhere?

I'm looking for the member with the most posts in each group.

butch2k

It's the GROUP_CONCAT(distinct concat('member ',c.member,': ',c.cnt,' posts') order by c.cnt desc) which maters, it list by board the posters ordered by number of posts.

bobdole2281

Okay, wow. Thanks for the help :)! What should I put to display that information?

The "by board the posters ordered by number of posts", is it in a separate Select statement?

butch2k

first of all run the query i posted then modify the concat for your needs so that you can use explode for instance to get everything in an array. The concat i posted is for readability.

bobdole2281

Thanks for the replies butch! Alright, I'll run that as soon as I can.

bobdole2281

#31
This worked for me:

Select *
FROM  smf_messages
WHERE id_board = 3
And poster_time > 1378015261
And poster_name <> 'bobdole'
And poster_name <> 'hawkeye'
And poster_name <> 'DeanDoomIII'
And poster_name <> 'Captain Banhammer'


Those last posters are the 4 admins

1378015261 = September 1st in unixtimestamp

Arantor

So you collect every aspect of every message in the table. That query is going to absolutely hammer the database every time it's used. And I mean HAMMER the database.

bobdole2281

Good idea, I should just grab id_board, poster_time, and poster_name then correct?

Or can you think of any others I need?

margarett

I think that, if you just need the user with most posts in a certain board, you should try to get a query that extracts as little information as possible so that you have your results.

A "SELECT *" query will gather a LOT of information from the database, not to mention the HUGE amount of memory it will occupy... (of course, this depends on the size of the forums...)
Se forem conduzir, não bebam. Se forem beber... CHAMEM-ME!!!! :D

QuoteOver 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Arantor

Well, since I never actually got an answer to what information you *actually* wanted, I have no idea if that's right.

Mind you, I still wouldn't do it that way, I'd be getting id_board, id_member and poster_time and then joining to smf_members to get their current member name (and only using poster_name if the member doesn't currently exist)

You see, you're clutching at straws on how to do this stuff and frankly you don't have much idea what you're doing, but you don't give anyone enough information to be able to actually help you make what you want. You need to spell it out in enough detail as to what you want - and not keep changing your mind about it. It's ALL relevant.

For example, you're getting the board id. Do you need the board id in the result set? Answer: almost certainly not. You're filtering on it, not getting it, you don't need to include it in the SELECT, only the WHERE.

Do you need all the post ids of all these posts? Answer: again, no, as I understand it you want the COUNT of posts, not the actual posts themselves.

Which means your total query is much closer to:

SELECT m.id_member, COUNT(m.id_msg) AS num_posts, IFNULL(mem.real_name, m.poster_name) AS poster_name
FROM smf_messages AS m
LEFT JOIN smf_members AS mem ON (m.id_member = mem.id_member)
WHERE id_board = 3
And poster_time > 1378015261
AND m.id_member NOT IN (1, 2, 3, 4)

where the 1,2,3,4 are the user ids of the those you want to exclude which is MUCH MUCH MUCH faster than doing the comparisons you were doing. Which aren't even accurate anyway because if the user changes their display name, the new display name is what's stored - but their member id won't change.

THIS is why I want to know everything I do - because you're asking the database the wrong question and ending up doing so much more work when the database can do it for you - and do it vastly more efficiently than you can - but only if you ask it the right questions. And to do that, you need to know what you're asking of it.

bobdole2281

Wow, thank you. That is some amazing code. I was getting the same result, but mine took about 20x longer and was far more complicated. I clearly have a lot to learn.

bobdole2281

SELECT m.id_member, COUNT( m.id_msg ) AS num_posts, IFNULL( mem.real_name, m.poster_name ) AS poster_name
FROM smf_messages AS m
LEFT JOIN smf_members AS mem ON ( m.id_member = mem.id_member )
WHERE id_board =4
AND poster_time >1378015261
AND poster_name <>  'bobdole'
AND poster_name <>  'hawkeye'
AND poster_name <>  'DeanDoomIII'
AND poster_name <>  'Captain Banhammer'


This code seemed to be working, but now it's not calculating correctly. It's not showing the top poster every time, sometimes it's showing the second top poster and the "num_posts" seems to be off.

Does num_posts count edited posts as well by any chance?

Arantor

It doesn't count edited posts. It doesn't show the top poster, it just gets everyone who's posted in that board who isn't the administrator, because that's what you asked for. If it wasn't what you wanted and you wanted it listed by a certain order, you should have specified.

Mind you, I can see you haven't taken the advice I gave you even though it was firmly for your benefit. I think at this point I will firmly bow out.

bobdole2281

What advice would that be? If it's explain what information I actually want, I've literally done nothing but try to do that here. I'm just trying to get this to work man.

Arantor

The part where I *very specifically* said NOT to use the list of names in the query.

QuoteAND m.id_member NOT IN (1, 2, 3, 4)

where the 1,2,3,4 are the user ids of the those you want to exclude which is MUCH MUCH MUCH faster than doing the comparisons you were doing. Which aren't even accurate anyway because if the user changes their display name, the new display name is what's stored - but their member id won't change.

And then, so you're getting a list of names and post counts. It isn't going to be returning those in any order. Did you want them in a certain order? Did you want only a certain number of them?

bobdole2281

SELECT m.id_member, COUNT(m.id_msg) AS num_posts, IFNULL(mem.real_name, m.poster_name) AS poster_name
FROM smf_messages AS m
LEFT JOIN smf_members AS mem ON (m.id_member = mem.id_member)
WHERE id_board = 3
And poster_time > 1378015261
AND m.id_member NOT IN (1, 7, 15, 48)


You were right. That way did work if I plugged in the right IDs. I didn't do it just to ignore you, I didn't know what it was or understand it at the time.

Here is what it displays:
id_member   num_posts   poster_name
98                   5                   DR. Strange

When I run it with my old statement it shows:

id_member   id_board   poster_time   poster_name
98   3   1378409542   DR. Strange
4   3   1378691872   Deadpool
4   3   1378691951   Deadpool
4   3   1378692121   Deadpool
4   3   1378692154   Deadpool

I need it to display these two variables:
Poster = Deadpool
NumberofPosts = 4

I'm not sure how to do that without an overly slow and complicated bunch of while statements. Is their a way to do it with count?

bobdole2281

SOLVED

Nevermind. I finally got it. Here is the sql if anyone is interested:

SELECT poster_name, COUNT(*) TotalCount
FROM smf_messages
INNER JOIN smf_members ON smf_messages.id_member = smf_members.id_member
WHERE id_board =22
AND poster_time >1378015261
AND smf_members.id_member NOT in (1,7,15,48)
GROUP BY poster_name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC


or

SELECT poster_name, COUNT(*) TotalCount
FROM smf_messages
INNER JOIN smf_members ON smf_messages.id_member = smf_members.id_member
WHERE id_board =1
AND poster_time >1378015261
AND poster_name <>  'bobdole'
AND poster_name <>  'hawkeye'
AND poster_name <>  'DeanDoomIII'
AND poster_name <>  'Captain Banhammer'
GROUP BY poster_name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC




Advertisement: