News:

Wondering if this will always be free?  See why free is better.

Main Menu

[PAID] Thread view (AKA tree view)

Started by rbeuker, July 28, 2011, 05:36:24 PM

Previous topic - Next topic

rbeuker

Hi SikLiFe,

Yes, indeed it would be like that! :)

Best regards,

Ronald

Arantor

The thing that everyone actually forgets with implementing this is that putting aside ease of use (and I *still* haven't been convinced, heck I even went to Reddit lately and found myself even more confused than before), is that it's a performance killer.

The systems that are dedicated to it have dedicated systems and data structures that can support this, e.g. schemaless systems. SMF does not use a schemaless system, it uses MySQL.

MySQL is not designed to handle this set of data, it is designed to take separate items and link them together. To rethink this into MySQL's structure, you're talking about doing one of two things.

Either 1: you retrieve every single post in the topic and every single sub branch (and I do mean EVERY post) and figure out the tree structure to identify where you are in it. Workable but sluggish at best, no matter how lean things are. Best usable on broad and shallow trees of nesting where you can quickly eliminate side branch nodes that are of no direct relevance to where you are currently.

Or 2: for every given branch, you query and fetch the branch and recursively query branches down the tree within a given depth. For narrow trees this can work pretty well by denormalising data but again it's still going to suck in performance terms, particularly if the tree gets any depth to it.

FWIW, MyBB has this feature, they picked option 1. I find it interesting to note that it's off by default and the flat structure that SMF uses is the default. (And note how that implicitly screws up nesting by creating a massively wide and shallow tree whereby a 100 post topic is 1 parent node and 80+ 1-level branches)


Even after Clara ******ed at me and flat out called me a misogynist to my face because I wouldn't dance to her tune, I still investigated it. It is possible to do but not in a general sense with any efficiency (which is why it's off by default).

MrMike

Quote from: Arantor on June 15, 2012, 05:44:43 PMTo rethink this into MySQL's structure, you're talking about doing one of two things.

Either 1: you retrieve every single post in the topic and every single sub branch (and I do mean EVERY post) and figure out the tree structure to identify where you are in it. Workable but sluggish at best, no matter how lean things are. Best usable on broad and shallow trees of nesting where you can quickly eliminate side branch nodes that are of no direct relevance to where you are currently.

Or 2: for every given branch, you query and fetch the branch and recursively query branches down the tree within a given depth. For narrow trees this can work pretty well by denormalising data but again it's still going to suck in performance terms, particularly if the tree gets any depth to it.

There are other ways to do it, though. One thing that comes to mind is that a little AJAX properly applied would eliminate most or all of the performance problems. Most of the requests would be for the top level of most threads, and then additional requests to drill down. There's no need to grab a giant set of data when it can be supplied incrementally as needed. Performance would probably go up overall since requests would tend to be short in both duration and bandwidth.

For something like a message board mySQL is a perfectly fine tool to use, and I think to say that "MySQL is not designed to handle this set of data" is probably not accurate.

If query complexity or performance is becoming a problem then the schema probably needs to be recast. In terms of real world performance, nearly any entry level server has enough horsepower to drive even a pretty high traffic board or set of boards. That would, I suppose depend on the definition of "high traffic".

Just my 2 cents.

Arantor

You're missing the point: it doesn't matter whether you load it AJAXively or not: it's still going to suck when assessing part of a tree, because you still have to assess the entire tree to subslice it. And therein lies the problem.

The schema is absolutely fine for a flat structure the way SMF works (it's exactly how relationships are intended to work) but a threaded tree structure does not conform to how RDBMS relations are set up.


Please try and understand. This is not an entirely theoretical discussion about how it might work. I have tried to implement this. I understand the complications of performance because I've tried to make it happen and the performance hurt is noticeable even on tiny forums, it's going to scale incredibly badly.

This is the same problem MyBB faced, their implementation runs into the exact same problem. vBulletin has the same problem too.


And that's even before you get in to the nightmare of rethinking how SMF stores where someone has read to in a thread, that's purely just the hassle of dealing with messages and loading them.

rbeuker

Quote from: Arantor on June 15, 2012, 05:44:43 PM
Even after Clara ******ed at me and flat out called me a misogynist to my face because I wouldn't dance to her tune, I still investigated it. It is possible to do but not in a general sense with any efficiency (which is why it's off by default).

I definitely appreciate your thinking about this, even though you don't like it at all! :)  I personally know too little about SMFs data schema, so I'm glad you've added this interesting point from a developer's point of view.

When I started this thread, I presumed that 'something' would have to be changed in the database. Obviously, the parent-child relationships would have to be stored somewhere. Now I'm wondering: would it help (performance-wise) if these parent-child relations would be stored in a totally new table, keeping it separated from the existing 'flat' tables? That 'parent/child table' would then determine the order of the messages that have to be read from the regular tables.

Also, how badly would SMF's performance suffer? My own Forum runs on a quad core i7 server that's practically sleeping all day (the server, not the Forum lol). So it could use some challenge. ;)

CPU load averages 0.04 (1 min) 0.03 (5 mins) 0.00 (15 mins)
CPU usage         0% user, 0% kernel, 0% IO, 100% idle

Arantor

Therein lies the problem. How do you store such a thing?

There's two ways. Firstly, each post can deal solely with its immediate parent. Then you're talking about a single extra column in the database. The upside is that it uses very little extra space, the downside is that if you want to do anything other than find the immediate parent or immediate children, you're stuffed. There's no easy way to find children of children, so either you have to fetch every single message from the topic and manually reparse the tree, or you have to recursively query to keep finding children.

There are ways you can maybe clean it up, by limiting parse depth but even so it's still not very clear and only ever going to be efficient under some circumstances. This is the curse of trying to put a square peg in a round hole.

Then you have the other option, maintaining the tree at all times in a separate table but note that you have to store it in a messed up container format to flatten it in such a way that MySQL can handle it (a different symptom of round-hole-square-peg), but then you start using a lot of space, I was finding that the space taken up by this method could sometimes actually exceed the posts themselves.

Sure it was faster, though dealing with moves is slightly more tricky. But the space consumed by it (because you have to essentially replicate the tree in an array then serialize it for MySQL's benefit) outweighed it for me.

But what you're saying, about putting the parent/child into another table, that would actually make little or no difference in the scheme of things, the fact is you're still dealing with a system that would have to recursively process data, or grab it all and process it, whether that's in the main messages table or in a new table - neither has the potential directly to handle a tree structure, it's just not how MySQL is designed, and there's absolutely no escaping that reality.

If it were an option to make use of something like MongoDB or Cassandra, where you can actually store tree objects in a meaningful fashion, maybe it would be feasible, but I assure you, that sort of thing is the way to madness. Managing the same content between multiple systems is a headache and one that you couldn't pay me enough to work on.


Also, note that I did not implement it directly into SMF, I implemented into a heavily, heavily modified SMF. But one where I was able to identify the level of change and I noticed that even on a 100-post thread it was taking up to half a second to process. (And this was on a machine that is also a quad core i7, my laptop)

rbeuker

So, when will SMF enter the Big Data World? ;) :)

Many thanks for elaborating more on this topic! It got me think about things some more:

- Even though there are the drawbacks and risks that you have described, could a threaded view in your opinion still be something that smaller Forums can use, but bigger Forums (that have a lot of traffic) should avoid? For your reference, my Forum generates about 150,000 to 200,000 pageviews a month (as registered by SMF), so I suppose you'd consider it a low traffic Forum?

- I was wondering: when you were implementing this yourself, where you actually parsing and displaying entire messages in a 'tree'? I am asking, because I don't necessarily need that--just a small 'map' that just shows the message titles and the authors of these messages would already be a big help.

Having such a map would make it possible to very quickly find out of there's an answer to a question, asked by Forum member John on page 12. In a longer thread it could be on page 87! :P

And that just gave me another idea: how about automatically adding a little line to a message after someone has used the 'Quote' button? Something like this:

QuoteForum member Peter has replied to this on page 87 of this topic. You are on page 12 now and can click 75 times to get there, or click here to immediately go there now!

;D


Arantor

QuoteSo, when will SMF enter the Big Data World?

MongoDB is well known for having issues with so-called Big Data, though more recent versions have improved its reliability, just not particularly its scalability. In any case this isn't a Big Data problem, it's a schema problem and the idea solution is to move to a schemaless structure.

Quote- I was wondering: when you were implementing this yourself, where you actually parsing and displaying entire messages in a 'tree'?

No, that was simply the effort required to actually work out the message ids of said tree.

QuoteAnd that just gave me another idea: how about automatically adding a little line to a message after someone has used the 'Quote' button? Something like this:

Which again raises the problem of the tree.

From given node A, how can you know that node B is a child? Either you log the fact with node B, that node A is the parent, and perform a search when fetching node A (that there are replies), or you log the fact with node A that node B is a child. Great when fetching in that particular case, but painful the other way.

The whole problem is figuring out the map in the first place, because there's no efficient way to store or rebuild part of the tree in isolation, either you store enough to traverse it which keeps it lean, or you store the entire tree which is a massive space consumer.

Quoteso I suppose you'd consider it a low traffic Forum?

I was seeing serious problems with even fewer page views.

rbeuker

Hey hey :)

I'm back! With a new idea: how about storing the information about 'the map' into a separate table? It should probably have at least these columns:

ThreadID
MessageID
ParentMessageID (holds the MessageID of the current message's parent, or NULL if it's the first message at the top level)
MessageLevel (not sure if this would be of any benefit, but it could help drawing the map as it allows for selecting all messages that are on a specific level)

Every time a new message is posted, this table must be updated as well. And the table will have to be populated initially once (after having installed the mod for the first time).

Best regards,

Ronald

Arantor

Sorry to say it but this is absolutely no different to anything discussed above. The fact it's now in a separate table makes precisely zero difference. Everything you've mentioned has already been referred to for the purposes of trying to make this work.

You still have no way to know where you are in the tree, no way to know what your children are. (The parent is not, and has never been, a problem) It does not change the problem of rebuilding the map, it just changes where the data already needed for it would be stored.

Thing is, this problem is already present in SMF: the board hierarchy has this exact problem, and all the same problems are present there, which is why the whole 'child board of a child board' is never processed fully for performance.


The bottom line: there is absolutely no way you can *efficiently* make a relational table behave in a non-relational way. You can but do it in multiple horrendous fashions (cf what vBulletin and MyBB do, and even they suggest you don't; it's why Kier and co didn't add it to XenForo after they made it in vBulletin before...

live627

I would like to announce that I'm going to start work on this. User bounties/donations are accepted in PPUSD. You may PM me for my address.

Arantor

The notes I left on the private Wedge board yesterday about how this could be done reasonably efficiently might be of use.

live627


Hj Ahmad Rasyid Hj Ismail

Nice idea. I will be following this development.

MrMike

Quote from: live627 on October 11, 2012, 01:51:20 AMI would like to announce that I'm going to start work on this. User bounties/donations are accepted in PPUSD. You may PM me for my address.

If possible, I'd like to see a work-in-progress example. I would/will donate to this if I can see something that gives me hope. :)

live627

I put this mod on the back burner to make way for other works,  but I'll get up a demo soon.

legowife

I'm hoping against hope that this may have been updated, so I apologize if I'm pushing this thread up if it ought to be closed. I've read just about every discussion asking for this feature/mod and have combed through the mod database without avail. The closest it has seemed to come to realization is Nao's post with his proof of concept code from 2008 (made available for use with credit) and this one which sounds like work got started on it, but perhaps never completed: http://www.simplemachines.org/community/index.php?topic=255510.0 I may be joining the ranks of those doomed to be disappointed, but I thought I'd ask before I start thinking about migrating my forum. Too many of my users want this feature.

Just for background, the reason the flat view does not work as well for my forum is that we actually do need distinct conversations/threads/tangents within the same topic. Perhaps SMtF is not the right place for that, but again I'd rather check in before I rock the boat and change technologies.

Arantor

It was never completed and had some very unpleasant side effects in practical use.

It needs much more engineering than that to make it work even remotely efficiently and it also promotes fairly terrible UI in almost every implementation going.

completelayman

I searching it in thread:
simplemachines.org/community/index.php?topic=534832.0
a some solution:
simplemachines.org/community/index.php?topic=352940.msg2397570

Kindred

no. there is no current solution.

Quote from: Arantor on August 29, 2014, 08:06:14 PM
It was never completed and had some very unpleasant side effects in practical use.

It needs much more engineering than that to make it work even remotely efficiently and it also promotes fairly terrible UI in almost every implementation going.
Слaва
Украинi

Please do not PM, IM or Email me with support questions.  You will get better and faster responses in the support boards.  Thank you.

"Loki is not evil, although he is certainly not a force for good. Loki is... complicated."

Advertisement: