News:

Wondering if this will always be free?  See why free is better.

Main Menu

Optimized code with $smcFunc arrays and loops

Started by Glyph, September 21, 2016, 03:53:15 PM

Previous topic - Next topic

Glyph

$columns[] = array(
'table_name' => '{db_prefix}members',
'column_info' => array(
'name' => 'adk_notes',
'type' => 'varchar',
'size' => 150,
'default' => '',
'auto' => false,
'unsigned' => false,
),
'parameters' => array(),
'if_exists' => 'ignore',
'error' => 'fatal',
);

$columns[] = array(
'table_name' => '{db_prefix}members',
'column_info' => array(
'name' => 'adk_pages_notifications',
'type' => 'text',
'auto' => false,
'unsigned' => false,
),
'parameters' => array(),
'if_exists' => 'ignore',
'error' => 'fatal',
);

//Finally Create Column
foreach($columns AS $add)
$smcFunc['db_add_column']($add['table_name'], $add['column_info'], $add['parameters'], $add['if_exists'],$add['error']);


So I was recently looking at a plugin today, and I noticed something rather odd. As an indexed array this is perfectly fine and dandy, but the array is never unset or destroyed afterwards?

I figured a multi-dimensional array would fit better. But, after actually looking through it seems like using a multi-dim with a do until would be the same amount of lines as this indexed array with a foreach. So now it's down to which is faster? With only 6 tables I am beginning to think it doesn't actually matter, but just to have integrity while coding I think it's a big deal.

So, this is more of an optimization challenge. the default example (seen here) Does not use more than 2 tables. So I can't really just assume that's optimized for this situation.


Note: The default example doesn't really explain why the index array in "id_articles" also includes "id_categories" as an id key. This almost looks like a Foreign Key implementation to me, but I don't know a ton about myisam in general so I couldn't say. What do you all think?
Personal TODO:

Arantor

Quotebut the array is never unset or destroyed afterwards?

All variables will automatically be up for garbage collection once they fall out of scope.

QuoteI figured a multi-dimensional array would fit better.

It is a multi-dimensional array, it just happens to declare it one at a time and I strongly suspect this was borrowed from the one we built for SimpleDesk years ago, it looks kinda like that. But since you need to pass in one column at a time, it looks a little bit nicer to declare one at a time in terms of reading the code afterwards.

Outright declaring it as a single massive array doesn't use less memory, it uses a few cycles less - but for a script that you're intending to run essentially once, why worry about it? If this were something run many times, I'd agree but it's not.

QuoteThe default example doesn't really explain why the index array in "id_articles" also includes "id_categories" as an id key.

What default example?

QuoteThis almost looks like a Foreign Key implementation to me, but I don't know a ton about myisam in general so I couldn't say. What do you all think?

It's an SMF convention to have foreign keys listed like that because for many years we didn't have proper foreign keys.

Glyph

#2
Quote from: Arantor on September 21, 2016, 05:01:49 PM
Quotebut the array is never unset or destroyed afterwards?

All variables will automatically be up for garbage collection once they fall out of scope.
So this is an SMF specific feature? What happens when someone tries to use "columns[]" again in another area of the same script, it will start adding indexes to that array (working as intended) but will not allow re-use of that variable unless unset. For example alot of the smcFunc's use the same scheme and re-using the columns array would be beneficial right?

Quote from: Arantor on September 21, 2016, 05:01:49 PM
QuoteI figured a multi-dimensional array would fit better.

It is a multi-dimensional array, it just happens to declare it one at a time and I strongly suspect this was borrowed from the one we built for SimpleDesk years ago, it looks kinda like that. But since you need to pass in one column at a time, it looks a little bit nicer to declare one at a time in terms of reading the code afterwards.

Outright declaring it as a single massive array doesn't use less memory, it uses a few cycles less - but for a script that you're intending to run essentially once, why worry about it? If this were something run many times, I'd agree but it's not.
This is true, what I meant was a mult-multi-dimensional array lol. In this case the array only allows for a single column to be declared, with another added dimension it will allow for more than one column to be inserted at once in a single function call.

Example pseudocode:

/**
*
*This is so we can create our tables to adhere to normalization standards.
*/
db_extend('packages');
global $smcFunc;
/**
*
*Let's make sure we have some database normalization.
*We are adding win/losses/total tables for each GW2 class.
*mesmer, warrior, guardian, elementalist, revenant, thief, ranger, engineer, necromancer
*
*Formatted as such:
*
*Main members table = {db_prefix}members
* id_member mediumint(8) = Primary Key
* real_name will be changed to Guild Wars 2 name.
*
*
*
* Tables effected: 9
* Class_Total_Win, Class_Total_Loss, Class_Ranked_Win, Class_Ranked_Loss, Class_Unranked_Win, Class_Unranked_Loss
* mesmer int(10)
* ... int(10)
* necromancer int(10)
*
*/

$columns[] = array(
'table_name' => 'Class_Total_Win',
'columns' => array(
array(
'name' => 'id_totalwin',
'type' => 'int',
'size' => 10,
'unsigned' => true,
'auto' => true,
),
array(
'name' => 'mesmer',
'type' => 'int',
'size' => 10,
'unsigned' => true,
),
array(
'name' => 'warrior',
'type' => 'int',
'size' => 10,
'unsigned' => true,
),
),
'indexes' => array(
array(
'columns' => array('id_totalwin'),
'type' => 'primary',
),
),
);

/**
*
*Let's iterate through the indexed array, creating the tables we outlined above.
*/
foreach($columns as $parameter){
//I chose overwrite to make things much easier for resetting/fixing your install.
$smcFunc['db_create_table'] ($parameter['table_name'], $parameter['columns'], $parameter['indexes'], array(), 'overwrite')
}

unset ($columns); //garbage collection; we're done using this now until the next function.


Quote from: Arantor on September 21, 2016, 05:01:49 PM
QuoteThe default example doesn't really explain why the index array in "id_articles" also includes "id_categories" as an id key.

What default example?
This one seen here. In the $indexes array of the example you'll see the first index setting a primary and "secondary" key for lack of a better term. and in the second index it only defines the primary.

Which you pretty much explained in your reply:
QuoteIt's an SMF convention to have foreign keys listed like that because for many years we didn't have proper foreign keys.


p.s. Thanks for your help. I am fortunate to have the opportunity to pick your brain about this stuff!
Personal TODO:

Glyph

Also regarding the pseudo code I kind of cleaned it up but now it's not working at all, and I don't see why not.

I tried the example smcFunc['db_create_table'] and the var_dump is identical in structure to mine.. so I don't see why it's giving me a database error.

$class_array = array(
1 => 'mesmer',
2 => 'warrior',
3 => 'guardian',
4 => 'elementalist',
5 => 'revenant',
6 => 'thief',
7 => 'ranger',
8 => 'engineer',
9 => 'necromancer',
);

$columns = array(
array(
'name' => 'id_totalwin',
'type' => 'int',
'size' => 10,
'unsigned' => true,
'auto' => true,
),
);

for ($x = 1; $x <= 9; $x++){
    $columns[$x]= array(
'name' => $class_array[$x],
'type' => 'int',
'size' => 10,
'unsigned' => true);
}

$indexes = array(
array(
'type' => 'primary',
'columns' => array('id_totalwin')
),
);

$smcFunc['db_create_table']('gw2pvp_total_win', $columns, $indexes, array(), 'overwrite');


QuoteDatabase Error

Please try again. If you come back to this error screen, report the error to an administrator.
Back
Personal TODO:

Glyph

#4
In case anyones curious, I fixed this. I was somehow uploading the wrong file and i kept thinking i was getting errors. (facepalm)
Personal TODO:

Arantor

QuoteSo this is an SMF specific feature?

No, it's a language feature. You're assuming $columns stays in scope a lot longer than it actually does. In just about every single use that the package manager gets (which is where 99.9998% of the database structure changing occurs), it runs a script for the mod to set up the database and doesn't re-run. This is a job that happens literally only once for most users, and as such trying to 'optimise' it is a waste of time.

QuoteIn this case the array only allows for a single column to be declared

Because add-column only allows for one column to be added at a time.

QuoteIn the $indexes array of the example you'll see the first index setting a primary and "secondary" key for lack of a better term. and in the second index it only defines the primary.

This is because there are times you create non-primary indexes as well for performance. In this case the index on category would be used during a query that fetches all the articles in a category, something the primary key would not cover.

Glyph

#6
sharing my code with anyone who would like to re-use this in their plugins for database creation! Feel free to hack it as you see fit. (or even improve on it, just share it with me if you do please!)

$class_array = array(
1 => 'mesmer',
2 => 'warrior',
3 => 'guardian',
4 => 'elementalist',
5 => 'revenant',
6 => 'thief',
7 => 'ranger',
8 => 'engineer',
9 => 'necromancer',
);
$table_array = array(
1 => 'totalwin',
2 => 'totalloss',
3 => 'rankedwin',
4 => 'rankedloss',
5 => 'urankedwin',
6 => 'urankedloss',
);
/**
*
*Create the tables and columns.
* 54 operations in total! (and only ~26 lines!)
*/
for ($y = 1; $y <= 6; $y++){
$columns = array(
array(
'name' => 'id_'.$table_array[$y],
'type' => 'int',
'size' => 10,
'unsigned' => true,
'auto' => true,
),
);
for ($x = 1; $x <= 9; $x++){
$columns[$x] = array(
'name' => $class_array[$x],
'type' => 'int',
'size' => 10,
'unsigned' => true
);
}
$indexes = array(
array(
'type' => 'primary',
'columns' => array('id_'.$table_array[$y])
),
);
$smcFunc['db_create_table']('{db_prefix}myplugin_'.$table_array[$y], $columns, $indexes, array(), 'overwrite');
}
Personal TODO:

Arantor

Except no one makes a bunch of identically structured tables in a loop. And trying to optimise a job you only do once seems overkill.

Line count ain't everything.

Glyph

#8
Quote from: Arantor on September 23, 2016, 02:12:22 AM
Except no one makes a bunch of identically structured tables in a loop. And trying to optimise a job you only do once seems overkill.

Line count ain't everything.

True and thinking about it I guess using direct PHP MySQL functions would be alot more efficient; but 500 lines of unreadable repeating code is a real eye sore and this is the best database structure I could come up with for this type of project.

9 classes with 6 categories and creating a structured standard based around that is eh... there has got to be someone else that has a use for this[1]; maybe not for this specific application but for something similar. isn't it nice knowing the plugin you're running is top-of-the-line? It shows that someone actually spent alot of their time making sure everything was done correctly and that in turn creates confidence in whoever is using it. Short of being as efficient as possible; it's much easier for other developers or downloaders to modify which means they are more likely to recommend it.

I've seen a few mods I was interested in helping with but shyed away from because I automatically knew it would mean rewriting half of it from scratch. Convenience: I think that's the key.

Do you know of a better way to structure a database like this? (question for anyone willing to lend their time)


[1]This mod could certainly use this: http://custom.simplemachines.org/mods/index.php?mod=4001 In install.php it kinda uses the same code already actually.
Personal TODO:

Arantor

Yours is the first such use I've seen for this in the 10 years I've been running SMF where you have multiple tables with all the same structure. Even the example you point to doesn't have as similar a structure as yours.

Though without knowing what you're doing for sure it would be hard to recommend an improvement. Also, please do not ever attempt to use this on plugins people will actually use, they don't like it when their data goes missing between reinstalls which is what will happen here.

I have little doubt you'll suggest the installer I use is "inefficient" but then again I write stuff that I can maintain years later, extend years later without having to untangle "clever" code and it won't break if people installed earlier versions or reinstall on the same setup. Check out SimpleDesk on the mod site, and pay careful attention to, for example, the section where it talks about working around issues that reared up in SMF 2.0 RC5.

Glyph

Quote
they don't like it when their data goes missing between reinstalls which is what will happen here.

Good call. I personally prefer overwriting but it's probably not the best practice. I never really put much thought into that; thank you.

QuoteI have little doubt you'll suggest the installer I use is "inefficient" but then again I write stuff that I can maintain years later, extend years later without having to untangle "clever" code and it won't break if people installed earlier versions or reinstall on the same setup. Check out SimpleDesk on the mod site, and pay careful attention to, for example, the section where it talks about working around issues that reared up in SMF 2.0 RC5.

I'm just trying to wrap my head around why my code isn't the best with the information i'm given. In an analogy it's like making a cake by looks but having no mouth to eat it with. It may look great but in practice it may not taste good. (not the best analogy here but the point remains)

The simpledesk installer looks amazing; especially the {db_prefix}helpdesk_tickets array. not only that but the ternary operators are nice. I could oogle at this all day in this post but... on to your installer.

The installer i'm looking at is smf-characters. I think yours is not bad at all. Very organized and easy to read. the db_field function is especially nice. I assume the "db_query" is to retain compatibility? The varchar 255's are a bit excessive; there may very well be a reason and you certainly don't owe me an explanation.  My post here is to try to create a function that is a sort of perfect fit for my application. Which you are contributing towards greatly by the way; and I can't thank you enough.

I'm having trouble finding the issue with simpledesk in rc5; however I can think of a bunch of ways there could be problems between SMF RC releases. Changed function names, different function "structure", blah blha blha, but with db_query dealing with mysql directly i can see where it retains its endurance.
Personal TODO:

Arantor

Ok, so let me step back a minute.

It used to be the case in SMF that you could call its create table function on a table that already existed, and it would add in any column that didn't already exist. Sometime around RC5 - it's been a few years, I forget the details of how and why - the decision was made that if a table already existed, SMF's create table library shouldn't attempt to create the table automatically.

For how SimpleDesk and everything else worked, the original behaviour was fine - add in missing columns on an update so if someone comes from an older version to a newer version of the mod, the structure would be upgraded to match. So I take the table definitions and fill that into the create column routine. This is the one place I've seen where array reuse becomes useful.

As to db_field, that's simply a helper function that outputs the style of array SMF's functions expect. Most of the time, I don't need to duplicate all the parameters since most of the time I don't really care. For example, int(3) and int(10) are the same thing, and what determines the column size is the data type itself (int vs tiny int) so I don't care to have to put a size in anyway, so db_field does it for me.

As to varchar 255, that's mostly about habit at this stage, SMF does various re-encoding of characters to entity form, and to avoid truncation of entities I typically allow more space for safety. Worse case, a single character could consume 10 character spaces in a string owing to entity encoding.

My objection mostly stems from "if you have 54 tables of the same structure, something is fundamentally wrong in your design", as opposed to one table with a column to indicate which GW2 class you're dealing with and one for the category within it, but I don't know what you're trying to do beyond you having these tables and optimising the effort of creating 54 tables when my gut instinct says you probably need only one.

I don't think we'll see another class in GW2 any time soon though, so that's probably a safe bet for structure, but imagine you'd built this prior to HoT, back before the Revenant was introduced... The way I see it, you'd have one column indicating class and you'd just have rows with a new value beyond that.

Glyph

#12
Quote from: Arantor on September 23, 2016, 06:21:38 AM
Ok, so let me step back a minute.

It used to be the case in SMF that you could call its create table function on a table that already existed, and it would add in any column that didn't already exist. Sometime around RC5 - it's been a few years, I forget the details of how and why - the decision was made that if a table already existed, SMF's create table library shouldn't attempt to create the table automatically.

For how SimpleDesk and everything else worked, the original behaviour was fine - add in missing columns on an update so if someone comes from an older version to a newer version of the mod, the structure would be upgraded to match. So I take the table definitions and fill that into the create column routine. This is the one place I've seen where array reuse becomes useful.

As to db_field, that's simply a helper function that outputs the style of array SMF's functions expect. Most of the time, I don't need to duplicate all the parameters since most of the time I don't really care. For example, int(3) and int(10) are the same thing, and what determines the column size is the data type itself (int vs tiny int) so I don't care to have to put a size in anyway, so db_field does it for me.

As to varchar 255, that's mostly about habit at this stage, SMF does various re-encoding of characters to entity form, and to avoid truncation of entities I typically allow more space for safety. Worse case, a single character could consume 10 character spaces in a string owing to entity encoding.

My objection mostly stems from "if you have 54 tables of the same structure, something is fundamentally wrong in your design", as opposed to one table with a column to indicate which GW2 class you're dealing with and one for the category within it, but I don't know what you're trying to do beyond you having these tables and optimising the effort of creating 54 tables when my gut instinct says you probably need only one.

I don't think we'll see another class in GW2 any time soon though, so that's probably a safe bet for structure, but imagine you'd built this prior to HoT, back before the Revenant was introduced... The way I see it, you'd have one column indicating class and you'd just have rows with a new value beyond that.
As for them adding another class I agree it probably won't happen for a while; however it is a possibility. I'de like to make it so if it ever does change I can easily fix it.

I will be pooling data on wins/losses much like https://www.gw2pvp.de/web/index.php?action=overallstats
and https://www.gw2pvp.de/web/index.php?action=leaderboards

The pooling from stats also will be used for other stuff like website members general rankings. and also being used specifically for each member, for say their profile, or whatever membergroup they are in.
Personal TODO:

Glyph

#13
With the help of a personal friend we were able to devise an optimal database scheme; although still a WIP we plan on using secondary keys for further linkage.

Attached is the mysql workbench file, and a screenshot of the database schema. If anyone wants to criticize this please do; i'm open for any suggestions.

I do realize that not all databases are innodb, and that i should probably use myisam; however changing my settings in mysql workbench crashes the application lol...
Personal TODO:

Arantor

What exactly are you planning on storing in all those columns... whether the user is one? Whether the user has x many of that character class?

Surely the structure seems to be to have a table called characters, which points back to the smf_members table to say which member owns that character, then that table contains details about each character including one column only to indicate which character class the character is. Then if you want to track wins/losses per character, they are properties of a character and thus new columns on that characters table.

Data modelling is a hard skill. It requires you to focus on what data you have and what you want to express with it... in this case we have the entity that is an SMF member. We have the entity that is a GW2 character. Whether it is a necromancer character is not a discrete property of that character - but which class overall it is, is a property of the character and thus belongs as a discrete column in its own right. If you wanted to simplify that, you could create a new table listing the classes and link the two tables.

The other bonus to this, if ever we do get a new class, it becomes a single row in a table to add.

Advertisement: