Database Changes in SMF 2.0 - Part 2

Started by Grudge, January 11, 2008, 10:57:12 AM

Previous topic - Next topic

Grudge

Righty,

Thought that with a few minutes time on my hand I'd make a bit of a post about the recent changes we've made to 2.0 – specifically to the database engine (99% of the forum population may wish to turn away now to save themselves five minutes of boredom).

Right, I'll start with the motivation behind this. There are two real motivations, one of them is security and the other improved abstraction. In my mind security has been the real driver for this and as such I'll explain why. You see, there are two main ways of someone "exploiting" web based software (Or certainly we've had no others ever affect SMF), they are:
1) SQL Injection attacks – where through some means someone is able to run arbitrary SQL commands on your box, revealing passwords, destroying data etc.
2) XSS (Cross Site Scripting) – where through injection of javascript (Or otherwise) a user may be able to steal your session and "take over" your account.

SMF obviously has always protected against both of these and continues to do so. Whilst both can have major implications in general SQL attacks are the ones that, in my mind, are the cause for most concern. So, how did SMF used to protect against these, well – in two ways:
1) Strings are "escaped" (the means by which variables are made safe) before they are inserted into a database. As the normal source for an injection is malicious data coming into the script from a URL/Form etc SMF used to take the approach of "escape everything" whereby anything coming into the script was made safe – and we'd only make it "dangerous" if we needed to do some manipulation of it – and would make it safe again afterwards.
2) The database query function did some additional checks to ensure if someone *did* manage to exploit SMF that they were limited into the extent of the damage they could cause. This is what I'd call the "second line" of defence – or damage limitation.

When developing 2.0 we'd started to become concerned about our approach to the first of these protection mechanisms. SMF 2.0 has generally become cleverer, more complex. We do more things with incoming variables before we put them in the database and in general that meant making them dangerous more and more often. Every time you make them "dangerous" there is a risk that you forget to make them safe again. We decided that this approach was no longer appropriate as it was beginning to increase as opposed to decrease the security risks in 2.0. So – like all good developers – we changed it.

Compuart, the clever chap that he is, came up with a new scheme for doing this security. The problem with our old security model was all the security was on the "gates". Once something was "in" SMF it was assumed to be secure – we wanted to change this. Now, instead of the security being on the gates it stands directly in front of the database itself. You can no longer talk to the database without doing through Bertie the Bouncer (Or whatever he is called). To get something into (Or for that matter out of) the database you need to tell it exactly what it is you're doing – and he'll make it secure for you.

So, now if I want to set my name to 'Grudge – King of the Newts' when I send it to the database I also tell it 'It's a string'. If it's not a string security won't let it if. Also, security – knowing how dangerous strings are – will clean it up before it gives it to the database. The point is that we can do whatever we want to variables within SMF – I can feel with all kinds of dangerous things – but when it comes to feeding it into the database it takes care of security there and then. Now, I'll move on from the silly examples and go into a it of practical stuff – those who vomit at the site of a it of PHP best look away now.

So, here's the old way to update my name: (If you're confused by the $smfFunc stuff see one of my previous blogs on databases)

$smfFunc['db_query']('', "
UPDATE {$db_prefix}members
SET member_name = 'grudge'", __FILE__, __LINE__);


Now, things would look quite different:

$smfFunc['db_query']('', '
UPDATE {db_prefix}members
SET member_name = {string:name}',
array(
'name' => 'grudge',
)
);


So, let's look at things bit by bit:
1)   __FILE__ and __LINE__ are gone. We know use backtracing functions in the case of an error. It doesn't work properly on very old versions of PHP so debugging won't be quite so hot on 4.3.1 but we thought that they are in the absolute majority and the new method makes queries much cleaner.
2)   No single quotes (') within the query. In fact, if I were to put a single quote in the query – anywhere – you'd get a big error message.
3)   Instead the place the string is meant to go has an inject name which references the key of the array passed as the next parameter. When the query is processed every item in the array is reinserted into the query before being passed to the database. Variables are cast and cleaned as required. This means if you inserted an int it will be cast as an int. If you insert a string it will be escaped and surrounded by quotes.

Another examples:

$new_location = "England's Home Land";
$smfFunc['db_query']('', '
UPDATE {db_prefix}members
SET location = {string:new_location}
WHERE id_member IN ({array_int:my_friends})',
array(
'new_location' => $new_location,
'my_friends' => array(3,6,7,3),
)
);


As for the first example, the string $new_location is tested to be a string, escaped, surrounded with quotes and fills the gap of {string:new_location}. Meanwhile the other type array_int looks at the array element of my_friends, checks it's an array, check's all it's contents are ints, then explodes into a list of numbers into the query.

Of course it's much easier to see this by actually viewing the code in real examples but I'm hoping this will help. So, how will this change impact people?

Well – for "normal" users – you won't see a thing. Everything will work as usual – but you should benefit from adding security that will (hopefully) also improve the security of mods as mod authors now can rely on SMF for their database security and not have to think too much about their own protection for SQL issues.

For "mod" users things aren't too much of a problem. They simply remove (I suspect) all calls to escaping functions like addslashes etc that they currently use. They need to rewrite their queries (In addition all inserts should now use a new function db_insert but I've spoken about that before) – but it's not *that* much work as all mods need to be rewritten for 2.0. Besides which – even though Compuart wrote a great tool for converting the main SMF files I still spent about 6 days over Christmas checking, and in many cases rewriting, every query in SMF so no-one can moan to me about the length of the task as I've done it first hand ;)

The main people this will impact is people who integrate SMF deeply within their site. People who use SMF for integrations (For example including SSI.php) and then doing their own queries need to be aware that variables like $_POST and $_GET are no longer escaped by SMF. This means if you previously had this code.

$_POST['test'] = 'test\'s';
require_once('SSI.php');
mysql_query("update smf_members SET member_name = $_POST[test]");


In the past that would have worked. Now that will create a very dangerous security hole. Therefore anyone using SMF for an integration needs to make one of three decisions, either:
1)   Convert their queries to use the SMF query function (Highly recommended)
2)   Escape $_GET, $_POST type variables individually before they are used in their queries
3)   Escape all $_GET, $_POST type variables straight after including SSI.php by doing something like:

require_once('SSI.php');
$_GET = escape__recursive($_GET);
... etc for $_POST


I really wouldn't however recommend doing this. Certainly if you ever use any SMF functions *after* the latter example there is a risk that data will be "double escaped" resulting in lots of slashes appearing in posts, member names etc unintentionally.

There are a few more things about the system but I've covered off the bulk of it here. I very much believe that these changes will benefit everyone, particularly users who should always be looking for better and better security. We should be upgrading our site here this weekend so I'll soon be able to report first hand how easy or otherwise converting an integration is ;)

Grudge
I'm only a half geek really...

Deaks

i cant believe i read that, i cant also believe i understood over 50% of it, im curious when smf2 becomes stable will the ssi help pages be updated to work with the new db  strings.
~~~~
Former SMF Project Manager
Former SMF Customizer

"For as lang as hunner o us is in life, in nae wey
will we thole the Soothron tae owergang us. In truth it isna for glory, or wealth, or
honours that we fecht, but for freedom alane, that nae honest cheil gies up but wi life
itsel."

karlbenson

A very good explanatory post.
I see can it taking a bit of time to get used to, but overall it seems very positive.

LMAO, Bertie the Bouncer.

Thantos


metallica48423

are the types the same as php's resource types, or is there a list of them available?
Justin O'Leary
Ex-Project Manager
Ex-Lead Support Specialist

QuoteMicrosoft wants us to "Imagine life without walls"...
I say, "If there are no walls, who needs Windows?"


Useful Links:
Online Manual!
How to Help us Help you
Search
Settings Repair Tool

Thantos

I'll list them out

int
string
text
array_int
array_string
date
float
identifier
raw

In addition {prefix}, {query_see_board}, {query_wanna_see_board} are all magically replaced with $db_prefix, $user_info['query_see_board'], $user_info['query_wanna_see_board'] respectfully.

karlbenson

I appreciate everythings a work in progress.

Is there a way to find out whether a certain mysql command would be supported.
Obviously I want to avoid functions that are mysql only in my current mods, to avoid making things harder when it comes to update for 2.0.

Specifically the command I'm referring to is INSERT SELECT.
http://dev.mysql.com/doc/refman/5.0/en/insert-select.html

It is supported by PostGreSql.

Thantos

AFAIK INSERT SELECT is generic SQL command that should be supported by most (if not all) database engines.  Heck even MS Access has INSERT SELECT.

metallica48423

Thanks for the Info, Thantos.

Pretty much the same as php resource types in terms of type casting.  (well, it is the same... duhh *slaps self*) :P
Justin O'Leary
Ex-Project Manager
Ex-Lead Support Specialist

QuoteMicrosoft wants us to "Imagine life without walls"...
I say, "If there are no walls, who needs Windows?"


Useful Links:
Online Manual!
How to Help us Help you
Search
Settings Repair Tool

karlbenson

Thanks. Thats good to know.
INSERT SELECT has saved me 1000's of queries in a mod i'm working on.

Dragooon

So if I got the stuff right, We will no longer be needing to use the "htmlunspecialchar" functions and other functions to secure the POST data submitted?

Thantos

Correct.  All data will be cleaned by the query function.

Dragooon

Pretty cool, Are there any commonly used commands(Which are commonly used in MySQL) which wont be supported?

SleePy

Quote2)   No single quotes (') within the query. In fact, if I were to put a single quote in the query – anywhere – you'd get a big error message.

I forgot about this while making my mods work with the latest SMF beta on my site tonight. I did a work around which I would not suggest users do, but it was a dirty fix to get the mod to install.

I plan on making my mods use the new method that the developers are working on. I like this new method. It is very clean, and infact.. {int:user_id} is much more clear than ' . $user_info['id'] . ' and infact it is nicer.
Jeremy D ~ Site Team / SMF Developer ~ GitHub Profile ~ Join us on IRC @ Libera.chat/#smf ~ Support the SMF Support team!

Dragooon

Yeah,
$smfFunc['db_query'](',',
"SELECT ID_MEMBER, realName FROM {$db_prefix}members WHERE ID_MEMBER = {int:user_id}",
array(
'user_id' => $user_info['id']
)
);

Seems pretty nice and cool.

Grudge

Quote from: Dragooon on January 12, 2008, 12:02:11 AM
So if I got the stuff right, We will no longer be needing to use the "htmlunspecialchar" functions and other functions to secure the POST data submitted?
Actually, this does nothing to change the way htmlspecialchars works. I'd have loved to do something like this but it would have required a context cleaning function so mod authors and developers still need to decide how to clean the HTML and make sure they make it secure.

As for MySQL commands. SMF 2.0 does not actually *police* commands - so you can use MySQL specific ones in queries - they will just obviously fail in PostgreSQL which would be a shame. I don't however think we will enforce mods being generic
I'm only a half geek really...

Dannii

Any changes for getting data out? All the examples were update queries...
"Never imagine yourself not to be otherwise than what it might appear to others that what you were or might have been was not otherwise than what you had been would have appeared to them to be otherwise."

christicehurst

Very impressive discussion about coding and I have no idea how you did it!
www.brisbanelionsunited.com - A forum for everyone!

Grudge

Dannii,

Selects are generally easier as they have less variables to insert, below is the query for loading a meber:

$request = $smfFunc['db_query']('', '
SELECT mem.*, IFNULL(a.id_attach, 0) AS id_attach, a.filename, a.attachment_type
FROM {db_prefix}members AS mem
LEFT JOIN {db_prefix}attachments AS a ON (a.id_member = {int:id_member})
WHERE mem.id_member = {int:id_member}
LIMIT 1',
array(
'id_member' => $id_member,
)
);


And for interest, here's an example of inserting data

$result = $smfFunc['db_insert']('ignore',
'{db_prefix}sessions',
array('session_id' => 'string', 'data' => 'string', 'last_update' => 'int'),
array($session_id, $data, time()),
array('session_id')
);
I'm only a half geek really...

Dannii

Grudge, the function will return the same data structure as before (or reference rather)?

Oooohh, $smfFunc['db_insert'], that's new!
"Never imagine yourself not to be otherwise than what it might appear to others that what you were or might have been was not otherwise than what you had been would have appeared to them to be otherwise."

Advertisement: