Syntax errors from SMF database backup.

Started by Antechinus, April 14, 2009, 06:13:29 PM

Previous topic - Next topic

Norv

Me? :D
Linux Debian squeeze, Mandriva, Windows XP, (ordered by most used!), and I has VMs with Ubuntu, Dreamlinux, and a couple of forensic less known distributions...  lately O:)
To-do lists are for deferral. The more things you write down the later they're done... until you have 100s of lists of things you don't do.

File a security report | Developers' Blog | Bug Tracker


Also known as Norv on D* | Norv N. on G+ | Norv on Github

Antechinus

Ok then. The local I'm using is a bog standard Vertrigo install, so you can bung one of those on your XP and have ed zachary the same local host I have. Then you can play with it. :)

GravuTrad

On a toujours besoin d'un plus petit que soi! (Petit!Petit!)


Think about Search function before posting.
Pensez à la fonction Recherche avant de poster.

Antechinus


emanuele



Take a peek at what I'm doing! ;D




Hai bisogno di supporto in Italiano?

Aiutateci ad aiutarvi: spiegate bene il vostro problema: no, "non funziona" non è una spiegazione!!
1) Cosa fai,
2) cosa ti aspetti,
3) cosa ottieni.

Antechinus

Dunno. Haven't checked lately as I stopped using SMF admin and phpMyAdmin for backups yonks ago. I can do a test run sometime this weekend though. From memory, it turned out to be a phpMyAdmin bug but I'm not sure which versions of phpMyAdmin are affected.

Story was that the affected versions exported with apostrophes escaped as per MSSQL syntax (two single apostrophes, instead of backslash - apostrophe) even when the db was MySQL, and phpMyAdmin insisted that any apostrophes be escaped as in MSSQL when importing. Since the db was actually MySQL this resulted in a crapload of syntax errors if you took a backup made via another utility (like SMF admin) and then attempted to import it via phpMyAdmin.

I'm quite sure this was behind a lot of the problems people experienced when attempting to restore backups made from SMF admin, since the usual recommendation was to import them via phpMyAdmin. Without manually changing the escape syntax for all apostrophes in the backup, what happens is that there would be dropped content all over the place when viewing the forum, eve though all content would be in the db itself.

MrPhil

Per other backup-related discussions, it would probably be best to simply get rid of SMF's admin backup function, given all the flaws it has, and encourage users to make use of phpMyAdmin, etc. to both back up and restore. However, it sounds like some users don't have easy access to phpMyAdmin, so it would be good to provide a "possibly separate" utility (password controlled?) to back up and restore the database.

Since no database may be available for the restore, the conventional SMF ID/password system isn't going to work. What may be feasible is to have the forum owner hard code a clear text password in the PHP code. The utility should refuse to run if the password hasn't been changed. Or, there could be a function to generate a hash from a selected password, and the user edits that into the PHP code. Or, the hash might be stored in an included file, and the hash-generation function disabled if that file exists. There are all sorts of ways that we could keep hackers from invoking the database backup and/or restore via the Web.

This brings up the matter under discussion that different backups apparently have different conventions for comments, embedded quotes, and the like. We may have to offer a switch on this backup utility to handle these things in different ways (\' vs '', etc.) depending on the intended restore method or database used. The switch setting should be in a comment at the very top of the file so that the SMF restore utility knows what to look for.

While we're at it, an SMF backup utility should offer ways to break up both the backup and the restore into smaller pieces, so jobs aren't killed due to timeouts. And of course, all the flaws in the current backup need to be fixed. There might be a utility to convert between different database conventions, should someone want to restore in MySQL/phpMyAdmin a database originally formatted for MSSQL or whatever. Anything else?

emanuele

Just tested this on my localhost and it works properly (it also allowed me to fix a couple of bugs in the new export code :P): exported with SMF and imported with phpMyAdmin without any error.

Quote from: MrPhil on May 17, 2012, 12:20:32 PM
Per other backup-related discussions, it would probably be best to simply get rid of SMF's admin backup function, given all the flaws it has, and encourage users to make use of phpMyAdmin, etc. to both back up and restore.
As per all the other database backup discussions: in 2.1 (unless otherwise decided) the SMF database export will work as expected (provided enough people will test it before 2.1 will be out).

Quote from: MrPhil on May 17, 2012, 12:20:32 PM
However, it sounds like some users don't have easy access to phpMyAdmin, so it would be good to provide a "possibly separate" utility (password controlled?) to back up and restore the database.

Since no database may be available for the restore, the conventional SMF ID/password system isn't going to work. What may be feasible is to have the forum owner hard code a clear text password in the PHP code. The utility should refuse to run if the password hasn't been changed. Or, there could be a function to generate a hash from a selected password, and the user edits that into the PHP code. Or, the hash might be stored in an included file, and the hash-generation function disabled if that file exists. There are all sorts of ways that we could keep hackers from invoking the database backup and/or restore via the Web.
As far as I remember phpMyAdmin is a php script that can be downloaded from a website and installed on any host just like SMF (well, mostly).

Quote from: MrPhil on May 17, 2012, 12:20:32 PM
This brings up the matter under discussion that different backups apparently have different conventions for comments, embedded quotes, and the like. We may have to offer a switch on this backup utility to handle these things in different ways (\' vs '', etc.) depending on the intended restore method or database used. The switch setting should be in a comment at the very top of the file so that the SMF restore utility knows what to look for.
The conventions are pretty straightforward: the standard SQL method to escape single quotes is a double single-quote.
MySQL extends the standard and allows both the double single-quote and the slash+single quote (where the preferred is probably the \', but that's just a guess).
That is why I find very strange this problem, because AFAIK is MySQL itself the only tool that uses this kind of escape.
A guess would be that Ant's host sets some weird compatibility mode or so...

Quote from: MrPhil on May 17, 2012, 12:20:32 PM
While we're at it, an SMF backup utility should offer ways to break up both the backup and the restore into smaller pieces, so jobs aren't killed due to timeouts.
If you mind to test the dev-snapshots you would notice the timeouts don't happen any more.
Additionally how I already explained split would be useless for several reasons:
1) you need to store the database on the publicly accessible space of the host (a writable directory that most likely would be in the vast majority of the cases cache or in the most lucky attachments),
2) breaking the database would take much more time to actually complete the backup, that would mean the users shall put the forum in maintenance otherwise the backup will lack of consistency,
3) the code to break things up needs to be more complex,
4) you will have to explain to the users that they have to strictly follow the order to re-upload the files, otherwise the consistency would go to...

Quote from: MrPhil on May 17, 2012, 12:20:32 PM
And of course, all the flaws in the current backup need to be fixed. There might be a utility to convert between different database conventions, should someone want to restore in MySQL/phpMyAdmin a database originally formatted for MSSQL or whatever. Anything else?
Convert from a backup is pretty difficult/annoying.
Additionally many conversions don't make much sense: basically any conversion to SQLite is pretty much useless, and I don't think there are many people that would switch from MySQL to Postgres or vice versa...
Also SQLite to Postgres is unlikely...I'd guess.


Take a peek at what I'm doing! ;D




Hai bisogno di supporto in Italiano?

Aiutateci ad aiutarvi: spiegate bene il vostro problema: no, "non funziona" non è una spiegazione!!
1) Cosa fai,
2) cosa ti aspetti,
3) cosa ottieni.

MrPhil

Quote from: emanuele on May 23, 2012, 07:06:07 PM
As far as I remember phpMyAdmin is a php script that can be downloaded from a website and installed on any host just like SMF (well, mostly).
A number of people have complained that they do not have phpMyAdmin (or a similar utility) on their systems. If their hosts are unwilling to install phpMyAdmin, what chance do they have to install it themselves? Just saying...

Quote
The conventions are pretty straightforward: the standard SQL method to escape single quotes is a double single-quote.
MySQL extends the standard and allows both the double single-quote and the slash+single quote (where the preferred is probably the \', but that's just a guess).
That is why I find very strange this problem, because AFAIK is MySQL itself the only tool that uses this kind of escape.
I guess we need to decide what kind of cross-database/cross-utility capability is needed for our backups. Conceivably, a backup made on MySQL could end up being restored on PostgreSQL, etc. If there's any mode which is universally acceptable, let's use it. Otherwise, we'll need to be able to import/export "foreign" formats.

Quote
Quote from: MrPhil on May 17, 2012, 12:20:32 PM
While we're at it, an SMF backup utility should offer ways to break up both the backup and the restore into smaller pieces, so jobs aren't killed due to timeouts.
If you mind to test the dev-snapshots you would notice the timeouts don't happen any more.
Is everyone satisfied that timeouts won't happen on any host? Not just the one or two you tested your changes on. It's now impossible to get a timeout on either backup or restore on any "reasonable" host? Even if the database is absolutely humungous? Even if a standard utility like phpMyAdmin is used instead of SMF's backup/restore?

Quote
Additionally how I already explained split would be useless for several reasons:
1) you need to store the database on the publicly accessible space of the host (a writable directory that most likely would be in the vast majority of the cases cache or in the most lucky attachments),
I don't get it. What's any different about writing multiple .sql files with different names, than a single .sql file? You've got to write it somewhere.

Quote
2) breaking the database would take much more time to actually complete the backup, that would mean the users shall put the forum in maintenance otherwise the backup will lack of consistency,
Well, yes, writing multiple files probably will take a bit longer than writing a single file, but not "much more". A forum should always be in maintenance mode during a backup or restore, and it should be done automatically (or, refuse to run unless the user has already put it in maintenance mode). Otherwise you end up with inconsistencies.

Quote
3) the code to break things up needs to be more complex,
Say, every 1000 records (or some other user-configurable value) increment a file counter and start a new output file. The last file could have some sort of marker in a comment so that the SMF restore utility knows it's the end. It's doesn't seem to be much more complex to me.

Quote
4) you will have to explain to the users that they have to strictly follow the order to re-upload the files, otherwise the consistency would go to...
The consistency would go to vB? :) Put all the table DROPs and CREATEs in the first backup file, and then after that it doesn't matter what order they're done in. With numbered backup files, you could automatically import them in sequence in a restore utility. Using phpMyAdmin, the user would have to watch the sequence numbers to make sure they do "1" first. After that, it doesn't matter.

I'm not sure we're talking about the same thing when we say "split the backup". What I'm meaning is "back up the database into multiple, relatively small .sql files so that neither the backup nor restore (nor the file transfers) time out. This may involve playing some games so that a fresh process is started for the next .sql file -- maybe holding the last .sql file number done, and (for backup) its table and its ending record number in a scratch file or something. That might be necessary to avoid a timeout. Start with some large number of records to process per file, and if the user experiences timeouts they can adjust this record count down to where the backup and restore processes manage to run OK. Feasible?

Quote
Convert from a backup is pretty difficult/annoying.
Additionally many conversions don't make much sense: basically any conversion to SQLite is pretty much useless, and I don't think there are many people that would switch from MySQL to Postgres or vice versa...
Also SQLite to Postgres is unlikely...I'd guess.
As mentioned before, if there's a universal format we should use that. Otherwise, foreign format import/export for SMF's utility should be supported, and we can think about converting between .sql file formats if necessary. After all, sooner or later someone is going to decide they want to move to another host that doesn't support, say, MySQL, and all they have is a MySQL backup. SMF looks much better if we can say, "No Problem!"

Arantor

QuoteIf their hosts are unwilling to install phpMyAdmin, what chance do they have to install it themselves? Just saying...

As much chance as installing SMF or WordPress. Honestly, it is no more difficult to install phpMyAdmin manually.

QuoteIs everyone satisfied that timeouts won't happen on any host? Not just the one or two you tested your changes on. It's now impossible to get a timeout on either backup or restore on any "reasonable" host? Even if the database is absolutely humungous? Even if a standard utility like phpMyAdmin is used instead of SMF's backup/restore?

The answer is no. There is no guarantee that timeouts won't occur. For example, if you tried to download the backup of this server, it WILL time out. Guaranteed. Then again, when you have literally millions of posts you shouldn't be using it, nor should you be using phpMyAdmin either.

The bigger picture that some seem to be unwilling to grasp is that doing it in SMF is a bad idea, and even phpMyAdmin will choke on larger stuff, but if you're hitting that limit, 1) you're on a VPS or a REALLY bad host and 2) you should probably take the time, if you haven't already, to learn how to do it properly or get someone competent to do it.

Take me for example. My VPS does not have phpMyAdmin installed (through my choice), and I run quite a few forums on that server. I don't use SMF's admin backup, I have a script that does it for me, especially as in two cases even phpMyAdmin is going to choke, and no amount of SMF magic would fix that.

QuoteI don't get it. What's any different about writing multiple .sql files with different names, than a single .sql file? You've got to write it somewhere.

That's the point: the current backup as in SMF 2.0 does not use a file at all. I haven't looked at how 2.1 does it but I'm assuming an exclusively-locked file in the temporary area.

Doing it with files of any kind is risky, doing it with multiple files is even more so. Remember: the files will be owned by the webserver and no amount of 644 will save you.


The bottom line is that, for the most part, dealing with this is a rabbit hole, and I would argue there are many better things to be doing with the development time than fighting a losing battle, wherein there are actually more scenarios that the backup will NOT be suitable for than the number that it will, as compared to other solutions.

FWIW, I removed it from Wedge months ago, instead preferring to spend my time directing users to better-equipped tools for the job. Something about not reinventing the wheel and avoiding not-invented-here syndrome.
Holder of controversial views, all of which my own.


GravuTrad

On a toujours besoin d'un plus petit que soi! (Petit!Petit!)


Think about Search function before posting.
Pensez à la fonction Recherche avant de poster.

Antechinus

Quote from: emanuele on May 23, 2012, 07:06:07 PM
Just tested this on my localhost and it works properly (it also allowed me to fix a couple of bugs in the new export code :P): exported with SMF and imported with phpMyAdmin without any error.

How did you test the results? I know it will import without throwing any errors in phpMyAdmin. That's part of the trap. The problem is/was that although the actual import process would complete without any apparent problems, when viewing the post content of the restored db, as actually displayed on the forum, there would be large and random chunks of data missing, even though the same data was present in the db if you took the time to look for it.

emanuele

Ohhh...but that's could really be another problem then!
That smells a *lot* of encoding issue...


Take a peek at what I'm doing! ;D




Hai bisogno di supporto in Italiano?

Aiutateci ad aiutarvi: spiegate bene il vostro problema: no, "non funziona" non è una spiegazione!!
1) Cosa fai,
2) cosa ti aspetti,
3) cosa ottieni.

Antechinus

I'd try a quick export from phpMyAdmin. One table should do the trick, as long as it has apostrophes. The calendar holidays table is a good test for this as it's short and near the top. If it exports from phpMyAdmin with apostrophes escaped as two singles ( '' rather than \' ) then I'd say the problem is still there.

emanuele

Quote from: MrPhil on May 23, 2012, 08:51:56 PM
Is everyone satisfied that timeouts won't happen on any host? Not just the one or two you tested your changes on. It's now impossible to get a timeout on either backup or restore on any "reasonable" host? Even if the database is absolutely humungous? Even if a standard utility like phpMyAdmin is used instead of SMF's backup/restore?
I sure SMF doesn't work on each and every host out there. ;)
SMF cannot restore backups and that's a piece of code I'm not interested in, so the problem in that sense doesn't exist.
That said, of course, I cannot guarantee anything, but at least it wont timeout as it does now.

Quote from: MrPhil on May 23, 2012, 08:51:56 PM
Quote
4) you will have to explain to the users that they have to strictly follow the order to re-upload the files, otherwise the consistency would go to...
The consistency would go to vB? :) Put all the table DROPs and CREATEs in the first backup file, and then after that it doesn't matter what order they're done in. With numbered backup files, you could automatically import them in sequence in a restore utility. Using phpMyAdmin, the user would have to watch the sequence numbers to make sure they do "1" first. After that, it doesn't matter.
As far as I remember SMF relies on data order for the board index and for smiley (the two I remember at the moment), so if for example the import queries of the boards table are split over two different files the result will be a messed up board index.

Quote from: MrPhil on May 23, 2012, 08:51:56 PM
I'm not sure we're talking about the same thing when we say "split the backup". What I'm meaning is "back up the database into multiple, relatively small .sql files so that neither the backup nor restore (nor the file transfers) time out. This may involve playing some games so that a fresh process is started for the next .sql file -- maybe holding the last .sql file number done, and (for backup) its table and its ending record number in a scratch file or something. That might be necessary to avoid a timeout. Start with some large number of records to process per file, and if the user experiences timeouts they can adjust this record count down to where the backup and restore processes manage to run OK. Feasible?
We are talking about exactly the same thing...and I don't like it so much.
It's a hack and rather than this I'd wipe out the backup (and wipe out in my dictionary means remove it completely and not provide any alternative at all, except for: "use phpMyAdmin").

Quote from: MrPhil on May 23, 2012, 08:51:56 PM
Quote
Convert from a backup is pretty difficult/annoying.
Additionally many conversions don't make much sense: basically any conversion to SQLite is pretty much useless, and I don't think there are many people that would switch from MySQL to Postgres or vice versa...
Also SQLite to Postgres is unlikely...I'd guess.
As mentioned before, if there's a universal format we should use that. Otherwise, foreign format import/export for SMF's utility should be supported, and we can think about converting between .sql file formats if necessary. After all, sooner or later someone is going to decide they want to move to another host that doesn't support, say, MySQL, and all they have is a MySQL backup. SMF looks much better if we can say, "No Problem!"
As said, the only one that uses a different format AFAIK is MySQL.
The other DBMS should use the same standard (but honestly I never tried to import a SQLite into a postgre db...will try sooner or later! :D).

Quote from: Arantor on May 23, 2012, 10:05:53 PM
The bigger picture that some seem to be unwilling to grasp is that doing it in SMF is a bad idea, and even phpMyAdmin will choke on larger stuff, but if you're hitting that limit, 1) you're on a VPS or a REALLY bad host and 2) you should probably take the time, if you haven't already, to learn how to do it properly or get someone competent to do it.
* emanuele dreams a world where people read instructions manuals...
unfortunately I'm the first that doesn't read any manual... lol

Quote from: Arantor on May 23, 2012, 10:05:53 PM
Take me for example. My VPS does not have phpMyAdmin installed (through my choice), and I run quite a few forums on that server. I don't use SMF's admin backup, I have a script that does it for me, especially as in two cases even phpMyAdmin is going to choke, and no amount of SMF magic would fix that.
I started my web experience with wordpress and so I'm used to do backups from phpMyAdmin (the only tool available at my host), so I didn't even know there was a tool to backup the database in SMF until recently... lol

Quote from: Arantor on May 23, 2012, 10:05:53 PM
That's the point: the current backup as in SMF 2.0 does not use a file at all. I haven't looked at how 2.1 does it but I'm assuming an exclusively-locked file in the temporary area.
* emanuele copied phpMyAdmin...
That's all.
Dropped the output buffer, used (also) set_time_limit to gain time (again during the process), etc.
Of course it's not bullet proof, that why I added few checks on the admin page to explain the admin what's the best approach to take (including disable the button for large databases).

Quote from: Arantor on May 23, 2012, 10:05:53 PM
The bottom line is that, for the most part, dealing with this is a rabbit hole, and I would argue there are many better things to be doing with the development time than fighting a losing battle, wherein there are actually more scenarios that the backup will NOT be suitable for than the number that it will, as compared to other solutions.
That was somehow funny.
I usually don't touch things I don't like... :P


Take a peek at what I'm doing! ;D




Hai bisogno di supporto in Italiano?

Aiutateci ad aiutarvi: spiegate bene il vostro problema: no, "non funziona" non è una spiegazione!!
1) Cosa fai,
2) cosa ti aspetti,
3) cosa ottieni.

emanuele

Quote from: Antechinus on May 24, 2012, 06:10:55 PM
I'd try a quick export from phpMyAdmin. One table should do the trick, as long as it has apostrophes. The calendar holidays table is a good test for this as it's short and near the top. If it exports from phpMyAdmin with apostrophes escaped as two singles ( '' rather than \' ) then I'd say the problem is still there.
In my system phpMyAdmin exports with '' and imports with both '' and \' without any problem.


Take a peek at what I'm doing! ;D




Hai bisogno di supporto in Italiano?

Aiutateci ad aiutarvi: spiegate bene il vostro problema: no, "non funziona" non è una spiegazione!!
1) Cosa fai,
2) cosa ti aspetti,
3) cosa ottieni.

Antechinus

Define "without any problem", bearing in mind what I said earlier. :D

emanuele



Take a peek at what I'm doing! ;D




Hai bisogno di supporto in Italiano?

Aiutateci ad aiutarvi: spiegate bene il vostro problema: no, "non funziona" non è una spiegazione!!
1) Cosa fai,
2) cosa ti aspetti,
3) cosa ottieni.

Antechinus

Ok that's looking pretty good. phpMyAdmin may have fixed the bug.

GravuTrad

On a toujours besoin d'un plus petit que soi! (Petit!Petit!)


Think about Search function before posting.
Pensez à la fonction Recherche avant de poster.

Advertisement: