News:

Join the Facebook Fan Page.

Main Menu

Merge Script

Started by jimbo21, August 12, 2007, 06:41:29 PM

Previous topic - Next topic

jimbo21

Hey Guys,
Earlier in the year I noticed some merge scripts were in the works to merge two SMF forums together. Does anybody know what came of these? Is there any way to merge two SMF forums?
Just wondering.
thanks

jimbo21


H

Unfortunately they are still in development at the moment although they should be released in the future.
-H
Former Support Team Lead
                              I recommend:
Namecheap (domains)
Fastmail (e-mail)
Linode (VPS)
                             

dakotaroar

I've done this before--you can see it in action at http://talk.campusdakota.com.  I have not gone through and written it up in PHP, but I can give you the MySQL and instructions if you send me your email.

My method successfully merged users, boards, polls, events, topics and posts. I didn't bother with PMs.
The only forum for students in North and South Dakota!  Campus Dakota.

dakotaroar

#4
Update on this, since I have had a bunch of requests.  I still haven't written this up in PHP, but you can find the SQL syntax I used and instructions on how to use below.  Enjoy!

Since I keep getting requests for my method of merging forums, I'll post the whole thing here (I would rather attach it as a text file, alas).  It's similar to the one posted above.  I recommend copying it into a text file to read and do search/replace on.


UPDATE: Version 1.1 is worded a little bit better since I got some good feedback on that.

####################################README#######################################

Merge SMF version 1.1

Hi, this file was last updated by Matt Burton-Kelly on 02 March 2008.  NO GUARANTEE is made regarding its usefulness for combining the data from
two SMF forums, but it worked pretty well for me.  Likewise, NO GUARANTEE is made that this will not trash the databases you attempt to merge. 
Use at your own risk.   These queries do NOT combine everything, most notably data from any mods (such as the arcade mod), attachments,
and Personal Messages.  Users with the same username are NOT combined.

*It is suggested that you make multiple backups of everything and do this on a test server before attempting it live.
*Make sure to create an Admin account with a unique (in both forums) username before beginning this process.
*After running the SQL queries, run 'Recount all forum totals and statistics' under Admin-->Forum Maintenance.

Please read through all the instructions before attempting, and REMEMBER TO BACKUP!

Enjoy!  You can find an example of two forums merged into one at http://talk.campusdakota.com.
This script has been discussed in this topic at SMF:  http://www.simplemachines.org/community/index.php?topic=188524


YOUR SUPPORT IS APPRECIATED!  I put a lot of time into this, so if it works for you and you wish to compensate me in some way,
you can PayPal pizza money to [email protected] and support a starving grad student.

---Matt Burton-Kelly, 02 March 2008. [email protected].

Version history

Version 1.0 - Rewrote documentation more clearly (thanks JM!). (02 March 2008)
Version 1.0 - Original release (05 May 2007)
################################INSTRUCTIONS########################################
Variables
For the following variables, look through the databse tables of the destination forum (using phpMyAdmin or similar program) and locate the the highest ID for each type of variable.
For example, nmembers = the User ID of the last person to join the forum, nboards = the board ID of the most recently created board, and so on.
Make a second copy of this file and do a 'search and replace' with the values you have just obtained.  For example, if you have 10 categories in the destination forum,
the highest numbered category should have an id of '10', so you would search for 'ncategories' and replace it with '11'.  A second copy of this file is IMPORTANT!  If you replace the
variable names in this file, you will end up with things like '10 = ' under the Variable Recording Area below.

*All values must be greater for destination forum than source forum, otherwise there are overlap errors.  If you have a greater value in the source forum, then use that number+1.
*Easiest to get these values through the tables (of the source forum) in phpMyAdmin.
*Copy source tables to destination database (if not already) before doing anything.  That way there is a backup in the original database.

ncategories = highest-numbered category + 1
nboards = highest-numbered board + 1
nmembers = highest-numbered member + 1
ntopics = highest-numberd topic + 1
nmessages = highest-numberd post + 1
npolls = highest-numbered poll + 1
nevents = highest-numbered calendar event + 1


#################################USE THIS AREA TO RECORD YOUR VARIABLES####################
Merging source_ into destination_ on <date>

source_ =
destination_  =

ncategories =
nboards =
nmembers =
ntopics =
nmessages =
npolls =
nevents =
#############################COPY EVERYTHING BELOW INTO PHPMYADMIN########################

UPDATE source_members SET ID_MEMBER=ID_MEMBER+nmembers;

INSERT INTO destination_members (ID_MEMBER, memberName, dateRegistered, posts, ID_GROUP, lngfile, lastLogin, realName, instantMessages, unreadMessages, buddy_list, pm_ignore_list, messageLabels, passwd, emailAddress, personalText, gender, birthdate, websiteTitle, websiteUrl, location, ICQ, AIM, YIM, MSN, hideEmail, showOnline, timeFormat, signature, timeOffset, avatar, pm_email_notify, karmaBad, karmaGood, usertitle, notifyAnnouncements, notifyOnce, notifySendBody, notifyTypes, memberIP,    memberIP2, secretQuestion, secretAnswer, ID_THEME, is_activated, validation_code, ID_MSG_LAST_VISIT, additionalGroups, smileySet, ID_POST_GROUP, totalTimeLoggedIn, passwordSalt)
SELECT ID_MEMBER, memberName, dateRegistered, posts, ID_GROUP, lngfile, lastLogin, realName, instantMessages, unreadMessages, buddy_list, pm_ignore_list, messageLabels, passwd, emailAddress, personalText, gender, birthdate, websiteTitle, websiteUrl, location, ICQ, AIM, YIM, MSN, hideEmail, showOnline, timeFormat, signature, timeOffset, avatar, pm_email_notify, karmaBad, karmaGood, usertitle, notifyAnnouncements, notifyOnce, notifySendBody, notifyTypes, memberIP,    memberIP2, secretQuestion, secretAnswer, ID_THEME, is_activated, validation_code, ID_MSG_LAST_VISIT, additionalGroups, smileySet, ID_POST_GROUP, totalTimeLoggedIn, passwordSalt FROM source_members;


UPDATE source_categories SET ID_CAT=ID_CAT+ncategories;

INSERT INTO destination_categories (ID_CAT, catOrder, name, canCollapse) SELECT ID_CAT, catOrder, name, canCollapse FROM source_categories;


UPDATE source_boards SET ID_BOARD=ID_BOARD+nboards;
UPDATE source_boards SET ID_CAT=ID_CAT+ncategories;
UPDATE source_boards SET ID_PARENT=ID_PARENT+nboards WHERE ID_PARENT > '0';
UPDATE source_boards SET ID_LAST_MSG=ID_LAST_MSG+nmessages;
UPDATE source_boards SET ID_MSG_UPDATED=ID_MSG_UPDATED+nmessages;

INSERT INTO destination_boards  (ID_BOARD, ID_CAT, childLevel, ID_PARENT, boardOrder, ID_LAST_MSG, ID_MSG_UPDATED, memberGroups, name, description, numTopics, numPosts, countPosts, ID_THEME, permission_mode, override_theme)
SELECT ID_BOARD, ID_CAT, childLevel, ID_PARENT, boardOrder, ID_LAST_MSG, ID_MSG_UPDATED, memberGroups, name, description, numTopics, numPosts, countPosts, ID_THEME, permission_mode, override_theme FROM source_boards;


UPDATE source_topics SET ID_TOPIC=ID_TOPIC+ntopics;
UPDATE source_topics SET ID_BOARD=ID_BOARD+nboards;
UPDATE source_topics SET ID_FIRST_MSG=ID_FIRST_MSG+nmessages;
UPDATE source_topics SET ID_LAST_MSG=ID_LAST_MSG+nmessages;
UPDATE source_topics SET ID_MEMBER_STARTED=ID_MEMBER_STARTED+nmembers;
UPDATE source_topics SET ID_MEMBER_UPDATED=ID_MEMBER_UPDATED+nmembers;
UPDATE source_topics SET ID_POLL=ID_POLL+npolls WHERE ID_POLL > '0';

INSERT INTO destination_topics (ID_TOPIC, isSticky, ID_BOARD, ID_FIRST_MSG, ID_LAST_MSG, ID_MEMBER_STARTED, ID_MEMBER_UPDATED, ID_POLL, numReplies, numViews, locked) SELECT ID_TOPIC, isSticky, ID_BOARD, ID_FIRST_MSG, ID_LAST_MSG, ID_MEMBER_STARTED, ID_MEMBER_UPDATED, ID_POLL, numReplies, numViews, locked FROM source_topics;


UPDATE source_polls SET ID_POLL=ID_POLL+npolls WHERE ID_POLL > '0';
UPDATE source_polls SET ID_MEMBER=ID_MEMBER+nmembers;

INSERT INTO destination_polls (ID_POLL, question, votingLocked, maxVotes, expireTime, hideResults, changeVote, ID_MEMBER, posterName)
SELECT ID_POLL, question, votingLocked, maxVotes, expireTime, hideResults, changeVote, ID_MEMBER, posterName FROM source_polls;


UPDATE source_poll_choices SET ID_POLL=ID_POLL+npolls;

INSERT INTO destination_poll_choices (ID_POLL, ID_CHOICE, label, votes)
SELECT ID_POLL, ID_CHOICE, label, votes FROM source_poll_choices;


UPDATE source_messages SET ID_TOPIC=ID_TOPIC+ntopics;
UPDATE source_messages SET ID_BOARD=ID_BOARD+nboards;
UPDATE source_messages SET ID_MSG=ID_MSG+nmessages;
UPDATE source_messages SET ID_MEMBER=ID_MEMBER+nmembers;
UPDATE source_messages SET ID_MSG_MODIFIED=ID_MSG_MODIFIED+nmembers;

INSERT INTO destination_messages (ID_MSG, ID_TOPIC, ID_BOARD, posterTime, ID_MEMBER, ID_MSG_MODIFIED, subject, posterName, posterEmail, posterIP, smileysEnabled, modifiedTime, modifiedName, body, icon)
SELECT ID_MSG, ID_TOPIC, ID_BOARD, posterTime, ID_MEMBER, ID_MSG_MODIFIED, subject, posterName, posterEmail, posterIP, smileysEnabled, modifiedTime, modifiedName, body, icon FROM source_messages;


UPDATE source_calendar SET ID_EVENT=ID_EVENT+nevents;
UPDATE source_calendar SET ID_BOARD=ID_BOARD+nboards;
UPDATE source_calendar SET ID_TOPIC=ID_TOPIC+ntopics;
UPDATE source_calendar SET ID_MEMBER=ID_MEMBER+nmembers;

INSERT INTO destination_calendar (ID_EVENT, startDate, endDate, ID_BOARD, ID_TOPIC, title, ID_MEMBER)
SELECT ID_EVENT, startDate, endDate, ID_BOARD, ID_TOPIC, title, ID_MEMBER FROM source_calendar;
The only forum for students in North and South Dakota!  Campus Dakota.

perplexed

hey mburtonk

I have a question I hope you can answer.

I have two forums that I would like to merge together.  The second forum was created from a copy of the first forum, so the members are identical and have identical member numbers

ie member 50 on forum one is the same member as member 50 on forum 2 etc and that applies to all members on both forums.

What I want is to merge the posts from the smaller second forum into the larger first forum and allocate the posts to the correct member. 

These scripts sound ok when you have totally different members in each forum but what about in a case like this when the members are identical on both?

Thanks

dakotaroar

It depends on what you want to deal with--the method I used kept all the users, but some of them were converted to guest posters.  If you don't care about post counts and just want the posts retained, this will work.  However if you want the users able to be linked to all of their posts, I think you would need to do something to the member table in order to assure that members of the same name ended up with the same id number.

Example:  http://talk.campusdakota.com/index.php?topic=709.0 vs. http://talk.campusdakota.com/index.php?topic=2194.0 and see how 'mburtonk' is displayed.
The only forum for students in North and South Dakota!  Campus Dakota.

perplexed

ah I see, no that wouldnt do, the members would want all their posts allocated to them as they are basically the same person.

I wonder is it possible to just import posts from forum 2 and then allocate them to the correct users on forum 1?

I know nothing about how the database stuff works :(

dakotaroar

The authorid is in the entries table, so what one could do is write a script that would run through the authorid column in the entries table (in your source forum) and replace each authorid with the matching authorid from the destination forum.  For example:

Membername   sourceauthorid   destinationauthorid
member1                2                             23
member2                3                             24
member3                4                             25
.
.
.
Then you would replace 2 with 23, 3 with 24, 4 with 25, etc.

That's about all I can take the time to write at the moment, but I will try to come back and put something together for you when I get a chance.
The only forum for students in North and South Dakota!  Campus Dakota.

Lenophis

What about merging just the posts? Would it be possible to remove the users of board B from the equation so the current userbase is intact, but just has all of the posts from both boards? (Feel free to mash my head in if that question makes no sense, I'm having a hard time clarifying...)

dakotaroar

In that case you would have posts with no authors, is that what you want?

Theoretically I think it would work, although I've never tried it.
The only forum for students in North and South Dakota!  Campus Dakota.

Eliana Tamerin

SMF 2.0 has a function to reattribute posts to the authors (or to any member) if their membership has been terminated and since re-established. I know [SiNaN] also coded a mod that does the same thing, but on a per-post basis. You could use either one of those things in a script to automate the process for a large group of members.
Do NOT PM me for support.

SimplePortal 2.3.6 is OUT!
SimplePortal Project Manager
Download | Docs
SimplePortal: Power of Simplicity!

dakotaroar

The only forum for students in North and South Dakota!  Campus Dakota.

MICHICAUST

Hi all!

I have a problem very similar to quiteperplexed's, twist is: I have a rather large forum which crashes the SQL server from time to time.

After running out of options I have just made a backup of the _messages table (which contains >700 MiB of data according to phpmyadmin), and because the SQL server always hangs wit a "Checking Table" process on the _messages table I have, in my naivety ;), deleted all postings older than 120 days in the SMF admin area.

The Forum ran afterwards (read: it didn't crash immediately again after calling up the index.php) but I restored the previously backed-up -messages table in order to get all the messages back - or so I thought...

Well, the table WAS restored, but the messages weren't shown anymore, which I only realized after two days, not even after letting SMF recalculate all the values etc.

Now I can restore a backup of the whole database from before the postings-pruning, but then the two days' worth of NEW postings, PMs, attachments etc. will be lost, which I do not want at all.

I _have_ a backup of the current status (forum's offline now), so my question is (tadaaa!):

Will I be able somehow to insert the last two days' of postings etc. into the "old" version of the DB from before the pruning so I have all the old postings restored _AND_ the new ones added to them, too?

Can I maybe just cut&paste the last x lines from the .sql file into the backup of the old version and restore the running DB from that, with all the postings counted, associated with their users etc? Were the postings' counters just increased by +1 even after the pruning, because then it would work, no?



I hope I didn't write TOO confusing now... ;)


Best regards,

- Michael

dakotaroar

Quote from: MichaelWill I be able somehow to insert the last two days' of postings etc. into the "old" version of the DB from before the pruning so I have all the old postings restored _AND_ the new ones added to them, too?
This should be possible if you just use the SQL syntax in the script above that relates to the messages and the topics.  Since you are adding posts that are OLDER than the posts already in the table, I don't think there should be a problem with overlapping message/topic ids.


QuoteCan I maybe just cut&paste the last x lines from the .sql file into the backup of the old version and restore the running DB from that, with all the postings counted, associated with their users etc? Were the postings' counters just increased by +1 even after the pruning, because then it would work, no?

I don't know about copying and pasting into the .sql file, but I think that if you add the missing posts back in and then run "recalculate statistics" (or whatever it is), it should count all the posts for each user again so you have the right number.  Did removing all the older posts make post counts drop?

Sorry I'm so late with this, it's been busy.
The only forum for students in North and South Dakota!  Campus Dakota.

MICHICAUST


Hi dakotaroar,

THX for your reply! :)

I have - some time ago, now - really done it this way:

1. Took the old backup, opened the .sql file on my PC with a text editor
2. Also opened the NEW backup file on my PC ( note: do _NOT_ do this if you want to stay sane through the process, on my machine with a Q6600 >3,2 GHz and 4 GB of RAM even this made the thing sweat)
3. Just copied the new file's contents to the old one, section per section - only the "INSERT TABLE" lines, though! ( see note above -> more, and REAL, insanity ensues!! ;) )
4. Uploaded the whole thing to the server and restored from that file.
5. Then "recalculated the statistics", just like dakotaroar wrote, and everything was fine again!!

Regards,

- Michael

dakotaroar

Excellent!  Glad you got everything to work!
The only forum for students in North and South Dakota!  Campus Dakota.

jummijammi

#17
I have searched for comments about merging separate forums and 99% of them suggest that it is too complex to be even tried. The only member who said that it doable and also has done it was dakotaroar. Big thanks to him for encouraging others that it's possible!

I can't get the job done by using his sql –script as I have 5 big forums using the SMF 2 and the script is for SMD 1.

I need to merge practically all meaning also personal messages, attachments and all users, not only the ones with different names.

So I'm writing a java program to do that merging job. It will be able to deal with the problematic of having the same username or e-mail address in multiple forums and also renumbering/renaming filenames of the attachments.

Anyone doing something similar for merging?

dakotaroar

Good luck, jummijammi!  A script as you are suggesting would be much appreciated by everyone, I am sure.  I've not had to merge any forums since I wrote that script, so I have not even tried to apply it to 2.0.

Nobody seems to be working on the same thing, as far as I am aware, although you've probably read that the official SMF stance is "we're working on it".  I'm not the only one to get mileage out of that script (a few people have emailed me and gotten it to work), but a script with a GUI, options, and the ability to deal with problematic duplicates would be excellent.
The only forum for students in North and South Dakota!  Campus Dakota.

jummijammix

I made a java program that connects to two databases using jdbc so there is no need to have those forums in the same database or even in the same server.

The first version does the basic merge
- All the tables mentioned below are merged
- All the members, also duplicates, are transferred

For the next version I need to plan rules for merging members those are obviously the same. I have been thinking a rule like: "If login and email match then the two members must be merged into one". Merging of members means of course merging the related information like personal messages, posts etc. I have just noted that not everyone uses the same password for both forums and that might be a problem for them after the merge.

For the attachment files I was thinking of producing automatically a simple script that basically just contains operating system commands for renaming the related files. I don't see a need to make the automated ftp transfer from one server to another reasonable as the volume for those files is huge as they are typically images. Attachments file transfer is more easily done manually by transferring them as one big gzipped file.


Here is a list of tables found necessary to merge. The ones that added to the list of dakotaroar are maked by  (+).

Attachments and Personal messages are merged of course for not losing them. Ban tables are needed for not giving a second opportunity for already banned spammers. Two log tables are needed for having the information what the logged in member has already read.

attachment  (+)
ban_groups  (+)
ban_items  (+)
boards
calendar
categories
log_mark_read  (+)
log_topics  (+)
members
messages
personal_msgs  (+)
pm_recipients  (+)
poll_choices
polls
topics

Below are the formulas for changing the id –fields for the additional tables. For the others I used the formulas presented in the script written by dakotaroar

attachment.id_attach = attachment.id_attach + attachment.last_id
attachment.id_thumb = attachment.id_thumb + attachment.last_id
attachment.id_msg = attachment.id_msg + messages.last_id
attachment.id_member = attachment.id_member + members.last_id

banGroup.id_ban_group = banGroup.id_ban_group + ban_groups.last_id

banItem.id_ban = banItem.id_ban + ban_items.last_id
banItem.id_ban_group = banItem.id_ban_group + ban_groups.last_id
banItem.id_member = banItem.id_member + members.last_id

log_mark_read.setId_member = log_mark_read.getId_member + members.last_id
log_mark_read.setId_board = log_mark_read.getId_board + boards.last_id
log_mark_read.setId_msg = log_mark_read.getId_msg + messages.last_id

log_topic.id_member = logTopic.id_member + members.last_id
log_topic.id_topic = logTopic.id_topic + topics.last_id
log_topic.id_msg = logTopic.id_msg + messages.last_id

personalMessage.id_pm = personalMessage.id_pm + personal_msgs.last_id
personalMessage.id_pm_head = personalMessage.id_pm_head + personal_msgs.last_id
personalMessage.id_member_from = personalMessage.id_member_from + members.last_id

pmRecipient.id_pm = pmRecipient.id_pm + personal_msgs.last_id
pmRecipient.id_member = pmRecipient.id_member + members.last_id


JummiJammix

Advertisement: