Trouble with attachments after converting CP-1251 -> UTF-8

Started by anasta_pro, May 25, 2012, 08:29:04 PM


anasta_pro

hello.

Please help stupid girl!

I recently converted my SMF 2.0.2 forum to UTF-8. After that, some of my attachments stopped being displayed in topics. I looked in the database and found that the attachments that stopped displaying have an empty 'file_hash' field and a file name written in Russian. They appear to be very old attachments left over from when I used SMF 1.

for example:
(424, 1212, 2778, 0, 0, 'собор.jpg', 192412, 1838, 800, 570, '', 'jpg', 'image/jpeg', 1, 1) - can't see
424_niaidh_jpgbfc7058303e1a94534b7dce9d1f8a089 - this is the name of this file in the 'attachments' folder

(13931, 13932, 93264, 0, 0, 'новый размер.JPG', 183384, 51, 1024, 768, '', 'jpg', 'image/jpeg', 1, 1) - can see
13931_iiaue_dhaciadh_JPGa001a6966de4af9305997746c352611d - this is the name of this file in the 'attachments' folder

(14296, 0, 95161, 0, 3, '103.jpg_thumb', 5636, 0, 200, 150, '36fe7c48cc848d63fdc332a8d0c79c6baeb54ddd', 'jpg', 'image/jpeg', 1, 1) - can see
14296_36fe7c48cc848d63fdc332a8d0c79c6baeb54ddd - this is the name of this file in the 'attachments' folder

(1225, 0, 8120, 0, 3, '0118.jpg_thumb', 7769, 0, 200, 150, '', 'jpg', 'image/jpeg', 1, 1) - can see
1225_0118_jpg_thumb765241cd2941df4f535a1cf3bb779cd1 - this is the name of this file in the 'attachments' folder

Please tell me how to fix this problem.

Sorry for my English...

MrPhil

Your description is a bit hard to understand, but if I follow you, the attachment file exists, and the problem is that the name shown on the page is incomplete? That sounds like maybe when you converted to UTF-8, the file name in the record was left in the old encoding. What does the record look like in phpMyAdmin, if that utility is displaying in UTF-8? Does the name look correct, or does it look like it was never converted?

Or, are you saying that the attachment file names are not matching up with what's in the database record in some (?) cases -- old (SMF 1, pre-conversion SMF 2) or new attachments?

anasta_pro

All my attachments exist on the host in the attachments folder, but very old attachments with Russian file names (dated 2009 in the database) are not displayed in topics. For those attachments, a new file with a different name gets created in the attachments folder.

For example:
original name of attachment file: собор1.jpg
name of this file in the attachments folder: 424_niaidh1_jpgbfc7058303e1a94534b7dce9d1f8a089
name of the file that gets created when I open the page with this broken image: 1.jpg.temp
in db table 'smf_attachments': (424, 1212, 2778, 0, 0, 'собор1.jpg', 192412, 1838, 800, 570, '', 'jpg', 'image/jpeg', 1, 1)

I see a pattern: if an attachment has a Russian file name and nothing in the 'file_hash' column of 'smf_attachments', it is one of the broken attachments.

This started after I changed the charset from cp1251 to utf-8 on the admin page...

What do I need to do to solve my problem?

MrPhil

Are you able to tell what character encoding the database records (smf_attachments) are in? Of interest is the original file name of the file (not the hashed file name). Can you tell if it's UTF-8 or some other encoding (still CP-1251)? Are you looking at this in phpMyAdmin, or in an .sql backup file? If your forum is displaying pages in UTF-8, the original file names (in Cyrillic) must be in UTF-8. If they aren't, for some reason the database wasn't completely converted to UTF-8. phpMyAdmin should be able to tell you what encoding the table and (if different) the file name field are in.
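If phpMyAdmin leaves you unsure, one rough way to check is to look at the raw bytes of a stored file name (for example from `SELECT HEX(filename)` in MySQL, turned back into bytes with `bytes.fromhex`). The Python sketch below is only a heuristic, and `guess_encoding` is a made-up helper name, not part of SMF or phpMyAdmin. It cannot detect names that were double-encoded, since those are still valid UTF-8.

```python
def guess_encoding(raw: bytes) -> str:
    """Guess whether raw file-name bytes are ASCII, UTF-8 or CP-1251."""
    if raw.isascii():
        return "ascii"  # plain ASCII looks the same in both encodings
    try:
        raw.decode("utf-8")
        return "utf-8"  # non-ASCII bytes that form valid UTF-8
    except UnicodeDecodeError:
        pass
    try:
        text = raw.decode("cp1251")
    except UnicodeDecodeError:
        return "unknown"
    # Heuristic: only call it CP-1251 if decoding yields Cyrillic letters.
    if any("\u0400" <= ch <= "\u04ff" for ch in text):
        return "cp1251"
    return "unknown"


if __name__ == "__main__":
    print(guess_encoding("собор.jpg".encode("cp1251")))
    print(guess_encoding("собор.jpg".encode("utf-8")))
```

A name that comes back as "cp1251" would suggest the field was never converted.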

If you find that the smf_attachments table is still in CP-1251, you could try the following:

  • Back up either the full database, or at least the smf_attachments table. Save a copy of this backup in case you need to roll back your changes.
  • See if you can use either your hosting control panel or phpMyAdmin to change just this table (or just the field) from CP-1251 to UTF-8. I'm assuming that you used SMF's built-in conversion to UTF-8, and it failed to convert this one table. In my copy of phpMyAdmin, I select smf_attachments, go to Structure, select action = Change for "filename", for "Collation" select "utf8_general_ci" (or whatever your other tables are using), and press "Save". That should both change the field encoding and convert the data.
  • I don't think there is any need to change the other three text fields, as they should be only ASCII text anyway, but you can change them to UTF-8 if you want.

Your "real" filenames should now be in UTF-8. If something goes wrong, you can always truncate (empty) the table and import your backup for the smf_attachments table, and get back to where you are now. Let us know what you find -- it sounds like the SMF conversion to UTF-8 may be overlooking the 4 text fields in the attachments table. By the way, did you convert to UTF-8 while it was SMF 1 or SMF 2?
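One more observation on the on-disk names quoted earlier in the thread: the letters after the attachment ID ("niaidh_jpg", "iiaue_dhaciadh_JPG") are consistent with the CP-1251 bytes of the Russian name being read as Latin-1 and stripped of accents. The Python sketch below only reproduces that visible part. It is not SMF's code: `legacy_style_stem` and its transliteration table are made up for illustration and cover just the letters in the two examples. If the lookup for hash-less attachments really depends on the stored name this way (an assumption, not confirmed here), the encoding of the 'filename' field would matter for finding the file.

```python
# Hypothetical table: just the Latin-1 letters that occur in the two
# example names from this thread. This is NOT SMF's actual table.
TRANSLIT = {
    "ñ": "n", "î": "i", "á": "a", "ð": "dh", "í": "i", "â": "a",
    "û": "u", "é": "e", "à": "a", "ç": "c", "ì": "i", "å": "a",
}


def legacy_style_stem(original_name: str) -> str:
    """Rebuild the visible middle part of an old on-disk attachment name."""
    # Take the bytes the name had under CP-1251 and misread them as Latin-1.
    misread = original_name.encode("cp1251").decode("latin-1")
    # Strip the accents using the small table above.
    ascii_ish = "".join(TRANSLIT.get(ch, ch) for ch in misread)
    # Spaces and dots show up as underscores in the on-disk names.
    return ascii_ish.replace(" ", "_").replace(".", "_")


if __name__ == "__main__":
    print(legacy_style_stem("собор.jpg"))         # niaidh_jpg
    print(legacy_style_stem("новый размер.JPG"))  # iiaue_dhaciadh_JPG
```

Both results match the names in the attachments folder (424_niaidh_jpg... and 13931_iiaue_dhaciadh_JPG...), minus the ID prefix and the trailing hash.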
