While working on a new multilingual project, a need arose to convert my current vBulletin database to UTF-8. I thought the process would be relatively simple, but it turned out to be a nightmare that I struggled with three days. And yes, I did try the “vCharset Converter” module, which is unmaintained, sucks bad and never works. I ended up successfully converting the database to UTF-8 and everything so far has been working great, so I decided to post a quick how-to for those who are planning to convert their vBulletin forums. My previous database charset was “latin1/latin1_swedish_ci” (MySQL default) and I was using “windows-1251″ codepage to accomodate Cyrillic characters in my forum. Here is the procedure:
- Go to vBulletin admincp, then turn your board off.
- Backup your database. I do it twice – once by stopping MySQL and copying the files to a different drive and by exporting the database via mysqldump.
- Dump your current vBulletin database without binary attachment/image tables. The reason why binary tables need to be excluded, is because you will be converting all of the current data to UTF-8 and you do not want binary data changed (images will not work). Don’t forget to add other tables by adding “-ignore-table database.table” that might have binary data into the following command (in a single line):
mysqldump -u root -p --opt --default-character-set=latin1 --ignore-table=vbulletin_db.customavatar --ignore-table=vbulletin_db.customprofilepic --ignore-table=vbulletin_db.picture --ignore-table=vbulletin_db.socialgroupicon --ignore-table=vbulletin_db.socialgrouppicture vbulletin_db > vbulletin_db.sql
mysqldump -u root -p --opt --default-character-set=latin1 vbulletin_db customavatar customprofilepic picture sigpic socialgroupicon socialgrouppicture > vbulletin_db_images.sql
The first command extracts vBulletin data without tables “customavatar”, “customprofilepic”, “picture”, “sigpic”, “socialgroupicon” and “socialgrouppicture”, while the second command extracts these specific tables into a separate file. We need two separate files, because we need to convert non-binary data into UTF-8 and the binary data needs to stay the same. Replace “vbulletin_db” with the name of your database.
- Convert the codepage from your source codepage to UTF-8:
iconv -c -f windows-1251 -t utf-8 vbulletin_db.sql > vbulletin_db_utf8.sql
- Open up both files (vbulletin_db.sql and vbulletin_db_utf8.sql) in a good text editor such as notepad2 (regular text editors do not support large files) and look for “CHARSET=latin1″ and replace all occurrences with “CHARSET=utf8″. Then look for “CHARACTER SET latin1 COLLATE latin1_bin” and replace all with “CHARACTER SET utf8 COLLATE utf8_bin”. Save the files.
- Drop your existing database and create a new database with UTF-8 charset set as default:
mysql -u root -p DROP DATABASE vbulletin_db; CREATE DATABASE vbulletin_db CHARACTER SET utf8 COLLATE utf8_general_ci;
- Dump it back into MySQL:
- If you did not see any errors, all of the data should now be there.
- Open up admincp and go to “Languages and Phrases” -> “Language Manager”. Open “Edit settings” and change your installed languages’ “HTML Character Set” to “utf-8″. Save and close. By the way, if letters look strange and screwed up, don’t worry, we will get this fixed in the next step.
- Upload the “install” folder into your forum folder that has the latest files in it. Open up your browser and go to http://yourforumdomain.com/install/upgrade.php and go through all of the screens. Once you are done, remove the install folder.
mysql -u root -p --max_allowed_packet=1024M < vbulletin_db.sql
Done :) I successfully converted this forum with over a million posts and 45 thousand users.
hello
thanks on first
I have problem when I did first ssh command
I get this error
Illegal use of option –ignore-table=.
any idea how fix it ?
thanks again
3walim: I apologize for the mistake in the command line. The correct command is –ignore-table=database.table. I already fixed the command above, so please use the updated instructions and let me know how it goes.
Nasim
Great complete instruction. I searching for 3 years.
How about pictures ?
# mysql -u root -p –max_allowed_packet=1024M < vbulletin_db_images.sql
# mysqldump -u root -p –opt –default-character-set=latin1 vbulletin_db customavatar customprofilepic picture sigpic socialgroupicon socialgrouppicture > vbulletin_db_images.sql
These didn’t work
hello
thanks for fix command
still have problem
I log to my server by putty
and after I did first command it ask me pasword
I write the password of database user name but it give me this error
mysqldump: Got error: 1045: Access denied for user ‘root’@'localhost’ (using password: YES) when trying to connect
i try with the password of root
abd get same error
any idea how can I fix it?
thanks so much
hello
I fix that by remove -p
will now I’m on step 4 and confused about what you mean by
vbulletin_db.sql
the one which we have after first command “backup without image”
or the one on the first of all?
also on step 5 you suggest using notepad
will my database size 763 MB :(
is there any another way ?
natong: Please be specific on what didn’t work.
3walim: Make sure that you are putting the whole command into the same line. Do NOT attempt to copy-paste the commands into separate lines.
iconv: unable to allocate buffer for input: Cannot allocate memory
I have this error
any idea how fix it?
thanks so much
3walim: Looks like you do not have enough RAM on your machine. Try to decrease the size of the buffer by changing “–max_allowed_packet=1024M” to something like “–max_allowed_packet=256M”
Hi MegaZ,
The IT company hosting my site recently did a move to a new server and now my forum looks funny, with weird symbols appearing in between those Chinese words.
I was thinking it might be the charset problem… Can I actually fix the problem by myself? Or must I get the IT company to do it for me (which would take probably forever)?
I installed the Vbulletin by myself and have access to my site phpAdmin. Any luck of me fixing it myself?
Sorry for the trouble and thanks for taking some time to read this.
Regards, Joey
Joey, somehow I missed your comment – sorry about that!
Check out the charset option in your vBulletin style options. It should say “UTF-8″. If it does say UTF-8 and all the characters are not being displayed correctly, then you will need to get SSH access and follow the above instructions. I would recommend NOT to do this by yourself and ask your IT guys to do it for you.
First of all , thank you very much , i tried vbcharset for 3 days and dosnot work and iam very glad to see your blog throw google
what i understand from you
1- backup databse using mysqldumper then dump data to another drive without binary tablets by using your root orders
” can i copy databse to the same drive using phpmyadmin and delete binary images”???
2-convert the codepage from your source codepage to UTF-8:
# iconv -c -f windows-1251 -t utf-8 vbulletin_db.sql > vbulletin_db_utf8.sql
3-open up both files (vbulletin_db.sql and vbulletin_db_utf8.sql) in a good text editor such as notepad2,, and change chars
so i have to download my databse 200mg * 2 files then reupload them to my site again??
4-Drop your existing database and create a new database with UTF-8 charset set as default with root !
why root and i can do it in phpmyadmin! and i drop the 2 databses or only one
5-dump databse using root
6- edit lang setting from admincp
is that right? please i need more explanation
” if you have donation link added it in your blog”
thanks
Hema,
“can i copy databse to the same drive using phpmyadmin and delete binary images”???
No, you do not want to delete binary images. The only reason why I export binary image files separate from the rest of the content, is because you do not want to be converting your binary data to UTF-8 – it will get corrupted. Export exactly the way I did it in the instructions above and everything should work perfectly.
“so i have to download my databse 200mg * 2 files then reupload them to my site again??”
Yes, you will have to, because you will be modifying the files. If I have some time, I will also provide instructions for linux boxes where you do not have to edit files manually (maybe through sed or some other tool), but for now, please do exactly like it says above.
“why root and i can do it in phpmyadmin! and i drop the 2 databses or only one”
I don’t know how permissions on your database are set up, but most of the time you will need root access to be able to drop and create databases. If you cannot drop a database and re-create it, request to do it for you from your hosting provider.
As far as a donation box – you can use the donation bar on http://forum.arbuz.com and your money will go to charity.
Please let me know if you have any other questions.
Thanks Megaz
“Gazak Allah Kol Khayr” We are Muslims :(
Now I have converted my database using Notbad2 ,editing every sinlge table thin convert table to utf-8 with notepad2
Spending 5 hours editing my tables ,but it was 100% successfull
Thanks
I don’t see your donation link in forum navbar
Hope to see direct link
Salam Alikoum
Va Aleykum Assalam Hema!
You are more than welcome. Masha’Allah! I’m always glad to help Muslim brothers. Sorry, I forgot that registration is required on forum.arbuz.com to see the donation box.
I posted the donation link in the above article.
Sincerely,
Nasim
I have an error when I imported the DB back.
ERROR 1136 (21S01) at line 179: Column count doesn’t match value count at row 1959
I am using MySQL 4.1.22
Please help.
Thank you.
John: The error “Column count doesn’t match value count” means that you are feeding either more or less data into your tables than the table can support. What I would suggest, is to open up the file, find the row 1959 and take a look at what’s going on. Another problem might be that the line is simply getting cut off – you might have wrapping turned on inside the editor.
Hello there and thank you for a very helpful guide.
I am in a very strange situation here and you might be able to help me. I ‘ve used a tool from Vb directly that was supposed to change my db to UTF8. Although it did change the character_set it didn’t change anything else. The actual language we use is Greek and that’s the info that i can get for the database (i know it’s chaos..):
character_set_client latin1
character_set_connection latin1
character_set_database latin1
character_set_filesystem binary
character_set_results latin1
character_set_server latin1
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/
collation_connection latin1_swedish_ci
collation_database latin1_swedish_ci
collation_server latin1_swedish_ci
On the dump i can see:
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES latin1 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE=’+00:00′ */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE=’NO_AUTO_VALUE_ON_ZERO’ */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;
and i also have in several places inside written:
`title` varchar(50) collate latin1_general_ci NOT NULL
Can you help me by any chance sort this mess out?
Thanks in advance,
Konstantinos
Va Aleykum Assalam Hema!
You are more than welcome. Masha’Allah! I’m always glad to help Muslim brothers. Sorry, I forgot that registration is required on forum.arbuz.com to see the donation box.
Salam Alikoum
Thank you for this great article it helped me a lot … and I think you forgot “attachment” table that have binary data too,
And I faced problem with it.
Shokran Jazilan
thanks mate for the tips, really helpful.
i went through all the steps except for 4. (Converting the codepage from your sourcepage to UTF-8.
i didn’t know what’s iconv? where and how to use it?
please excuse my ignorance as i’m new to mysql and whole thing.
please help and give me an answer as soon as possible. please
xoo339: you can get iconv for any OS from here: http://gnuwin32.sourceforge.net/packages/libiconv.htm
I personally use it on linux, but you can also get a windows binary that does the same thing.
Thanks Nasim for the fast reply.
Now that I got iconv for windows, I have no idea on how to use it.
Do I have the cmd command line or what? how to do that?
OK, I’ve installed it then went to it’s directory and found iconv.exe ina: “C:\Program Files\GnuWin32\bin”, when I click on it a command line window pups up. I typed on it the command you mentioned: (# iconv -c -f windows-1251 -t utf-8 vbulletin_db.sql > vbulletin_db_utf8.sql) but nothing happens. I typed this command in several ways like: (iconv -c -f windows-1251 -t utf-8 vbulletin_db.sql > vbulletin_db_utf8.sql), ( -c -f windows-1251 -t utf-8 vbulletin_db.sql > vbulletin_db_utf8.sql), and (-f windows-1251 -t utf-8 vbulletin_db.sql > vbulletin_db_utf8.sql), but still nothing happens.
I even changed the directory of the file vbulletin_db.sql and wrote in the command line window, but nothing happens too.
I’m running Windows 7, do you think iconv does not work on Windows 7?
xo339, you are on the right track – you need to execute iconv from DOS command prompt (just go to Start->Search and type cmd and press enter). Once the command prompt comes up, type ‘cd “C:\Program Files\GnuWin32\bin”‘ and press enter. Make sure to put your vbulletin db sql file in the directory first, then type the command as shown above. It should not return anything, but a new file should appear in the directory, called vbulletin_db_utf8.sql. That’s the file you need to import back into SQL (follow next steps).
As far as iconv not working in Windows 7, not sure – you would have to check online to see if there is a solution.
That works with running the CMD as administrator.
Going to test the conversation on my test forum.
I’ll let you know if I’ll have any trouble.
Thanks mate :)
Perfect! I’m sure everything will go smoothly, but if it doesn’t, let me know :)
You are welcome!
Hi again,
Everything went as it’s supposed to.
But a few question came to my mind:
1- Why didn’t we enable utf-8 encoding from the config.php file as I read somewhere we should do that?
2- Tables with binary data also have columns with non-binary data, is there anyway to keep only the binary columns from converting and convert the rest of the non-binary columns?
3- As I mentioned in Q.2, there are non-binary columns/data within the unconverted tables, isn’t that going to be an issue when showing those fields/data with utf8 encoding as those field still encoded with latin1? isn’t that going to mess up the content of those fields? or that is irrelevant nothing will show from those fields?
4- Why would someone want to change encoding from latin1 to utf8 when everything is working in his forum(s) and has no issue with latin1? What are the advantages of utf8 and the disadvantages of latin1?
Please excuse my ignorance. and sorry for all of my questions. I just want to understand how things work to build my knowledge the right way from such experts like yourself. :)
xo339, I’m glad things worked out. Here are the answers:
1) You should put UTF-8 encoding info into the config.php file
2) No, there is nothing you can do about tables with non-binary columns. Those would be skipped. Unless you are willing to extract specific fields from MySQL, then convert them, then assemble back the data.
3) The only case where this might be an issue, is if you have data in other languages that is written into those fields. If everything is in English within the tables that contain both binary and non-binary data, then you should be OK. My forum was mixed in Russian/English/Uzbek/Turkish languages and everything got converted without a problem.
4) Why convert? Because currently, you are putting a specific encoding on the browser-level (I was using Windows-1251). The problem, is that while one language might display correctly, other languages will not. Obviously, you cannot mix different encodings in one page. With UTF-8, everything becomes unified and standard. UTF-8 works with any language, including Arabic. Visit my forum on http://www.arbuz.com, scroll down to the very bottom of the page and change the language to Arabic. You will see that everything gets changed without a glitch and I do not have to change to Arabic encoding to display characters correctly. If I switch back to Russian, same thing happens. If I didn’t change it to UTF-8, then I would not be able to mix any languages. Switching from English to Arabic, would require me to change the encoding of the page. Anything written on any other language would not display correctly.
Hope this makes sense.
Yeah, it does make sense.
Anyway, for the first question, it works for me without setting the encoding to utf-8 in the config.php file (when I leave the line “// $config['Mysqli']['charset'] = ‘utf8′;” commented everything works just fine, but when I uncomment the line to enable it, everything gets messy and characters look strange).
In your answer number #3 you said “If everything is in English within the tables that contain both binary and non-binary data, then you should be OK.”, my forum is in Arabic (all of it), does that mean it’s not OK and might I’ll have issues with non-binary columns in the unconverted tables?
I’ve already extracted the non-binary data from the unconverted tables, and converted them to utf-8, but I couldn’t find out how to insert/put them back to those tables. I tried ‘update table set column’ but that didn’t work (maybe I’ve something wrong in the syntax or something else). I also tried ‘insert into table (column) values’, but with no success.
Any ideas on how to do that?
xoo339, if setting the encoding to utf-8 in config.php messes up your forum, then you have a problem! Did you change your vbulletin HTML character set to utf-8 as it says above? You need to change it in both places – otherwise it won’t work! But if you already have it set to utf-8 in vBulletin and setting utf-8 in config.php messes up the characters, then you definitely have a problem with your database.
As far as binary/non-binary data, it is not as simple as just running an “insert into” statement. The problem is that you need to overwrite the current row data with the new utf-8 data, keeping everything in consistent state. The only way to do this that I can think of, is to extract the current table with its primary key and the fields you want to convert into a text file, then convert the data to utf-8, then put the data back into a separate table. Then, you would need to write an insert statement with a join, so that it takes the old data from the right columns in the old table, then the new columns in the new table, then dumps everything into a new table with the same layout as the old one, matching the primary keys.
So, here is how you would do it:
1) Rename current table to table_old
2) Extract the columns that need to be converted, along with the primary key (first field) into a text file
3) Use iconv to convert the text file into utf-8
4) Insert the data back into a new temporary table
5) Create a new table with the same layout as table_old
6) Write an insert statement like: insert into new_table (select a.field1,b.field2 from temp_table as a left join table_old as b where a.primary_key = b.primary_key)
This would pull data from two different tables into a single one and all data should be preserved.
But again, the above steps are for an advanced MySQL user…
Hello
It didn’t Worked out for me..
I replaced the windows 1256 to utf8 using
#replace
command, and when I tried to insert the data back in to the database with the command:
#mysql -u root -p –max_allowed_packet=1024M < vbulletin_db.sql
it gives:
#—***////ERROR 1064 (42000) at line 7: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'utf8 @OLD_CHARACTER_utf8_CLIENT=@@CHARACTER_utf8_CLIENT */' at line 1—***////
pleas, tell me if there is anything i should do!
Thanks for ur kind help
You are a king! I still haven’t tested this, but I am very glad I found it. Will be going through it in a few days. Thank you very much.
I don’t know why but I have a problem when try to restore the new database, it says that have problems with the new DB and my user ¿?
It says that I have a problem with the permissions, but I have the permissions setting up correctly.
Any help please!!
I don’t understand why you make me convert with iconv and then I have to import the file I haven’t converted:
mysql -u root -p –max_allowed_packet=1024M < vbulletin_db.sql
Is there something I don't get in your process?
And, after doing that point I guess I'll have to import the vbulletin_db_images.sql too right?
mysql -u root -p –max_allowed_packet=1024M < vbulletin_db_images.sql
Thanks
Marco
To make the replacement with sed you can do the following:
sed -i.bak -e ‘s/CHARSET=latin1/CHARSET=utf8/g’ -e ‘s/CHARACTER SET latin1 COLLATE latin1_bin/CHARACTER SET utf8 COLLATE utf8_bin/g’ vbulletin_db_utf8.sql
sed -i.bak -e ‘s/CHARSET=latin1/CHARSET=utf8/g’ -e ‘s/CHARACTER SET latin1 COLLATE latin1_bin/CHARACTER SET utf8 COLLATE utf8_bin/g’ vbulletin_db_images.sql
all goes smooth without errors, but all i got is gibberish.
Hello
It didn’t Worked out for me..
I replaced the windows 1256 to utf8 using
#replace
command, and when I tried to insert the data back in to the database with the command:
#mysql -u root -p –max_allowed_packet=1024M < vbulletin_db.sql
it gives:
#—***////ERROR 1064 (42000) at line 7: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'utf8 @OLD_CHARACTER_utf8_CLIENT=@@CHARACTER_utf8_CLIENT */' at line 1—***////
pleas, tell me if there is anything i should do!
Thanks for ur kind help
thank you very much….
valuable article, and it assist me to solve my problems that i struggle with for 7 days nearly