As other web developers in China (and other non-latin language locations) probably know, UTF-8 is something that we need for Chinese to work correctly, however despite asking for UTF8 in the table structure, it doesn’t mean that MySQL or PHP is going to accede to our wishes.
Here is how we at Computer Solutions cope with forcing MySQL to UTF-8, despite its unwillingness!
First off, we need to configure the MySQL config via the my.cnf file to force the defaults to utf-8, instead of latin1.
Most systems my.cnf should be in either /etc/my.cnf or /etc/mysql/my.cnf (assuming *nix)
Make sure the following lines are added under mysqld:
[mysqld] default-character-set=utf8 character-set-server=utf8 collation-server=utf8_general_ci
Restart mysql (debian or similar init.d should be /etc/init.d/mysql restart)
Lets check that worked -
mysql -u root -p show variables like "%character%";
If you see results like this:
show variables like "%character%";+--------------------------+----------------------------+| Variable_name | Value |+--------------------------+----------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database | utf8 || character_set_filesystem | binary || character_set_results | utf8 || character_set_server | utf8 || character_set_system | utf8 || character_sets_dir | /usr/share/mysql/charsets/ |+--------------------------+----------------------------+8 rows in set (0.00 sec)
Then it worked. If it doesn’t and you see something similar to below
mysql> show variables like "%character%";+--------------------------+----------------------------+| Variable_name | Value |+--------------------------+----------------------------+| character_set_client | latin1 || character_set_connection | latin1 || character_set_database | utf8 || character_set_filesystem | binary || character_set_results | latin1 || character_set_server | utf8 || character_set_system | utf8 || character_sets_dir | /usr/share/mysql/charsets/ |+--------------------------+----------------------------+8 rows in set (0.00 sec)
Then we need to re-edit /etc/my.cnf
Add this under the [mysqld] lines we edited earlier.
skip-character-set-client-handshake
then restart mysql (/etc/init.d/mysql restart ) and recheck.
That forces MySQL to *use* what we <expletive deleted> well told it too!
That unfortunately isn’t the end of the story. If like us, you’ve been merrily using UTF-8 data inside Latin1 encoded tables accidentally, you’ll still need to export your data in the correct format, and reimport it.
This can be done as follows:
Use MySQLDump with an explicit encoding passed to it:
<span class="ident">mysqldump</span> <span class="punct">--</span><span class="ident">user</span><span class="punct">=</span><span class="ident">username</span> <span class="punct">--</span><span class="ident">password</span><span class="punct">=</span><span class="ident">password</span> <span class="punct">--</span><span class="ident">default</span><span class="punct">-</span><span class="ident">character</span><span class="punct">-</span><span class="ident">set</span><span class="punct">=</span><span class="ident">latin1 \ </span> <span class="punct">--</span><span class="ident">skip</span><span class="punct">-</span><span class="ident">set</span><span class="punct">-</span><span class="ident">charset</span> <span class="ident">myscrewedupdb</span> <span class="punct">></span> <span class="ident">dump</span><span class="punct">.</span><span class="ident">sql</span>
This will give us a dump of the data in the correct format, albeit with the wrong encoding specified in the file. You’ll need to edit the file to change latin1 to utf8 eg
<span class="ident">sed</span> <span class="punct">-</span><span class="ident">i</span> <span class="punct">"</span><span class="punct">"</span> <span class="punct">'</span><span class="string">s/latin1/utf8/g</span><span class="punct">'</span> <span class="ident">dump</span><span class="punct">.</span><span class="ident">sql</span>
..and then re-import the data (I *strongly* suggest you do this to a NEW db for testing, as you DON’T want to lose data now, do you!)
<span class="ident">mysql</span> <span class="punct">--</span><span class="ident">user</span><span class="punct">=</span><span class="ident">username</span> <span class="punct">--</span><span class="ident">password</span><span class="punct">=</span><span class="ident">password</span> <span class="punct">--</span><span class="ident">default</span><span class="punct">-</span><span class="ident">character</span><span class="punct">-</span><span class="ident">set</span><span class="punct">=</span><span class="ident">utf8</span> \ <span class="ident">newtestdbname</span> <span class="punct"><</span> <span class="ident">dump</span><span class="punct">.</span><span class="ident">sql</span>
With a bit of luck, your data should be intact, and you should now be basking in the glory of UTF-8′ness, ready for the 20th Century etc!
2 Comments to “UTF-8, MySQL woes and how to force it to work!”
Post comment
Archives
- December 2011
- November 2011
- October 2011
- September 2011
- July 2011
- May 2011
- April 2011
- March 2011
- February 2011
- January 2011
- December 2010
- November 2010
- October 2010
- September 2010
- August 2010
- July 2010
- June 2010
- May 2010
- April 2010
- March 2010
- February 2010
- January 2010
- December 2009
- November 2009
- October 2009
- May 2009
- April 2009
- March 2009
- February 2009
- January 2009
- December 2008
- November 2008
- October 2008
- September 2008
Categories
- Apple
- Badges
- BMW
- China Related
- Cool Hunting
- Exploits
- Firmware
- Food
- General Talk
- government
- IP Cam
- iPhone
- Lasers
- legislation
- MODx
- MySQL
- requirements
- Service Issues
- Tao Bao
- Technical Mumbo Jumbo
- Things that will get me censored
- Travel
- Uncategorized
- Useful Info
Most Popular Posts
- Samsung N310 (Samsung Go) Hackintosh Installation on Snow Leopard (9996)
- BMW Keys and Transponders E36 E38 E46 etc (EWS2) (7283)
- eFrontWPI - Wordpress integration Plugin for eFront (7148)
- IPCam Hacking - pt#2 (6539)
- IP Cam Hacking – pt#5 (4410)
Tags
Recent Comments
-
tryphon: It helped me to fix mine. I used a pair of pliers like you did and it worked fine. I drink a coffee typing...
-
mark: I have a ms10105 v4.1 moshisoft board and here is the pinout: 1 y stepper a (yellow) 2 y stepper a (white) 3...
-
Lawrence Sheed: Haven’t taken a deep look yet, probably next month can check it out. There are people who are...
-
mark: Yes…that moshi software is crap. I used the corel draw plugin for awhile but it only works about 20% of...
-
Kunlun: I tried to get my motorbike lesson after my car driving lesson, they answered me that I needed to wait 1...
Recent Trackbacks
- SISTEMAS O.R.P: Recuperar una cámara Zaapa CIP-RW después de un fallo de actualización
- Blog - DO Bots: Brookstone Rover AC13
- How can I stop Pop3 Brute Force attacks: need to create a regex, and add it to fail2ban Here is a guide....
- shanghailoz is our latest member! -:
- Probleem bij installatie op Samsung N310:



ICP Permit
Webmail
Newsletters
Support

Just to let you know, I’ve been reading your blog since we arrived to Shanghai last August. You actually helped out with a Mac issue on Shanghai Expat (thanks!) and I discovered your blog.
Keep up the great work. I like the easy going style of your writing and find the eclectic mix of topic very interesting (and often odd for a blog entitled “Computer Solutions”), but that’s what I find interesting. From making biltong to providing the “crib notes” to the driver’s test to search engine optimization, I think it’s great.
I will mention that this UTF8 post has been here for a while (hint, hint). I’m waiting for the next post. What will it be? Top 10 Sushi restaurants in Shanghai? How to get a Metro membership card? Cleanest bathrooms in Puxi? Guide to taxis? I’m waiting with anticipation…
Your fan in Jin Qiao.
Hi Richard,
I have about 1 billion projects I’m involved in, as well as normal office day to day stuff. While I intend to update the blog at least twice a month, I don’t always get a chance to!
Don’t worry, I’ll keep being eclectic.
Hopefully the glut of posts for this month will keep you satisfied for a while