WordPress, DreamHost, UTF-8 encoding and international characters

If you work on WordPress blogs or sites that are hosted on DreamHost and written in a language with accented characters, you have probably noticed character encoding problems. For instance, the new auto-save feature in WordPress, or the Ajax "add category" field in the category list, can display improperly encoded characters.

If you are seeing things like “ActualitÃ©s” where “Actualités” should appear, read on, brave shared hoster…

Your database should use UTF-8 for both its character set and its connection encoding. By default it does not: the WordPress install doesn’t specify a UTF-8 collation for the tables and fields, and DreamHost uses the MySQL default of latin1_swedish_ci. WordPress itself is set to use UTF-8 on the WordPress > Options > Reading page. Unfortunately, there is a problem with DreamHost’s setup. The MySQL client running on DreamHost’s servers has been set to (or left at) latin-1 encoding. This means that what you type into WordPress goes from UTF-8 text to latin-1 text as it passes through the MySQL client, and then into a UTF-8 database. This can cause a whole host of problems, especially for the Ajax components of WordPress 2.1, since the Ajax connector uses UTF-8 even if your blog is set to latin-1.
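To make the failure concrete, here is a minimal PHP sketch of what happens to a single word. It is only a simulation, not DreamHost's or WordPress's actual code path: it assumes the bundled iconv extension is available and uses it to mimic a connection that treats every incoming byte as one latin-1 character and converts it to UTF-8 again.

```php
<?php
// UTF-8 bytes for "Actualités" (é is the two bytes C3 A9).
$utf8 = "Actualit\xC3\xA9s";

// Simulate the latin-1 connection: every byte is assumed to be a single
// latin-1 character and is re-encoded as UTF-8 before it is stored.
$stored = iconv('ISO-8859-1', 'UTF-8', $utf8);

echo bin2hex($utf8), "\n";   // 41637475616c6974c3a973
echo bin2hex($stored), "\n"; // 41637475616c6974c383c2a973
echo $stored, "\n";          // ActualitÃ©s
```

The two bytes of the original é end up as two separate characters, Ã and ©, which is exactly the kind of garbage shown in the example above.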

You will need to convert all old content to the correct UTF-8 encoding, but you can prevent the problem for new content by forcing the MySQL client to use UTF-8 character encoding. Open the file wp-includes/wp-db.php in your WordPress directory. Locate the following line:

$this->dbh = @mysql_connect($dbhost, $dbuser, $dbpassword);

On the next line paste this text:

$this->query("SET NAMES UTF8");

This instructs the MySQL client to use UTF-8 for data going to and from the database. Everything should now be working correctly as your UTF-8 posts go through a UTF-8 MySQL client into a UTF-8 database. It would be nice if DreamHost offered a control panel option to somehow manage the character encoding of new databases and the MySQL client.
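The old content mentioned earlier still has to be repaired separately. As a rough sketch of what such a conversion might look like for one string, the hypothetical helper below (repair_double_encoded is not a WordPress function) reverses one round of the faulty latin-1 step. It assumes the iconv extension is available and that the text was double encoded exactly once.

```php
<?php
// Hypothetical helper: undo one round of latin-1 double encoding.
function repair_double_encoded(string $text): string
{
    // Reverse the faulty step: map each UTF-8 character back to one latin-1 byte.
    // iconv returns false if a character does not fit into latin-1.
    $candidate = @iconv('UTF-8', 'ISO-8859-1', $text);

    // Accept the result only if it is valid UTF-8 itself; otherwise the
    // text was probably never double encoded, so return it unchanged.
    if ($candidate === false || !preg_match('//u', $candidate)) {
        return $text;
    }
    return $candidate;
}

echo repair_double_encoded("Actualit\xC3\x83\xC2\xA9s"), "\n"; // Actualités
```

Treat this as a starting point only, and back up the database before rewriting any rows. MySQL's latin1 is really Windows-1252, so text containing characters such as curly quotes may need 'CP1252' in place of 'ISO-8859-1'.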

Querying the database with “SET NAMES UTF8” immediately after opening your DB connection is also useful if you are developing your own PHP applications on DreamHost servers or using one-click installed software like phpBB, Gallery, Joomla/Mambo, MediaWiki or activeCollab.
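For your own PHP applications, the same idea might look like the sketch below. It uses the mysqli extension rather than the older mysql_connect() call shown above, which is an assumption on my part (the mysql_* functions were removed in PHP 7). The function name and its parameters are hypothetical placeholders.

```php
<?php
// Hypothetical helper for a custom PHP application; all arguments are placeholders.
function open_utf8_connection(string $host, string $user, string $password, string $database): mysqli
{
    $db = new mysqli($host, $user, $password, $database);
    if ($db->connect_errno) {
        throw new RuntimeException('Connection failed: ' . $db->connect_error);
    }

    // Same idea as the wp-db.php change: set the connection encoding
    // before any other query is sent.
    if (!$db->query('SET NAMES utf8')) {
        throw new RuntimeException('SET NAMES failed: ' . $db->error);
    }
    return $db;
}
```

With mysqli, calling $db->set_charset('utf8') is generally the preferred way to do this, because it also tells the client library which encoding is in use.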

11 thoughts on “WordPress, DreamHost, UTF-8 encoding and international characters”

  1. This didn’t work for me in WordPress 3.1… I am having the same issue on Dreamhost (with apostrophes and special characters in titles) and adding that code to similar place in the wp-db.php had no effect. Any ideas?

  2. Thank you for this post. Even though I do not use DreamHost, the issue is the same at other hosts. Thank you again.

    PS The only thing that I would add is that there is sometimes login error language which may (or may not) make your directions a little unclear. That said, if you are careful, then it is no problem.

  3. Is there a plugin to do the same? Changing the character set in mysql may seem tricky for non programmers and would be much easier to do it from within wordpress admin.

  4. Thanks. This was driving me nuts. In my case, it wasn’t the wordpress that wouldn’t take the accented character (that seems to have been fixed in the last two years), but another database I was using. I could put in the accented character and see it in phpmyadmin, I could put an accented character into a post in my wordpress database, I could display the data in a wordpress blog on another machine, but I could not see the accent in my dreamhost wordpress page.

  5. Thanks very helpful information explained well. I had to dig around the wordpress site a while ago ..

    I originally had my site on dreamhost and now moved it over to textdrive and noticed the same thing.

    It would be nice if WordPress would put this into their codebase somehow, as the Fantastico installer and other automated updaters will overwrite the wp-db.php file.

