SLUG Mailing List Archives

Re: [coders] Converting a UTF-8 string to a wchar_t (in C)

On 14/12/2006, at 1:15am, Andre Pang wrote:

I have a C string (char*) that's encoded in UTF-8. I'd like to convert this to a wide string (wchar_t*). I've done plenty of reading about mbstowcs(3), iconv(3) and friends, and from what I understand, I have two options:

1. First, setlocale() to some bogus UTF-8 locale (such as "en_US.UTF-8"), and then use mbstowcs() to perform the conversion.

2. Use the stupendously painful iconv() interface with an iconv_t from "UTF-8" to "WCHAR_T".

3. Just write your own version of it, or copy from the standard. Duplicating it may be kind of gross, but it's probably only ~15-20 lines, and if UTF-8 -> UCS-4 is all you need then using a library may not be worth it.
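
For what it's worth, here is a rough sketch of what option 3 might look like. The names utf8_decode_one() and utf8_to_wcs() are made up for illustration and are not from any library. It assumes wchar_t is wide enough to hold a full UCS-4 code point (true with glibc, not on platforms with a 16-bit wchar_t), and it comes out a bit longer than 15-20 lines because it also rejects overlong forms, surrogates and truncated sequences.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Decode one UTF-8 sequence starting at s into *cp.
 * Returns the number of bytes consumed (1-4), or 0 on malformed input. */
static size_t utf8_decode_one(const unsigned char *s, uint32_t *cp)
{
    size_t len, i;
    uint32_t c, min;

    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { c = s[0] & 0x1F; len = 2; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { c = s[0] & 0x0F; len = 3; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { c = s[0] & 0x07; len = 4; min = 0x10000; }
    else return 0; /* stray continuation byte or invalid lead byte */

    for (i = 1; i < len; i++) {
        /* Every trailing byte must look like 10xxxxxx. A terminating NUL
         * fails this check, so we never read past the end of the string. */
        if ((s[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong encodings, UTF-16 surrogates and values > U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;

    *cp = c;
    return len;
}

/* Convert a NUL-terminated UTF-8 string into dst (capacity dst_len wchar_t,
 * including the terminator). Returns the number of wide characters written,
 * not counting the terminator, or (size_t)-1 on malformed input or if dst
 * is too small. ASSUMPTION: wchar_t can hold any UCS-4 code point. */
size_t utf8_to_wcs(wchar_t *dst, size_t dst_len, const char *src)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    while (*s) {
        uint32_t cp;
        size_t used = utf8_decode_one(s, &cp);
        if (used == 0 || n + 1 >= dst_len) return (size_t)-1;
        dst[n++] = (wchar_t)cp;
        s += used;
    }
    if (dst_len == 0) return (size_t)-1;
    dst[n] = L'\0';
    return n;
}
```

Since a wide string never has more characters than the UTF-8 input has bytes, a buffer of strlen(src) + 1 wchar_t is always big enough. No locale needs to be set, which is the main attraction over the mbstowcs() route.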