- To: Andre Pang <ozone@xxxxxxxxxxxxxxxx>
- Subject: Re: [coders] Converting a UTF-8 string to a wchar_t (in C)
- From: Martin Pool <mbp@xxxxxxxxxxxxx>
- Date: Thu, 14 Dec 2006 09:17:38 +1100
- Cc: SLUG Coders <coders@xxxxxxxxxxx>
On 14/12/2006, at 1:15am, Andre Pang wrote:
I have a C string (char*) that's encoded in UTF-8. I'd like to
convert this to a wide string (wchar_t*). I've done plenty of
reading about mbstowcs(3), iconv(3) and friends, and from what I
understand, I have two options:
1. First, setlocale() to some bogus UTF-8 locale (such as
"en_US.UTF-8"), and then use mbstowcs() to perform the conversion.
2. Use the stupendously painful iconv() interface with an iconv_t
from "UTF-8" to "WCHAR_T".
3. Just write your own version of it, or copy from the standard.
Duplicating it may be kind of gross, but it's probably only ~15-20
lines, and if UTF-8 -> UCS-4 is all you need then using a library may
not be worth it.
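
To give a rough idea of what option 3 might look like, here is a minimal
sketch of a hand-written decoder. The function names (utf8_decode_one,
utf8_to_ucs4), the caller-supplied output buffer and the (size_t)-1 error
return are all my own assumptions for illustration, not anything taken
from a library. It rejects truncated sequences, overlong forms,
surrogates and values above U+10FFFF, which pushes it a bit past the
15-20 line estimate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode one UTF-8 sequence from s (n bytes available).
 * Stores the code point in *out and returns the number of bytes consumed,
 * or 0 if the sequence is truncated or malformed. */
static size_t utf8_decode_one(const unsigned char *s, size_t n, uint32_t *out)
{
    /* Smallest code point that legitimately needs a sequence of each length. */
    static const uint32_t min_for_len[5] = {0, 0, 0x80, 0x800, 0x10000};
    size_t len, i;
    uint32_t cp;

    if (n == 0) return 0;
    if (s[0] < 0x80)                { len = 1; cp = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; }
    else return 0;                  /* stray continuation or invalid lead byte */

    if (len > n) return 0;          /* sequence runs past the end of the input */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min_for_len[len]) return 0;          /* overlong encoding */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* UTF-16 surrogate */
    if (cp > 0x10FFFF) return 0;                  /* beyond Unicode range */
    *out = cp;
    return len;
}

/* Hypothetical entry point: convert NUL-terminated UTF-8 in src to UCS-4 in
 * dst, which has room for cap elements including the terminating 0.
 * Returns the number of code points written (terminator excluded),
 * or (size_t)-1 on malformed input or insufficient space. */
size_t utf8_to_ucs4(const char *src, uint32_t *dst, size_t cap)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t remaining, count = 0;

    assert(src != NULL && dst != NULL);
    remaining = strlen(src);
    while (remaining > 0) {
        uint32_t cp;
        size_t used = utf8_decode_one(s, remaining, &cp);
        /* Need room for this code point plus the terminator. */
        if (used == 0 || count + 1 >= cap) return (size_t)-1;
        dst[count++] = cp;
        s += used;
        remaining -= used;
    }
    if (cap == 0) return (size_t)-1;
    dst[count] = 0;
    return count;
}
```

Where wchar_t is 32 bits wide (glibc on Linux, for instance) the
resulting values can be copied straight into a wchar_t array. Where it
is 16 bits wide, as on Windows, code points above U+FFFF would need to
be split into surrogate pairs, which this sketch does not attempt.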
--
Martin