If you want a "UTF8 string" in which every UTF-8 byte is stored correctly in its own char ('Ö' -> [195, 0], [150, 0]), you can use the following:
// Requires: using System; using System.Text;
public static string Utf16ToUtf8(string utf16String)
{
    /*
     * Every .NET string stores its text in UTF-16, which .NET calls
     * Encoding.Unicode. Text in any other encoding exists either as a
     * byte array or as bytes stored incorrectly inside a UTF-16 string.
     *
     * UTF-8 = 1 to 4 bytes per character
     *   [100 for the ASCII 'd']
     *   [206, 186 for the Greek 'κ']
     *
     * UTF-16 = 2 or 4 bytes per character (little-endian shown)
     *   [100, 0 for the ASCII 'd']
     *   [186, 3 for the Greek 'κ']
     *
     * UTF-8 inside UTF-16 = one char per UTF-8 byte
     *   [100, 0 for the ASCII 'd']
     *   [206, 0 and 186, 0 for the Greek 'κ']
     *
     * Encoding.Convert turns a UTF-16 byte array into a UTF-8 byte
     * array. Decoding those bytes with Encoding.UTF8.GetString would
     * only give back an ordinary UTF-16 string.
     *
     * So instead we imitate UTF-16 by padding every UTF-8 byte with
     * a 0 high byte while building the string.
     */

    // Get the UTF-16 bytes and convert them to UTF-8 bytes
    byte[] utf16Bytes = Encoding.Unicode.GetBytes(utf16String);
    byte[] utf8Bytes = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes);

    // Store each UTF-8 byte in its own char. The cast zero-fills the high
    // byte and, unlike BitConverter.ToChar, does not depend on endianness.
    StringBuilder utf8String = new StringBuilder(utf8Bytes.Length);
    for (int i = 0; i < utf8Bytes.Length; i++)
    {
        utf8String.Append((char)utf8Bytes[i]);
    }

    // Return the UTF-8 bytes wrapped in a UTF-16 string
    return utf8String.ToString();
}
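As a quick check of the byte layout described above, here is a minimal, self-contained sketch. The class and the helper name `WidenUtf8Bytes` are made up for this illustration; the helper does the same widening as the method above and prints the resulting char values for 'Ö'.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf8InUtf16Demo
{
    // Hypothetical helper: stores each UTF-8 byte of the text in its own char.
    public static string WidenUtf8Bytes(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        var widened = new StringBuilder(utf8Bytes.Length);
        foreach (byte b in utf8Bytes)
        {
            widened.Append((char)b); // high byte of the char is 0
        }
        return widened.ToString();
    }

    public static void Main()
    {
        string widened = WidenUtf8Bytes("\u00D6"); // 'Ö'
        // Print the numeric value of every char in the result
        Console.WriteLine(string.Join(", ", widened.Select(c => (int)c)));
        // prints "195, 150"
    }
}
```

Note that the result is no longer readable text in .NET; it is only useful for handing to code that expects this exact layout.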
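In my case the DLL also expected a UTF8 string, but unfortunately each UTF-8 byte had to be read as an ANSI character and stored as that character's UTF-16 value ('Ö' -> [195, 0], [19, 32]). So the ANSI '–' (byte 150 in Windows-1252) has to become the UTF-16 '–' (8211, i.e. U+2013, stored as [19, 32]). If that is your situation too, you can use the following code instead: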
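// Requires: using System.Text;
public static string Utf16ToUtf8(string utf16String)
{
    // Get the UTF-16 bytes and convert them to UTF-8 bytes
    byte[] utf16Bytes = Encoding.Unicode.GetBytes(utf16String);
    byte[] utf8Bytes = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes);

    // Return the UTF-8 bytes decoded as an ANSI string.
    // Note: Encoding.Default is the system ANSI code page only on .NET Framework;
    // on .NET Core and .NET 5+ it is UTF-8, which would simply undo the conversion.
    return Encoding.Default.GetString(utf8Bytes);
}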
Or the native method:
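// Requires: using System; using System.Runtime.InteropServices; using System.Text;
[DllImport("kernel32.dll")]
private static extern Int32 WideCharToMultiByte(
    UInt32 CodePage, UInt32 dwFlags,
    [MarshalAs(UnmanagedType.LPWStr)] String lpWideCharStr, Int32 cchWideChar,
    [Out, MarshalAs(UnmanagedType.LPStr)] StringBuilder lpMultiByteStr, Int32 cbMultiByte,
    IntPtr lpDefaultChar, IntPtr lpUsedDefaultChar);

public static string Utf16ToUtf8(string utf16String)
{
    UInt32 utf8CodePage = Convert.ToUInt32(Encoding.UTF8.CodePage);

    // First call: cbMultiByte = 0 only asks for the required buffer size.
    // cchWideChar = -1 treats the input as null-terminated, so the size
    // includes the terminator and matches the second call below.
    Int32 iNewDataLen = WideCharToMultiByte(utf8CodePage, 0, utf16String, -1, null, 0, IntPtr.Zero, IntPtr.Zero);

    // A size of 1 means only the terminator, i.e. an empty string
    if (iNewDataLen > 1)
    {
        StringBuilder utf8String = new StringBuilder(iNewDataLen);
        WideCharToMultiByte(utf8CodePage, 0, utf16String, -1, utf8String, utf8String.Capacity, IntPtr.Zero, IntPtr.Zero);

        // The LPStr marshaller reads the UTF-8 bytes back as an ANSI string
        return utf8String.ToString();
    }
    else
    {
        return String.Empty;
    }
}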
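To check what the ANSI variants are expected to produce, the following sketch decodes the UTF-8 bytes with an explicit Windows-1252 encoding instead of relying on the machine's default. This is an assumption about which ANSI code page is meant; the class and the helper name `ReinterpretUtf8AsAnsi` are hypothetical. On .NET Core and .NET 5+, code page 1252 is only available after registering `CodePagesEncodingProvider`; on .NET Framework that type comes from the System.Text.Encoding.CodePages package and the registration line can be dropped.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf8AsAnsiDemo
{
    // Hypothetical helper: UTF-8 bytes of the text, decoded as Windows-1252.
    public static string ReinterpretUtf8AsAnsi(string text)
    {
        // Makes code page 1252 available on .NET Core / .NET 5+
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding ansi = Encoding.GetEncoding(1252); // assumed ANSI code page

        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return ansi.GetString(utf8Bytes);
    }

    public static void Main()
    {
        string result = ReinterpretUtf8AsAnsi("\u00D6"); // 'Ö'
        // Byte 195 maps to U+00C3 (195), byte 150 maps to U+2013 (8211)
        Console.WriteLine(string.Join(", ", result.Select(c => (int)c)));
        // prints "195, 8211"
    }
}
```

Naming the code page explicitly keeps the result the same regardless of the system locale, which `Encoding.Default` and the `LPStr` marshaller do not guarantee.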
If you need it, see Utf8ToUtf16.
Hope I could be of help.