How do I use MultiByteToWideChar? Answers

【Question title】: How do I use MultiByteToWideChar?
【Posted】: 2011-10-05 07:12:39
【Question】:

I want to convert a normal string to a wstring. To do that, I am trying to use the Windows API function MultiByteToWideChar, but it does not work for me.

Here is what I have done:

string x = "This is c++ not java";
wstring Wstring;
MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , x.size() , &Wstring , 0 );

The last line produces a compile error:

'MultiByteToWideChar' : cannot convert parameter 5 from 'std::wstring *' to 'LPWSTR'

How do I fix this error?

Also, what should the value of the cchWideChar argument be? Is 0 okay?

【Comments】:

You cannot pass a pointer to a std::wstring to this function.

Tags: c++ winapi visual-c++ character-encoding

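【Solution 1】:

The function cannot take a pointer to a C++ string. It needs a pointer to a wide-character buffer of sufficient size, and you must allocate that buffer yourself.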

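string x = "This is c++ not java";
wstring Wstring;
// A UTF-8 string never converts to more UTF-16 code units than it has bytes.
Wstring.resize(x.size());
// cchWideChar is the buffer size in wide characters; passing 0 would only query the required size.
int c = MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , (int)x.size() , &Wstring[0], (int)Wstring.size() );
Wstring.resize(c); // trim to the number of wide characters actually written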

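【Comments】:

MultiByteToWideChar expects an argument of type wchar_t*. Wstring is of type std::wstring, so it cannot be passed to MultiByteToWideChar (and neither can a pointer to it). The good news is that std::wstring stores its data internally as a wchar_t*, and offers two functions for accessing this internal data: data() (used here) and c_str().
@DeadMG, note that wstring.data() returns a const wchar_t*, which according to cplusplus.com should not be modified directly (you probably know better than I do what the implications of doing so are). OTOH, with 0 as the last argument, MBTWC will not put anything in that buffer anyway...
@eran: Oops, you are quite right that the return value is const.
This won't work: wstring uses 32-bit characters, whereas Win32 uses 16-bit Unicode characters...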
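
The last comment depends on the toolchain. With the Windows compilers (MSVC, MinGW) wchar_t is 16 bits wide, so std::wstring holds UTF-16 and matches what the Win32 API expects; on most Unix-like toolchains it is 32 bits. A minimal, portable sketch for checking this on your own compiler follows. The function names are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Width in bits of one std::wstring element on the current toolchain.
constexpr std::size_t wide_char_bits()
{
    return sizeof(std::wstring::value_type) * CHAR_BIT;
}

// Number of wchar_t elements needed for one code point outside the
// Basic Multilingual Plane (for example U+1F600): a UTF-16 surrogate
// pair when wchar_t is 16 bits, a single element when it is 32 bits.
std::size_t units_for_supplementary_code_point()
{
    assert(wide_char_bits() >= 16);
    return wide_char_bits() == 16 ? 2 : 1;
}
```

If wide_char_bits() reports 16, as it does for Windows builds, a std::wstring buffer can be handed to MultiByteToWideChar directly.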

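【Solution 2】:

You must call MultiByteToWideChar twice:

The first call to MultiByteToWideChar is used to find the buffer size needed for the wide string. See Microsoft's documentation; it states:

If the function succeeds and cchWideChar is 0, the return value is the required size, in characters, for the buffer indicated by lpWideCharStr.

So, to make MultiByteToWideChar give you the required size, pass 0 as the value of the last parameter, cchWideChar. You should also pass NULL as the one before it, lpWideCharStr.
Using the buffer size from the previous step, obtain a non-const buffer large enough to hold the wide string. Pass this buffer to another call to MultiByteToWideChar. This time, the last argument should be the actual size of the buffer, not 0.

A sketchy example: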

int wchars_num = MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , -1, NULL , 0 );
wchar_t* wstr = new wchar_t[wchars_num];
MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , -1, wstr , wchars_num );
// do whatever with wstr
delete[] wstr;

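Also, note the use of -1 as the cbMultiByte argument. It tells the function that the input is null-terminated, so the terminator is converted as well: the returned size includes it and the resulting string is null-terminated, which saves you from handling that yourself.

【Comments】:

+1 for stressing the need for two calls to MultiByteToWideChar, which is essential with character set conversion functions.
@eran What is the difference between wchar_t* and LPTSTR?
@Suhail Gupta, if you compile with Unicode, it is exactly the same. In multi-byte builds, LPTSTR expands to a regular char*. Using those macros allows you to create both Unicode and non-Unicode builds. I can't think of a reason to do so these days, though, and since Unicode is now the default in VS, just use either one.
Ouch! There is no such thing as free[], and even if there were, I would never condone such code. Use an appropriately sized std::vector<wchar_t>.
@DeadMG Ouch indeed... that's why I said it was sketchy. Was in a hurry. Fixed the answer, thanks.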

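【Solution 3】:

Second question about this, this morning!

WideCharToMultiByte() and MultiByteToWideChar() are a pain to use. Each conversion requires two calls to the routines, and you have to look after allocating and freeing memory and making sure the strings are correctly terminated. You need a wrapper!

I have a convenient C++ wrapper on my blog, here, which you are welcome to use.

Here is the other question from this morning

【Comments】: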

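【Solution 4】:

You can try the solution below. I tested it and it works: it detects special characters (for example: º ç á) and runs on Windows XP, Windows 2000 with SP4 and later, and Windows 7, 8, 8.1 and 10. By using std::wstring instead of new wchar_t / delete, we reduce the risk of leaking resources, overflowing the buffer, and corrupting the heap.

dwFlags is set to MB_ERR_INVALID_CHARS so that the code also works on Windows 2000 with SP4 and later, and on Windows XP. On those versions, if this flag is not set, the function silently drops illegal code points; starting with Windows Vista it replaces them with U+FFFD instead.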

std::wstring ConvertStringToWstring(const std::string &str)
{
    if (str.empty())
    {
        return std::wstring();
    }
    int num_chars = MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS, str.c_str(), str.length(), NULL, 0);
    std::wstring wstrTo;
    if (num_chars)
    {
        wstrTo.resize(num_chars);
        if (MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS, str.c_str(), str.length(), &wstrTo[0], num_chars))
        {
            return wstrTo;
        }
    }
    return std::wstring();
}

【Comments】:

【Solution 5】:

A few common conversions:

#define WIN32_LEAN_AND_MEAN

#include <Windows.h>

#include <string>

std::string ConvertWideToANSI(const std::wstring& wstr)
{
    int count = WideCharToMultiByte(CP_ACP, 0, wstr.c_str(), wstr.length(), NULL, 0, NULL, NULL);
    std::string str(count, 0);
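    // Use the same length as the sizing call; -1 would also convert the null terminator and need count + 1 bytes.
    WideCharToMultiByte(CP_ACP, 0, wstr.c_str(), wstr.length(), &str[0], count, NULL, NULL);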
    return str;
}

std::wstring ConvertAnsiToWide(const std::string& str)
{
    int count = MultiByteToWideChar(CP_ACP, 0, str.c_str(), str.length(), NULL, 0);
    std::wstring wstr(count, 0);
    MultiByteToWideChar(CP_ACP, 0, str.c_str(), str.length(), &wstr[0], count);
    return wstr;
}

std::string ConvertWideToUtf8(const std::wstring& wstr)
{
    int count = WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), wstr.length(), NULL, 0, NULL, NULL);
    std::string str(count, 0);
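    // Use the same length as the sizing call; -1 would also convert the null terminator and need count + 1 bytes.
    WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), wstr.length(), &str[0], count, NULL, NULL);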
    return str;
}

std::wstring ConvertUtf8ToWide(const std::string& str)
{
    int count = MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str.length(), NULL, 0);
    std::wstring wstr(count, 0);
    MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str.length(), &wstr[0], count);
    return wstr;
}

【Comments】: