【Question title】: C - wcstok() wrong results
【Posted】: 2018-12-15 13:07:44
【Question description】:

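One function in my program is misbehaving. I have a text made up of sentences. In each sentence I need to find the symbols "@", "#", "%" and change them to "(at)", "<решетка>", "<percent>". I am using wcstok for this because I am working with Russian text. I ran into the following problem.

Input: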

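He was an old man who fished alone in a skiff in the Gulf Stream and he had gone eighty-four days now without taking a fish. In the first forty days a boy had been with him. But after forty days without a fish the boy's parents had told him that the old man was now definitely and finally sa@lao, which is the worst form of unlucky, and the boy had gone at their orders in another boat which caught three good fis#h the first week.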

Output:

He was an old man who fished alone in a skiff in the Gulf Stream and he had gone eighty-four days now without taking a fish. In the first forty days a boy had been with him. B(at) (at)f(at)er for(at)y d(at)ys wi(at)ho(at) (at) fish (at)he boy's p(at)ren(at)s h(at)d (at)old him (at)h(at) (at)he old m(at)n w(at)s now defini(at)ely (at)nd fin(at)lly s(at)l(at)o, which is (at)he wors(at) form of (at)nl(at)cky, (at)nd (at)he boy h(at)d gone (at) (at)heir orders in (at)no(at)her bo(at) which c(at)gh(at) (at)hree fis(at)h (at)he first(at) week.

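As you can see, it changed every letter "a" and "t" to "(at)". I don't understand why this happens. The same thing happens with Russian letters. Here are the two functions responsible for this work.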

void changeSomeSymbols(Text *text) {
    wchar_t atSymbol = L'@';
    wchar_t atString[5] = L"(at)";
    wchar_t percentSymbol = L'%';
    wchar_t percentString[10] = L"<percent>";
    wchar_t barsSymbol = L'#';
    wchar_t barsString[10] = L"<решетка>";
    for (int i = 0; i < text->textSize; i++) {
        for (int j = 0; j < text->sentences[i].sentenceSize; j++) {
            switch (text->sentences[i].symbols[j])
            {
            case L'@':
                changeSentence(&(text->sentences[i]), &atSymbol, atString);
                break;
            case L'#':
                changeSentence(&(text->sentences[i]), &barsSymbol, barsString);
                break;
            case L'%':
                changeSentence(&(text->sentences[i]), &percentSymbol, percentString);
                break;
            default:
                break;
            }
        }
    }
}

void changeSentence(Sentence *sentence, wchar_t *flagSymbol, wchar_t *insertWstr) {
    wchar_t *pwc;
    wchar_t *newWcsentence;
    wchar_t *buffer;
    int insertionSize;
    int tokenSize;
    int newSentenceSize = 0;
    insertionSize = wcslen(insertWstr);
    newWcsentence = (wchar_t*)malloc(1 * sizeof(wchar_t));
    newWcsentence[0] = L'\0';
    pwc = wcstok(sentence->symbols, flagSymbol, &buffer);
    do {
        tokenSize = wcslen(pwc);
        newWcsentence = (wchar_t*)realloc(newWcsentence, (newSentenceSize + tokenSize + 1) * sizeof(wchar_t));
        newSentenceSize += tokenSize;
        wcscat(newWcsentence, pwc);
        newWcsentence = (wchar_t*)realloc(newWcsentence, (newSentenceSize + insertionSize + 1) * sizeof(wchar_t));
        newSentenceSize += insertionSize;
        wcscat(newWcsentence, insertWstr);
        pwc = wcstok(NULL, flagSymbol, &buffer);
    } while (pwc != NULL);
    newSentenceSize -= insertionSize;
    newWcsentence = (wchar_t*)realloc(newWcsentence, (newSentenceSize) * sizeof(wchar_t));
    newWcsentence[newSentenceSize] = '\0';
    free(sentence->symbols);
    sentence->symbols = (wchar_t*)malloc((newSentenceSize + 1) * sizeof(wchar_t));
    wcscpy(sentence->symbols, newWcsentence);
    sentence->sentenceSize = newSentenceSize;
    free(pwc);
    free(newWcsentence);
}

【Question discussion】:

    Tags: c wchar-t


    【Solution 1】:

    Text and Sentence are not defined, so it is not clear what they are supposed to be. Just do it in one function.

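    #include <locale.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <wchar.h>

    /* Append src to *dst, growing the buffer; *dstlen is the length so far. */
    void realloc_and_copy(wchar_t** dst, int *dstlen, const wchar_t *src)
    {
        if(!src)
            return;
        int srclen = wcslen(src);
        *dst = realloc(*dst, (*dstlen + srclen + 1) * sizeof(wchar_t));
        if (*dstlen)
            wcscat(*dst, src);
        else
            wcscpy(*dst, src);
        *dstlen += srclen;
    }
    
    int main()
    {
        /* use the environment's locale so non-ASCII wide text can be printed */
        setlocale(LC_ALL, "");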
        const wchar_t* src = L"He was an old man who fished alone in a skiff \
    in the Gulf Stream and he had gone eighty - four days now without tak%ing a fish.\
    In the first forty days a boy had been with him.But after forty days without a fish \
    the boy’s parents had told him that the old man was now definitely and finally sa@lao, \
    which is the worst form of unlucky, and the boy had gone at their orders in another \
    boat which caught three good fis#h the first week.";
    
        wchar_t *buf = wcsdup(src);
        wchar_t *dst = NULL;
        int dstlen = 0;
    
        wchar_t *context = NULL;
        const wchar_t* delimiter = L"@#%";
        wchar_t *token = wcstok(buf, delimiter, &context);
        while(token)
        {
            const wchar_t* modify = NULL;
            int cursor = token - buf - 1;
            if (cursor >= 0)
                switch(src[cursor])
                {
                case L'@': modify = L"(at)"; break;
                case L'%': modify = L"<percent>"; break;
                case L'#': modify = L"<решетка>"; break;
                }
    
            //append modified text
            realloc_and_copy(&dst, &dstlen, modify);
    
            //append token
            realloc_and_copy(&dst, &dstlen, token);
    
            token = wcstok(NULL, delimiter, &context);
        }
    
        /* %ls is the standard conversion for a wide string in wprintf */
        wprintf(L"%ls\n", dst);
    
        free(buf);
        free(dst);
    
        return 0;
    }
    

    【Discussion】:
