【Question Title】: C string: multiple replacements within a character string
【Posted】: 2009-09-29 21:30:54
【Question】:

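Let's say I have a string:

"(aaa and bbb or (aaa or aaa or bbb))"

For simplicity, the string will always have this format: always three a's followed by a space or a ')', or three b's followed by a space or a ')'.

In C, what is the best way to replace every occurrence of "aaa" with "1" and every occurrence of "bbb" with "0"? The resulting string should look like this:

"(1 and 0 or (1 or 1 or 0))"

EDIT: let me be more specific: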

char* blah = (char *) malloc (8);
sprintf(blah, "%s", "aaa bbb");

blah = replace(blah);

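How can I write replace so that it allocates space for, and stores, the new string

"1 0"

【Comments】:

  • Are you interested in a general solution that replaces all occurrences of src with dest, where src and dest may differ in length?

Tags: c string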


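【Solution 1】:

The most efficient way is to use the POSIX regex family (regcomp/regexec). Some implementations build a proper automaton for the patterns. Another way is to run a KMP or Boyer-Moore search repeatedly, but then the string has to be scanned several times, which is less efficient. Also, what result do you want for input such as aa=1, ab=2, bb=3 on the string "aabb"?

By the way, when you implement this function, a cleaner solution is to allocate a new dynamic C string rather than modify the original string while replacing. You can implement an in-place replacement, but it would be much more complicated.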

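/* Fragment: needs <regex.h>, <stdlib.h>, <string.h>. oristr is a malloc'ed input
   string; newstr is a malloc'ed, empty buffer of at least strlen(oristr) + 1 bytes. */
regex_t r; regmatch_t match[1]; size_t last = 0;
regcomp(&r, "(aaa|bbb)", REG_EXTENDED);
/* Match offsets are relative to the pointer passed in, so search from oristr + last. */
while (regexec(&r, oristr + last, 1, match, 0) == 0) {
  const char *hit = oristr + last + match[0].rm_so;
  strncat(newstr, oristr + last, match[0].rm_so);            /* text before the match */
  /* A hash table lookup could replace this test for a larger set of patterns. */
  strcat(newstr, strncmp(hit, "aaa", 3) == 0 ? "1" : "0");   /* aaa -> 1, bbb -> 0 */
  last += match[0].rm_eo;
}
strcat(newstr, oristr + last);                               /* tail after the last match */
char *tmp = realloc(oristr, strlen(newstr) + 1);             /* +1 for the terminating NUL */
if (tmp != NULL) { oristr = tmp; strcpy(oristr, newstr); }
free(newstr); regfree(&r);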

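In a real implementation you should resize newstr dynamically, and you should keep track of the end of newstr instead of using strcat/strlen, which rescan the buffer on every call. The source code may be buggy, as I have not actually tried it, but the idea is there. This is the most efficient implementation I can think of.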
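
The advice to grow newstr dynamically and to record its end can be sketched as a small growable buffer. This is only a minimal sketch, not the answerer's code: the `strbuf` type and the `strbuf_append` function are hypothetical names introduced here for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: a growable string that remembers its own length. */
typedef struct {
    char  *data;  /* NUL-terminated contents, or NULL while empty */
    size_t len;   /* bytes in use, excluding the terminating NUL */
    size_t cap;   /* bytes allocated */
} strbuf;

/* Append n bytes from src. Returns 0 on success, -1 if out of memory. */
static int strbuf_append(strbuf *b, const char *src, size_t n)
{
    if (b->len + n + 1 > b->cap) {
        size_t cap = b->cap ? b->cap : 16;
        while (cap < b->len + n + 1)
            cap *= 2;                  /* doubling keeps appends amortised O(1) */
        char *p = realloc(b->data, cap);
        if (p == NULL)
            return -1;                 /* old buffer is still valid */
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, src, n);  /* write at the recorded end, no rescanning */
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}
```

Starting from `strbuf b = {0};`, each stretch of unchanged text and each replacement would be appended with `strbuf_append`, and the caller would eventually `free(b.data)`.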

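【Comments】:

  • OK, I edited the question so that it allocates new space. I'm not sure what you mean by the input aa=1, ab=2, bb=3.
  • I mean replacing aa with 1, ab with 2, and bb with 3. On "aabb" you could get either "13" or "a2b".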
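
The ambiguity raised in this comment depends on the matching policy. As a rough illustration, a single left-to-right scan in which the first matching rule wins turns "aabb" into "13", whereas replacing "ab" in a pass of its own first would give "a2b". The rule table and the name `replace_leftmost` below are assumptions made for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical rule table taken from the comment: aa=1, ab=2, bb=3. */
struct rule { const char *from; const char *to; };
static const struct rule rules[] = { {"aa", "1"}, {"ab", "2"}, {"bb", "3"} };

/* Single left-to-right scan; the first rule that matches at the current
   position wins. Returns a malloc'ed string, or NULL if out of memory.
   Assumes no replacement is longer than its pattern, so strlen(s) + 1
   bytes are always enough. */
char *replace_leftmost(const char *s)
{
    const size_t n = sizeof rules / sizeof rules[0];
    char *out = malloc(strlen(s) + 1);
    char *dst = out;
    if (out == NULL)
        return NULL;
    while (*s != '\0') {
        size_t i;
        for (i = 0; i < n; i++) {
            size_t flen = strlen(rules[i].from);
            if (strncmp(s, rules[i].from, flen) == 0) {
                size_t tlen = strlen(rules[i].to);
                memcpy(dst, rules[i].to, tlen);
                dst += tlen;
                s += flen;           /* consume the whole pattern */
                break;
            }
        }
        if (i == n)                  /* no rule matched here: copy one character */
            *dst++ = *s++;
    }
    *dst = '\0';
    return out;
}
```

With this policy, `replace_leftmost("aabb")` yields "13"; the caller frees the result.
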
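【Solution 2】:

For this specific case, a simple while/for loop will do the trick. It looks like a homework question, though, so I won't write it out for you. If more general string manipulation were needed, I would use pcre.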
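
For reference, one way such a loop might look is sketched below. It is not the answerer's code; the name `replace_tokens` is hypothetical, and the sketch relies on the question's guarantee that each replacement is shorter than its pattern, so the output never outgrows the input.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly malloc'ed copy of s with "aaa" -> "1" and "bbb" -> "0",
   or NULL if out of memory. The caller frees the result. */
char *replace_tokens(const char *s)
{
    char *out = malloc(strlen(s) + 1);  /* output is never longer than input */
    char *dst = out;
    if (out == NULL)
        return NULL;
    while (*s != '\0') {
        if (strncmp(s, "aaa", 3) == 0) {
            *dst++ = '1';
            s += 3;
        } else if (strncmp(s, "bbb", 3) == 0) {
            *dst++ = '0';
            s += 3;
        } else {
            *dst++ = *s++;              /* any other character is copied as is */
        }
    }
    *dst = '\0';
    return out;
}
```

Because the input is taken as `const char *` and the result is freshly allocated, a caller in the style of the question would write `char *fresh = replace_tokens(blah); free(blah); blah = fresh;`.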

【Comments】:

    【Solution 3】:

    Here is one without a memory limitation:

    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    
    /* ---------------------------------------------------------------------------
      Name       : replace - Search & replace a substring by another one. 
      Creation   : Thierry Husson, Sept 2010
      Parameters :
          str    : Big string where we search
          oldstr : Substring we are looking for
          newstr : Substring we want to replace with
          count  : Optional pointer to int (input / output value). NULL to ignore.  
                   Input:  Maximum replacements to be done. NULL or < 1 to do all.
                   Output: Number of replacements done or -1 if not enough memory.
      Returns    : Pointer to the new string or NULL if error.
      Notes      : 
         - Case sensitive - Otherwise, replace functions "strstr" by "strcasestr"
         - Always allocate memory for the result.
    --------------------------------------------------------------------------- */
    char* replace(const char *str, const char *oldstr, const char *newstr, int *count)
    {
       const char *tmp = str;
       char *result;
       int   found = 0;
       int   length, reslen;
       int   oldlen = strlen(oldstr);
       int   newlen = strlen(newstr);
       int   limit = (count != NULL && *count > 0) ? *count : -1; 
    
       tmp = str;
       while ((tmp = strstr(tmp, oldstr)) != NULL && found != limit)
          found++, tmp += oldlen;
    
       length = strlen(str) + found * (newlen - oldlen);
       if ( (result = (char *)malloc(length+1)) == NULL) {
          fprintf(stderr, "Not enough memory\n");
          found = -1;
       } else {
          tmp = str;
          limit = found; /* Countdown */
          reslen = 0; /* length of current result */ 
          /* Replace each old string found with new string  */
          while ((limit-- > 0) && (tmp = strstr(tmp, oldstr)) != NULL) {
             length = (tmp - str); /* Number of chars to keep untouched */
             strncpy(result + reslen, str, length); /* Original part kept */ 
             strcpy(result + (reslen += length), newstr); /* Insert new string */
             reslen += newlen;
             tmp += oldlen;
             str = tmp;
          }
          strcpy(result + reslen, str); /* Copies last part and ending nul char */
       }
       if (count != NULL) *count = found;
       return result;
    }
    
    
    /* ---------------------------------------------------------------------------
       Samples
    --------------------------------------------------------------------------- */
    int main(void)
    {
       char *str, *str2;
       int rpl;
    
       /* ---------------------------------------------------------------------- */
       /* Simple sample */
       rpl = 0; /* Unlimited replacements */
       str = replace("Hello World!", "World", "Canada", &rpl);
       printf("Replacements: %d\tResult: [%s]\n\n", rpl, str);
       /* Replacements: 1        Result: [Hello Canada!] */
       free(str);
    
       /* ---------------------------------------------------------------------- */
       /* Sample with dynamic memory to clean */
       rpl = 0; /* Unlimited replacements */
       str = strdup("abcdef");
       if ( (str2 = replace(str, "cd", "1234", &rpl)) != NULL ) {
          free(str);
          str = str2;
       }
       printf("Replacements: %d\tResult: [%s]\n\n", rpl, str);
       /* Replacements: 1        Result: [ab1234ef] */
       free(str);
    
       /* ---------------------------------------------------------------------- */
       /* Unlimited replacements - Case sensitive & Smaller result */
       str = replace("XXXHello XXXX world XX salut xxx monde!XXX", "XXX", "-",NULL);
       printf("Result: [%s]\n\n", str);
       /* Result: [-Hello -X world XX salut xxx monde!-] */
       free(str);
    
       /* ---------------------------------------------------------------------- */
       rpl = 3; /* Limited replacements */
       str = replace("AAAAAA", "A", "*", &rpl);
       printf("Replacements: %d\tResult: [%s]\n\n", rpl, str);
       /* Replacements: 3        Result: [***AAA] */
       free(str);
    
      return 0;
    }
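
A function of this shape handles one pattern per call, so the string from the question needs two passes with the intermediate result freed in between. The sketch below shows that chaining. To stay self-contained it uses `replace_all`, a simplified hypothetical stand-in for the `replace()` above (no `count` argument), and `replace_both` is likewise a name made up for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for replace(): substitutes every occurrence.
   Returns a malloc'ed string, or NULL on error. */
static char *replace_all(const char *str, const char *oldstr, const char *newstr)
{
    size_t oldlen = strlen(oldstr), newlen = strlen(newstr), found = 0;
    const char *p;
    char *result, *dst;

    if (oldlen == 0)
        return NULL;                     /* an empty pattern would never advance */
    for (p = str; (p = strstr(p, oldstr)) != NULL; p += oldlen)
        found++;
    result = malloc(strlen(str) - found * oldlen + found * newlen + 1);
    if (result == NULL)
        return NULL;
    dst = result;
    while ((p = strstr(str, oldstr)) != NULL) {
        size_t keep = (size_t)(p - str);
        memcpy(dst, str, keep);          /* untouched text before the match */
        dst += keep;
        memcpy(dst, newstr, newlen);     /* the replacement */
        dst += newlen;
        str = p + oldlen;
    }
    strcpy(dst, str);                    /* tail plus terminating NUL */
    return result;
}

/* Two passes, one per pattern; the intermediate string is freed. */
char *replace_both(const char *s)
{
    char *step1 = replace_all(s, "aaa", "1");
    char *step2;
    if (step1 == NULL)
        return NULL;
    step2 = replace_all(step1, "bbb", "0");
    free(step1);
    return step2;
}
```

Sequential passes are safe here because neither replacement can create or destroy an occurrence of the other pattern; with overlapping patterns the order of the passes would change the result.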
    

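    【Comments】:

      【Solution 4】:

      This is in no way the world's most elegant solution. It also assumes that the resulting string is always shorter than the original, and the conversions are hard-coded, but hopefully it points you more or less in the right direction or gives you an idea to start from: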

      /* Needs <stdlib.h> and <string.h>. The result is written back into string. */
      char* replace( char *string ) {
          const char *src = string;
          char *buffer = malloc( strlen( string ) + 1 );  /* result is never longer */
          size_t length = 0;
          if ( buffer == NULL )
              return NULL;
          const char *aaa = strstr( src, "aaa" );
          const char *bbb = strstr( src, "bbb" );
          while ( aaa || bbb ) {
              /* handle whichever pattern occurs first */
              int useA = aaa && ( !bbb || aaa < bbb );
              const char *hit = useA ? aaa : bbb;
              size_t startToHere = (size_t)( hit - src );
              memcpy( buffer + length, src, startToHere );  /* text before the match */
              length += startToHere;
              buffer[length++] = useA ? '1' : '0';
              src = hit + 3;                                /* skip the matched pattern */
              aaa = strstr( src, "aaa" );
              bbb = strstr( src, "bbb" );
          }
          strcpy( buffer + length, src );  /* remaining tail plus the NUL */
          strcpy( string, buffer );        /* fits: the result is no longer than the input */
          free( buffer );
      
          return string;
      }
      

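      Disclaimer: I haven't even tested this, but it should at least lead in the direction of what you want.

      【Comments】:

        【Solution 5】:

        This is a job for an FSM (finite-state machine)!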

        #include <assert.h>
        #include <stdio.h>
        #include <string.h>
        
        /*
        //     | 0          | 1             | 2              | 3             | 4              |
        // ----+------------+---------------+----------------+---------------+----------------+
        // 'a' | 1          | 2             | ('1') 0        | ('b') 1       | ('bb') 1       |
        // 'b' | 3          | ('a') 3       | ('aa') 3       | 4             | ('0') 0        |
        // NUL | (NUL) halt | ('a'NUL) halt | ('aa'NUL) halt | ('b'NUL) halt | ('bb'NUL) halt |
        // (*) | (*) 0      | ('a'*) 0      | ('aa'*) 0      | ('b'*) 0      | ('bb'*) 0      |
        */
        
        void chg_data(char *src) {
          char *dst, ch;
          int state = 0;
          dst = src;
          for (;;) {
            ch = *src++;
            if (ch == 'a' && state == 0) {state=1;}
            else if (ch == 'a' && state == 1) {state=2;}
            else if (ch == 'a' && state == 2) {state=0; *dst++='1';}
            else if (ch == 'a' && state == 3) {state=1; *dst++='b';}
            else if (ch == 'a' && state == 4) {state=1; *dst++='b'; *dst++='b';}
            else if (ch == 'b' && state == 0) {state=3;}
            else if (ch == 'b' && state == 1) {state=3; *dst++='a';}
            else if (ch == 'b' && state == 2) {state=3; *dst++='a'; *dst++='a';}
            else if (ch == 'b' && state == 3) {state=4;}
            else if (ch == 'b' && state == 4) {state=0; *dst++='0';}
            else if (ch == '\0' && state == 0) {*dst++='\0'; break;}
            else if (ch == '\0' && state == 1) {*dst++='a'; *dst++='\0'; break;}
            else if (ch == '\0' && state == 2) {*dst++='a'; *dst++='a'; *dst++='\0'; break;}
            else if (ch == '\0' && state == 3) {*dst++='b'; *dst++='\0'; break;}
            else if (ch == '\0' && state == 4) {*dst++='b'; *dst++='b'; *dst++='\0'; break;}
            else if (state == 0) {state=0; *dst++=ch;}
            else if (state == 1) {state=0; *dst++='a'; *dst++=ch;}
            else if (state == 2) {state=0; *dst++='a'; *dst++='a'; *dst++=ch;}
            else if (state == 3) {state=0; *dst++='b'; *dst++=ch;}
            else if (state == 4) {state=0; *dst++='b'; *dst++='b'; *dst++=ch;}
            else assert(0 && "this didn't happen!");
          }
        }
        
        int main(void) {
          char data[] = "(aaa and bbb or (aaa or aaa or bbb))";
          printf("Before: %s\n", data);
          chg_data(data);
          printf(" After: %s\n", data);
          return 0;
        }
        

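        【Comments】:

          【Solution 6】:

          You could try a loop for each replacement using the functions std::find and std::replace. More information is available in the std::string documentation.

          【Comments】:

          • This is a C question, not C++.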