Pointers and string parsing in C: answers

【Question title】: Pointers and string parsing in C
【Posted】: 2011-03-08 20:23:10
【Question】:

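I was wondering if someone could explain how pointers and string parsing work. I know I can do something like the following in a loop, but I still don't really follow how it works.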

  for (a = str;  * a;  a++) ...

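For example, I'm trying to get the last integer from a string. Say I have the string const char *str = "some string here 100 2000";

Using the method above, and given that the last integer (2000) may vary, how can I parse the string and get that last integer?

Thanks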

【Comments】:

Thanks, everyone! It's all clear now.

Tags: c pointers string-parsing

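【Answer 1】:

The loop you showed simply walks over every character (a C string is a 0-terminated array of 1-byte chars, and str points to its first element). For parsing, you should use sscanf or, better, C++'s strings and string streams.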
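To make the sscanf suggestion concrete, here is a minimal sketch in plain C. The helper name last_int_sscanf is made up for illustration, and it assumes tokens are separated by whitespace. It uses %n to learn how many characters each call consumed, so the pointer can advance token by token. Note that %d gives no overflow checking, and a token like "100abc" would be read as 100.

```c
#include <stdio.h>

/* Hypothetical helper (not from the thread): stores the last integer token
   of s in *out and returns 1, or returns 0 if s contains no integer. */
int last_int_sscanf(const char *s, int *out)
{
    int found = 0;

    while (*s != '\0') {
        int value = 0;
        int used = 0;   /* number of characters consumed by this step */

        if (sscanf(s, "%d%n", &value, &used) == 1) {
            /* An integer starts here: remember it and move past it. */
            *out = value;
            found = 1;
        } else if (sscanf(s, "%*s%n", &used) == EOF || used == 0) {
            /* Only whitespace was left, so there is nothing more to skip. */
            break;
        }
        /* Otherwise %*s skipped one non-numeric word of length used. */
        s += used;
    }
    return found;
}
```

Called on "some string here 100 2000", this should leave 2000 in *out and return 1.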

【Comments】:

【Answer 2】:

for (a = str; * a; a++) ...

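This works by starting a pointer a at the beginning of the string and incrementing it at each step, until dereferencing a yields a value that implicitly converts to false.

Essentially, you walk the array until you reach the NUL terminator (\0) at the end of the string, because the NUL terminator implicitly converts to false and no other character does.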
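As a small illustration of that loop, here is a sketch that counts the characters in a string the same way. The function name count_chars is invented for this example; in real code you would just call strlen.

```c
#include <stddef.h>

/* Hypothetical helper: counts the characters before the terminating '\0'. */
size_t count_chars(const char *str)
{
    const char *a;
    size_t n = 0;

    /* *a is nonzero (true) for every character except the final '\0'. */
    for (a = str; *a; a++) {
        n++;
    }
    return n;
}
```

When the loop exits, a points at the terminator itself, which is why it can serve as the starting point for a backward search.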

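"Using the method above, and given that the last integer (2000) may vary, how can I parse the string and get that last integer?"

You'll want to find the last space before the \0, then call a function to convert the remaining characters to an integer. See strtol.

Consider this approach:

Find the end of the string (using that loop).
Search backward for a space.
Use that position to call strtol.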


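// Assumes str contains at least one space; otherwise the while loop runs past the start of the string.
for (a = str; *a; a++);  // Find the end.
while (*a != ' ') a--;   // Move back to the space.
a++;  // Move one past the space.
int result = strtol(a, NULL, 10);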

Alternatively, just keep track of where the last token starts:

const char* start = str;
for (a = str; *a; a++) {     // Until you hit the end of the string.
  if (*a == ' ') start = a;  // New token, reassign start.
}
int result = strtol(start, NULL, 10);

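The advantage of this version is that the string does not need to contain a space.

【Comments】:

This code is broken if the string contains no spaces. It will loop back past the beginning of the string, possibly into invalid addresses, in which case it will crash.
If str is properly null-terminated, a = str + strlen(str) points to the byte just past the last byte of the string (the null byte); I think it's pretty much the same as the for loop, but more readable. Also, you could use isspace instead of *a != ' '.
@R..: True, but given how the question is worded, I think it's a safe assumption.
@ShinTakezou: Also true :) I considered using strlen, but the OP said "using the method above", so I did... OTOH, isspace might be clearer.
@R..: Thanks for the thought; I've added a version that doesn't require a space.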
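For what it's worth, the suggestions in these comments (strlen, isspace, and not walking past the start of the string) could be combined roughly as follows. This is only a sketch: the function name last_token_value is made up here, and skipping trailing whitespace is an extra assumption that nobody in the thread asked for.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts the last whitespace-delimited token of str.
   Returns 0 if that token does not start with a number (as strtol does). */
long last_token_value(const char *str)
{
    const char *a = str + strlen(str);   /* points at the terminating '\0' */

    /* Skip any trailing whitespace (an added assumption). */
    while (a > str && isspace((unsigned char)a[-1])) {
        a--;
    }
    /* Step back to the start of the last token, never going before str. */
    while (a > str && !isspace((unsigned char)a[-1])) {
        a--;
    }
    return strtol(a, NULL, 10);
}
```

The a > str checks are what keep the backward search inside the string when it contains no whitespace at all.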

【Answer 3】:

  for (a = str;  * a;  a++)...

is equivalent to

  a=str;
  while(*a!='\0') //'\0' is NUL, don't confuse it with NULL which is a macro
  {
      ....
      a++;
  }

【Comments】:

【Answer 4】:

You just need to implement a simple state machine with two states, for example:

#include <ctype.h>

int num = 0; // the final int value will be contained here
int state = 0; // state == 0 == not parsing int, state == 1 == parsing int

for (i = 0; i < strlen(s); ++i)
{
    if (state == 0) // if currently in state 0, i.e. not parsing int
    {
        if (isdigit(s[i])) // if we just found the first digit character of an int
        {
            num = s[i] - '0'; // discard any old int value and start accumulating new value
            state = 1; // we are now in state 1
        }
        // otherwise do nothing and remain in state 0
    }
    else // currently in state 1, i.e. parsing int
    {
        if (isdigit(s[i])) // if this is another digit character
        {
            num = num * 10 + s[i] - '0'; // continue accumulating int
            // remain in state 1...
        }
        else // no longer parsing int
        {
            state = 0; // return to state 0
        }
    }
}

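【Comments】:

Yuck :) This can be done in 3 lines of code, with a single parse instead of analyzing every character.
This is an inefficient approach; it parses every number in the string and throws away all but the last one. Also, you should call strlen() only once and save the result in a temporary variable, rather than calling it on every iteration as this code does (if the string is […], the compiler may optimize that for you).
@Tim/@Stephen: Are you familiar with the term premature optimization? The code above is written for clarity and to illustrate the concept of state in a parser (even though there are only two states in this case). The OP is a noob who needs to understand the basic concepts, not worry about micro-optimizations or writing the most concise code possible.
@BobbyShaftoe: Who said C89/C90 compatibility was required? C99 has been around for more than 10 years, and most C compilers have supported C++/C99-style comments for much longer than that. What's the problem? Is it really worth a downvote???
Well, it has an upvote. I think this is a good answer. The point about caching is interesting.

【Answer 5】:

I know this question has already been answered, but all the answers so far recreate code that is available in the standard C library. Here is my version, which makes use of strrchr():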

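#include <stdio.h>
#include <stdlib.h>  /* strtol */
#include <string.h>  /* strrchr */

int main(void)
{
    const char* input = "some string here 100 2000";
    const char* p;
    long l = 0;

    /* strrchr returns a pointer to the last ' ' in input, or NULL if there is none */
    if ((p = strrchr(input, ' ')) != NULL)
        l = strtol(p + 1, NULL, 10);

    printf("%ld\n", l);

    return 0;
}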

Output:

2000
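If you also want to know whether the last token really was a number, strtol can report where it stopped via its second argument. Below is a rough sketch of that idea; the name parse_last_long and the choice to treat a string without spaces as a single token are assumptions made for this example, not part of the answer above.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: stores the value of the text after the last space in
   *out and returns 1, or returns 0 if that text is not a valid long. */
int parse_last_long(const char *input, long *out)
{
    const char *p = strrchr(input, ' ');
    const char *token = p ? p + 1 : input;  /* no space: use the whole string */
    char *end;
    long value;

    errno = 0;
    value = strtol(token, &end, 10);
    if (end == token || *end != '\0' || errno == ERANGE) {
        return 0;   /* no digits, trailing junk, or value out of range */
    }
    *out = value;
    return 1;
}
```

With the input from the question this should return 1 and store 2000, while something like "some string" should return 0 instead of silently producing 0 as a value.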

【Comments】: