Program to count commented characters and words in a C file: answers

【Question title】: Program to count commented characters and words in a C file
【Posted】: 2020-12-18 04:43:37
【Question】:

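I have to count the characters and words in the comments of a C file, including both single-line comments and block comments. Here is what I have: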

#include <stdio.h>

#define IN = 1
#define OUT = 0

main() {
    int c, nc;
    nc = 0;
    while ((c = getchar()) != EOF) {
        if (c == '/') {
            if (getchar() == '/')
                while (getchar() != '\n')
                    ++nc;
        }  
    }
    
    if (c == '/') {
        if (getchar() == '*')
            while (getchar() != '/')
                ++nc;
    }  
    
    printf("Character Counts: %d\n", nc);
}

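It works for single-line comments (//), but it skips the block comments (/*...*/). I think it never enters the if block for block comments. Any help is much appreciated!

【Comments on the question】:

Shouldn't while (getchar() != '/') be while (getchar() != '*')?
That is what I thought, but the if block is never entered.

Tags: c count character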

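【Answer 1】:

There are multiple problems in your code:

You must specify int as the return type of the main function. The syntax in the question is obsolete.

The definitions of IN and OUT are incorrect. You should use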

  #define IN   1
  #define OUT  0

or

  enum { IN = 1, OUT = 0 };
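
To see why the form with = is a problem, it can help to print what the macro actually expands to. The following is only a small sketch: the names BAD_IN, GOOD_IN, STR, XSTR and show_expansions are made up for this illustration and do not come from the question.

```c
#include <assert.h>
#include <stdio.h>

/* Two-step stringification: XSTR(m) turns the expansion of macro m into a string. */
#define STR(x)  #x
#define XSTR(x) STR(x)

#define BAD_IN = 1   /* same shape as in the question: '=' is part of the macro body */
#define GOOD_IN 1    /* corrected form */

/* Hypothetical helper: write the expansion of each macro to `out`. */
void show_expansions(FILE *out) {
    assert(out != NULL);
    fprintf(out, "BAD_IN  expands to: %s\n", XSTR(BAD_IN));
    fprintf(out, "GOOD_IN expands to: %s\n", XSTR(GOOD_IN));
}
```

With the first definition, a statement such as state = BAD_IN; is seen by the compiler as state = = 1; which is a syntax error. The question's code happens to compile only because it never uses IN or OUT.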

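The first loop consumes all the bytes from standard input, so you are at the end of file and the test for /*...*/ comments never produces anything.
Loops such as while (getchar() != '\n') can run forever if the byte tested for is not found before the end of file.

You cannot test for // and /*...*/ comments separately, as one can hide the other: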

  //* this is a line comment that does not start a C style one

  /* this comment contains a // but stops here */ return 0;

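Also note that you should parse C strings and character constants, as they may contain // and/or /* sequences that do not start a comment.

For a complete solution, you should also handle escaped newlines. Here are some pathological examples: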

  // this is a line comment that extends \
     on multiple \
     lines (and tricks the colorizer)

  /\
  * this is a C comment, but the colorizer missed it *\
  /

The problem is not trivial to solve in the general case, but you can start with the simple cases.
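
As an aside on the escaped newlines mentioned above, one possible approach is to read input through a small filter that drops every backslash-newline pair before the comment scanner sees it. This is a rough sketch, not part of the revised program; next_spliced is a hypothetical name, and trigraphs are ignored.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the next character from `in`, skipping every
   backslash that is immediately followed by a newline (a line continuation). */
int next_spliced(FILE *in) {
    assert(in != NULL);
    for (;;) {
        int c = getc(in);
        if (c != '\\')
            return c;          /* ordinary character or EOF */
        int d = getc(in);
        if (d == '\n')
            continue;          /* drop the backslash-newline pair */
        if (d != EOF)
            ungetc(d, in);     /* not a continuation: push the lookahead back */
        return '\\';
    }
}
```

Because only one character of pushback is guaranteed by ungetc(), a scanner built on top of this helper would probably need its own one-character lookahead buffer rather than calling ungetc() again.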

Here is a modified version:

#include <stdio.h>

int main() {
    int c, cc, nc = 0;

    while ((c = getchar()) != EOF) {
        if (c == '/') {
            if ((cc = getchar()) == '/') {
                while ((c = getchar()) != EOF && c != '\n')
                    nc++;
            } else
            if (cc == '*') {
                while ((cc = getchar()) != EOF) {
                    if (cc == '*') {
                        if ((cc = getchar()) == '/')
                            break;
                        ungetc(cc, stdin);
                    }
                    nc++;
                }
            }
        }
    }
    printf("Character Counts: %d\n", nc);
    return 0;
}

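【Comments】:

@user3121023: GCC does not complain for me. What switches are you using? Note that GCC is not always a C compiler; various combinations of switches can make it behave in ways that do not conform to the C standard.
Thank you for the detailed explanation. I forgot to mention that the comments are assumed to be entered correctly.
@SanLuong: Those odd-looking comments are "correct", but unorthodox and confusing. The code shown does not attempt to handle the backslash-newline issues that affect comments. In theory, trigraphs would have an effect too, although C++ has decided to remove them: ??/ is the trigraph for a backslash and could affect line continuation.
@chqrlie, thank you very much for your help. Does ungetc(cc, stdin) put the character cc that was just read back into the input stream?
@SanLuong: Yes; the "ungotten" character will be returned by the next call to getchar().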
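
The getc()/ungetc() pairing discussed in these comments can be wrapped in a tiny "peek" function. This is just an illustrative sketch; peek_char is a hypothetical name, and it takes a FILE * so it can be tried on any stream (pass stdin to match the answer's code).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: look at the next character of `in` without consuming it. */
int peek_char(FILE *in) {
    assert(in != NULL);
    int c = getc(in);
    if (c != EOF)
        ungetc(c, in);   /* the next read from `in` returns c again */
    return c;
}
```

The C standard only guarantees one character of pushback, so two ungetc() calls in a row without an intervening read may fail.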

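【Answer 2】:

I added code to count the words. It works in a few cases, but it behaves strangely when I have a space after the slashes. For example, //comment... Most of the time the word count comes out 1 too low.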

#include<stdio.h>
#define IN 1
#define OUT 0

int main() {
    int c, cc, nc = 0;
    int state;
    int nw = 0;
    state = OUT;
    
    while ((c = getchar()) != EOF) {
        if (c == '/') {
            if ((cc = getchar()) == '/') {
                while ((c = getchar()) != '\n'){
                  nc++;
                  if (c == ' ' || c == '\t')
                      state = OUT;
                   else if (state == OUT){
                      state = IN;
                      nw++;
                    }        
                }
            }      
       
            else if (cc == '*') {
                while ((cc = getchar()) != EOF) {
                if (cc == ' ' || cc == '\t')
                      state = OUT;
                   else if (state == OUT){
                      state = IN;
                      nw++;
                    }
                
                    if (cc == '*') {
                        if ((cc = getchar()) == '/')
                            break;
                        ungetc(cc, stdin);
                    }

                    nc++;
                }
            }
        }
    }
     
    printf("Character Counts: %d\n", nc);
    printf("Word Counts: %d\n", nw);
    return 0;
}
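
Reading the code above, a few things look like possible sources of the wrong word count: state is set to OUT only once, so a comment can inherit IN from the previous one; in the block-comment branch the closing * is passed to the word logic before it is recognized as the end of the comment; and a newline inside a block comment is not treated as a word separator. The following sketch addresses those three points under the same simplifying assumptions (no string literals, no line continuations). The names count_char and count_comments are hypothetical, and the function reads from a FILE * so it can be tested; pass stdin to get the original behavior.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

enum { OUT = 0, IN = 1 };

/* Hypothetical helper: update the counters for one character of a comment. */
static void count_char(int c, int *state, long *nc, long *nw) {
    ++*nc;
    if (isspace(c)) {
        *state = OUT;
    } else if (*state == OUT) {
        *state = IN;          /* first character of a new word */
        ++*nw;
    }
}

/* Count characters and words inside // and block comments read from `in`.
   Comment delimiters and the newline ending a // comment are not counted. */
void count_comments(FILE *in, long *nc, long *nw) {
    assert(in != NULL && nc != NULL && nw != NULL);
    int c, cc;
    *nc = *nw = 0;
    while ((c = getc(in)) != EOF) {
        if (c != '/')
            continue;
        int state = OUT;      /* every comment starts outside a word */
        cc = getc(in);
        if (cc == '/') {
            while ((c = getc(in)) != EOF && c != '\n')
                count_char(c, &state, nc, nw);
        } else if (cc == '*') {
            while ((c = getc(in)) != EOF) {
                if (c == '*') {
                    cc = getc(in);
                    if (cc == '/')
                        break;            /* end of comment: this '*' is not counted */
                    if (cc != EOF)
                        ungetc(cc, in);   /* lone '*': keep the lookahead */
                }
                count_char(c, &state, nc, nw);
            }
        }
    }
}
```

A caller would declare long nc, nw; and call count_comments(stdin, &nc, &nw); before printing both totals.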

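【Comments】:

【Answer 3】:

Program to count commented characters and words in a C file

it skips the block comments (/*...*/)

I suggest parsing the code and looking for at least 5 states: normal, in a // comment, in a /* comment, in a "" string literal, and in a '' character constant.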

// pseudo code
while ((ch = getchar()) != EOF) {
   if ch == '/' and next == '/', process `//` comment until end-of-line
   else if ch == '/' and next == '*', process `/*` comment until `*/`
   else if ch == '"', process string until " (but not \")
   else if ch == '\'', process constant until ' (but not \')
   else process normally
}

To look at the next character, call getchar(), and call ungetc() if it is not what was expected.
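
The five states can be turned into a small state machine along these lines. This is a sketch under stated assumptions, not a complete C lexer: count_comment_chars and the enum names are hypothetical, only comment characters are counted (delimiters excluded), and line continuations and trigraphs are not handled.

```c
#include <assert.h>
#include <stdio.h>

enum state { NORMAL, LINE_COMMENT, BLOCK_COMMENT, STRING, CHAR_CONST };

/* Hypothetical helper: count the characters inside comments read from `in`. */
long count_comment_chars(FILE *in) {
    assert(in != NULL);
    enum state st = NORMAL;
    long nc = 0;
    int c, next;

    while ((c = getc(in)) != EOF) {
        switch (st) {
        case NORMAL:
            if (c == '"') {
                st = STRING;
            } else if (c == '\'') {
                st = CHAR_CONST;
            } else if (c == '/') {
                next = getc(in);
                if (next == '/')
                    st = LINE_COMMENT;
                else if (next == '*')
                    st = BLOCK_COMMENT;
                else if (next != EOF)
                    ungetc(next, in);    /* plain '/': re-read the lookahead */
            }
            break;
        case LINE_COMMENT:
            if (c == '\n')
                st = NORMAL;
            else
                nc++;
            break;
        case BLOCK_COMMENT:
            if (c == '*') {
                next = getc(in);
                if (next == '/') {
                    st = NORMAL;         /* closing delimiter, not counted */
                    break;
                }
                if (next != EOF)
                    ungetc(next, in);    /* lone '*' inside the comment */
            }
            nc++;
            break;
        case STRING:
        case CHAR_CONST:
            if (c == '\\')
                (void)getc(in);          /* skip the escaped character */
            else if (c == (st == STRING ? '"' : '\''))
                st = NORMAL;             /* matching closing quote */
            break;
        }
    }
    return nc;
}
```

Calling count_comment_chars(stdin) from main would give a character count comparable to the earlier answers, except that // and /* sequences inside string literals and character constants are no longer mistaken for comments.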

【Comments】: