C slower than Java: why? Answers

【Question title】: C slower than Java: why?
【Posted】: 2012-02-19 12:59:51
【Question】:

I quickly wrote a C program that extracts the i-th line from a set of gzipped files (each containing about 500,000 lines). Here is my C program:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <zlib.h>

/* compilation:
gcc  -o linesbyindex -Wall -O3 linesbyindex.c -lz
*/
#define MY_BUFFER_SIZE 10000000
static void extract(long int index,const char* filename)
   {
   char buffer[MY_BUFFER_SIZE];
   long int curr=1;
   gzFile in=gzopen (filename, "rb");
   if(in==NULL)
       {
       fprintf(stderr,"Cannot open \"%s\" %s.\n",filename,strerror(errno));
       exit(EXIT_FAILURE);
       }
   while(gzread(in,buffer,MY_BUFFER_SIZE)!=-1 && curr<=index)
       {
       char* p=buffer;
       while(*p!=0)
           {
           if(curr==index)
               {
               fputc(*p,stdout);
               }
           if(*p=='\n')
               {
               ++curr;
               if(curr>index) break;
               }
           p++;
           }
       }
   gzclose(in);
   if(curr<index)
       {
       fprintf(stderr,"Not enough lines in %s (%ld)\n",filename,curr);
       }
   }

int main(int argc,char** argv)
   {
   int optind=2;
   char* p2;
   long int count=0;
   if(argc<3)
       {
       fprintf(stderr,"Usage: %s (count) files...\n",argv[0]);
       return EXIT_FAILURE;
       }
   count=strtol(argv[1],&p2,10);
   if(count<1 || *p2!=0)
       {
       fprintf(stderr,"bad number %s\n",argv[1]);
       return EXIT_SUCCESS;
       }
   while(optind< argc)
       {
       extract(count,argv[optind]);
       ++optind;
       }
   return EXIT_SUCCESS;
   }

As a test, I wrote the following equivalent code in Java:

import java.io.*;
import java.util.zip.GZIPInputStream;

public class GetLineByIndex{
   private int index;

   public GetLineByIndex(int count){
       this.index=count;
   }

   private String extract(File file) throws IOException
       {
       long curr=1;
       byte buffer[]=new byte[2048];
       StringBuilder line=null;
       InputStream in=null;
       if(file.getName().toLowerCase().endsWith(".gz")){
           in= (new GZIPInputStream(new FileInputStream(file)));
       }else{
           in= (new FileInputStream(file));
       }
       int nRead=0;
       while((nRead=in.read(buffer))!=-1)
           {
           int i=0;
           while(i<nRead)
               {
               if(buffer[i]=='\n')
                   {
                   ++curr;
                   if(curr>this.index) break;
                   }
               else if(curr==this.index)
                   {
                   if(line==null) line=new StringBuilder(500);
                   line.append((char)buffer[i]);
                   }
               i++;
               }
           if(curr>this.index) break;
           }
       in.close();
       return (line==null?null:line.toString());
       }

   public static void main(String args[]) throws Exception{
       int optind=1;
       if(args.length<2){
           System.err.println("Usage: program (count) files...\n");
           return;
       }
       GetLineByIndex app=new GetLineByIndex(Integer.parseInt(args[0]));

       while(optind < args.length)
           {
           String line=app.extract(new File(args[optind]));
           if(line==null)
               {
               System.err.println("Not enough lines in "+args[optind]);
               }
           else
               {
               System.out.println(line);
               }
           ++optind;
           }
       return;
   }
}

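On the same machine, the Java program (~1'45'') is much faster than the C program (~2'15'') at fetching a large index (I ran the test several times).

How can I explain this difference?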

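【Question comments】:

Note: the buffer sizes are not equal, so the programs do not do "exactly the same" thing.
@SaniHuttunen - that is not the only reason the code is not equivalent :)
@Perception: Yes, but it was my first observation, and it seemed enough to point out that the programs are indeed not equal.
The C implementation instantiates a 10 MB array on the process stack. Does this actually run? Most processes have a stack smaller than that.
Perhaps the compiler deliberately generates bad code because of the highly unorthodox coding style. If I were a compiler, I would.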

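Tags: java c performance optimization

【Solution 1】:

The most likely explanation for the Java version being faster than the C version is that the C version is incorrect.

After fixing the C version, I obtained the following results (which contradict your claim that Java is faster than C):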

Java 1.7 -client: 65 milliseconds (after JVM warmed up)
Java 1.7 -server: 82 milliseconds (after JVM warmed up)
gcc -O3:          37 milliseconds

The task was to print the 200000th line of the file words.gz, which was generated by gzipping /usr/share/dict/words.

...
static char buffer[MY_BUFFER_SIZE];
...
ssize_t len;
while((len=gzread(in,buffer,MY_BUFFER_SIZE)) > 0  &&  curr<=index)
    {
    char* p=buffer;
    char* endp=buffer+len;
    while(p < endp)
       {
...
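
The fragment above changes two things in the loop. The outer condition now stops when gzread returns 0 (end of file) as well as -1 (error), and the inner scan is bounded by the number of bytes actually read instead of by a NUL byte, which gzread does not append. A minimal sketch of that bounded scan follows. The helper scan_chunk and the struct line_cursor are hypothetical names, not part of the original answer, and zlib is left out so the sketch stands alone:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical state holder (not in the original answer). */
typedef struct {
    long curr;   /* 1-based number of the line currently being scanned */
    long index;  /* 1-based number of the line to extract */
} line_cursor;

/* Scans exactly `len` bytes of `buf`, never looking for a NUL terminator.
   Bytes belonging to line `index` (including its '\n') are written to `out`.
   Returns 1 once the target line has been fully consumed, otherwise 0. */
static int scan_chunk(line_cursor *c, const char *buf, size_t len, FILE *out)
{
    const char *p = buf;
    const char *endp = buf + len;   /* one past the last valid byte */
    while (p < endp) {
        if (c->curr == c->index)
            fputc(*p, out);
        if (*p == '\n') {
            ++c->curr;
            if (c->curr > c->index)
                return 1;           /* target line is complete */
        }
        ++p;
    }
    return 0;                       /* chunk exhausted, caller reads more */
}
```

In the real program the return value of each gzread call would be passed as len, and the read loop would end when either gzread returns 0 or less, or scan_chunk returns 1.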

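【Comments】:

Could you say what you changed in the C version?
Thanks! When I first wrote the C code, I used gzgets rather than gzread, but I did not change the test in the buffer loop.
@Pierre: I see. If you re-run the benchmark on your computer with your files, is C now faster than Java?
Yes, I tested it at work a few hours ago. But the difference is not as large as I expected.
For the record, the standard Java gzip is inefficient.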

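【Solution 2】:

Because fputc() is not very fast, and you are adding stuff to your output file character by character.

Calling fputc_unlocked, or better, delimiting the span you want to output and calling fwrite() once, should be faster.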
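
As a rough illustration of the fwrite() suggestion, the span to output can be delimited with memchr and written in one call. This is only a sketch: write_until_newline is a hypothetical helper that does not appear in the question's code, and it assumes buf already points at the first byte of the wanted line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the original code): writes buf[0..len) up to
   and including the first '\n' with a single fwrite call. If the chunk holds
   no '\n', the whole chunk is written. Returns the number of bytes written. */
static size_t write_until_newline(const char *buf, size_t len, FILE *out)
{
    const char *nl = memchr(buf, '\n', len);
    size_t n = nl ? (size_t)(nl - buf) + 1 : len;
    return fwrite(buf, 1, n, out);
}
```

Since only one line per file is ever printed, how much this helps depends on the length of that line.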

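【Comments】:

Your answer is incorrect. The question's author did not specify the average line length in his GZIP files.
fputc() is used only for a single line, after skipping a large number of presumably similar lines. It is not the inner loop we should be looking at. The huge automatic buffer is a better candidate. Making it the same size as in Java (2048) would allow a fair comparison.

【Solution 3】:

Your programs are doing different things. I did not profile them, but from looking at your code I suspect this difference:

To build the line, you use this in Java: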

if(curr==this.index)
{
    if(line==null) line=new StringBuilder(500);
    line.append((char)buffer[i]);
}

And this in C:

if(curr==index)
{
    fputc(*p,stdout);
}

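That is, you print one character at a time to stdout. stdout is buffered by default, but I suspect it is still slower than the 500-character buffer you use in Java.

【Comments】:

【Solution 4】:

I do not have deep knowledge of which optimizations the compilers perform, but I guess that is where the difference between your programs lies. Microbenchmarks like this are very, very, very hard to get right and to make meaningful. Here is an article by Brian Goetz that elaborates on this: http://www.ibm.com/developerworks/java/library/j-jtp02225/index.html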

【Comments】:

【Solution 5】:

Very large buffers can be slower. I suggest making the buffer sizes the same in both programs, i.e. 2 or 8 KB.
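
One way to try this suggestion is to make the chunk size a run-time parameter and allocate the buffer on the heap, so the same loop can be timed with 2048, 8192 or larger sizes. The sketch below is an assumption-laden stand-in: count_lines_chunked is a hypothetical helper, and it reads a plain FILE* with fread rather than a gzFile with gzread, so that it stands alone without zlib.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not in the original code): counts '\n' bytes in `in`,
   reading `chunk_size` bytes at a time into a heap buffer.
   Returns -1 if the buffer cannot be allocated. */
static long count_lines_chunked(FILE *in, size_t chunk_size)
{
    char *buf = malloc(chunk_size);
    long lines = 0;
    size_t n;

    if (buf == NULL)
        return -1;
    /* fread reports how many bytes are valid; only those are scanned. */
    while ((n = fread(buf, 1, chunk_size, in)) > 0) {
        for (size_t i = 0; i < n; ++i)
            if (buf[i] == '\n')
                ++lines;
    }
    free(buf);
    return lines;
}
```

Calling it with different chunk_size values on the same input gives the like-for-like comparison the answer asks for, and the heap allocation also avoids placing a multi-megabyte array on the stack.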

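【Comments】:

I started out with stdio's BUFSIZ: roughly the same results.
In C (zlib), large buffers do not matter at all; in Java they do, because the data is copied several times. You could also use a memory-mapped file. Java's FileInputStream is (was?) optimized for smaller buffers (2K on Windows, 8K on Linux): in that case it uses stack allocation, otherwise it falls back to malloc/free (and some mallocs are much slower than the stack), which is why smaller buffers perform better. When it was called deeper into a recursion, I have had horrible crashes in native memory: a double SIGSEGV and the process was dead (the second one occurred while trying to write the crash log, so there was not even a crash log).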