Read file contents into a string in C++ [duplicate]: answers

【Question title】: Read file-contents into a string in C++ [duplicate]
【Posted】: 2011-02-24 03:31:00
【Question】:

Possible duplicate:
What is the best way to slurp a file into a std::string in c++?

In scripting languages such as Perl, a file can be read into a variable in one go.

    open(FILEHANDLE,$file);
    $content=<FILEHANDLE>;

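What is the most efficient way to do this in C++?

【Comments on the question】:

What do you mean by "efficient"?
Efficient = fast and without using too much memory.
Similar question: Read whole ASCII file into C++ std::string

Tags: c++ string file-io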

【Answer 1】:

Like this:

#include <fstream>
#include <iterator>
#include <string>

int main(int argc, char** argv)
{

  std::ifstream ifs("myfile.txt");
  std::string content( (std::istreambuf_iterator<char>(ifs) ),
                       (std::istreambuf_iterator<char>()    ) );

  return 0;
}

The statement

  std::string content( (std::istreambuf_iterator<char>(ifs) ),
                       (std::istreambuf_iterator<char>()    ) );

can be split into

std::string content;
content.assign( (std::istreambuf_iterator<char>(ifs) ),
                (std::istreambuf_iterator<char>()    ) );

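This is useful if you just want to overwrite the value of an existing std::string variable.

【Comments】:

+1 Very idiomatic C++. On Linux with gcc 4.4 the resulting system calls are quite efficient: the file is read 8k at a time.
If the file size is known, you can call std::string::reserve before reading the file to allocate the space up front. That should speed things up, since reallocating memory for the string wastes a lot of time.
+1, but why do the iterators have to be wrapped in parentheses? They look harmless, but it doesn't compile without them.
Qix: this is the "classic" C++ parsing problem known as the Most Vexing Parse: en.wikipedia.org/wiki/Most_vexing_parse
I tried the content.assign( (std::istreambuf_iterator(ifs) ), statement above. It read the whole file fine, but when I display the content it shows invalid characters at the beginning. Why?? Help. 'code' string content; ifstream myfile("textFile.txt"); content.assign( (istreambuf_iterator(myfile) ), (istreambuf_iterator() ) ); cout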
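
The reserve suggestion from the comments could look roughly like the sketch below. The helper name slurp_with_reserve and its signature are made up for illustration. It assumes the file is opened in binary mode and treats the end position only as a capacity hint, so a wrong hint costs speed but not correctness. It copies through std::back_inserter because some library implementations build a temporary string inside assign() for input iterators, which would bypass the reserved capacity. Whether this is actually faster should be measured.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper (not from the answer): read the whole file at `path`.
// Returns an empty string if the file cannot be opened.
std::string slurp_with_reserve(const std::string& path)
{
    std::ifstream ifs(path.c_str(), std::ios::binary);
    std::string content;
    if (!ifs) return content;

    // Use the end position purely as a capacity hint; the string still
    // grows on its own if the hint turns out to be too small.
    ifs.seekg(0, std::ios::end);
    const std::streamoff hint = ifs.tellg();
    ifs.seekg(0, std::ios::beg);
    if (hint > 0)
        content.reserve(static_cast<std::string::size_type>(hint));

    // push_back through back_inserter reuses the reserved capacity.
    std::copy(std::istreambuf_iterator<char>(ifs),
              std::istreambuf_iterator<char>(),
              std::back_inserter(content));
    return content;
}
```

Calling slurp_with_reserve("myfile.txt") returns the file's bytes unchanged, newlines included.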

【Answer 2】:

The most efficient way, though not the C++ way, is:

   // Open in binary mode so the size from ftell() matches what fread() can read
   FILE* f = fopen(filename, "rb");

   // Determine file size
   fseek(f, 0, SEEK_END);
   size_t size = ftell(f);

   char* where = new char[size];

   rewind(f);
   fread(where, sizeof(char), size, f);

   delete[] where;
   fclose(f);

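EDIT - 2

I have also just tested the std::filebuf variant. It looks like it can be called the best C++ approach, even though it is not really a C++ approach but more of a wrapper. Either way, this chunk of code runs almost as fast as plain C.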

   std::ifstream file(filename, std::ios::binary);
   std::streambuf* raw_buffer = file.rdbuf();

   // size is assumed to hold the file size in bytes, determined beforehand
   char* block = new char[size];
   raw_buffer->sgetn(block, size);
   delete[] block;
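
The fragment above uses size without showing where it comes from. One way to fill that gap, as a sketch only, is to seek through the same stream buffer. The helper name slurp_via_streambuf is made up, and the approach assumes a regular file opened in binary mode, where seeking to the end yields a byte count.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not from the answer): read a file through its
// stream buffer. Returns an empty string if the file cannot be opened.
std::string slurp_via_streambuf(const std::string& path)
{
    std::ifstream file(path.c_str(), std::ios::binary);
    std::string block;
    if (!file) return block;

    std::streambuf* raw_buffer = file.rdbuf();

    // Seek to the end to learn the size, then go back to the start.
    const std::streamoff size =
        raw_buffer->pubseekoff(0, std::ios::end, std::ios::in);
    raw_buffer->pubseekoff(0, std::ios::beg, std::ios::in);
    if (size <= 0) return block;   // empty file or seek failure

    block.resize(static_cast<std::string::size_type>(size));
    const std::streamsize got = raw_buffer->sgetn(&block[0], size);

    // Keep only what was actually read.
    block.resize(static_cast<std::string::size_type>(got));
    return block;
}
```

Using a std::string as the buffer removes the manual new[]/delete[] pair from the fragment.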

I ran a quick benchmark here; the results are below. The test read a 65536K binary file using the appropriate modes (std::ios::binary and rb).

[==========] Running 3 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 4 tests from IO
[ RUN      ] IO.C_Kotti
[       OK ] IO.C_Kotti (78 ms)
[ RUN      ] IO.CPP_Nikko
[       OK ] IO.CPP_Nikko (106 ms)
[ RUN      ] IO.CPP_Beckmann
[       OK ] IO.CPP_Beckmann (1891 ms)
[ RUN      ] IO.CPP_Neil
[       OK ] IO.CPP_Neil (234 ms)
[----------] 4 tests from IO (2309 ms total)

[----------] Global test environment tear-down
[==========] 4 tests from 1 test case ran. (2309 ms total)
[  PASSED  ] 4 tests.

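【Comments】:

Nice benchmark; the numbers don't surprise me. For maximum performance on plain ASCII files, good old C I/O is the way to go; C++ streams are no match. They are, however, less error-prone. As long as they don't show up in profiling, I'd rather use them.
Wow, that's really cool. I don't know why, but at first I didn't believe you. It looks like the best way to combine iostream features with raw C file-reading speed.
Heh.. plain C is still faster ;)
@Constantino, your method of determining the file length is not correct. Although the fseek/rewind combination works, the proper way is to fill a stat structure and extract the st_size member. Better to be on the safe side.
How do you find size here?!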

【Answer 3】:

The most efficient way is to create a buffer of the correct size and then read the file into it.

#include <fstream>
#include <vector>

int main()
{
    // Binary mode: in text mode the value from tellg() may not match
    // the number of characters that can actually be read.
    std::ifstream       file("Plop", std::ios::binary);
    if (file)
    {
        /*
         * Get the size of the file
         */
        file.seekg(0,std::ios::end);
        std::streamoff          length = file.tellg();
        file.seekg(0,std::ios::beg);

        /*
         * Use a vector as the buffer.
         * It is exception safe and will be tidied up correctly.
         * This constructor creates a buffer of the correct length
         * (the elements are value-initialized, i.e. zero-filled).
         *
         * Then read the whole file into the buffer.
         */
        if (length > 0)
        {
            std::vector<char>   buffer(static_cast<std::vector<char>::size_type>(length));
            file.read(&buffer[0],length);
        }
    }
}

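【Comments】:

A benchmark? Or even an strace... (not that I don't believe this is the fastest; I'd like to know whether it is actually any different from the iterator-based approach)
This method is not guaranteed to work. tellg is not specified to return an offset into the file in bytes; it is just an opaque token. See this answer for a more detailed explanation.
In text mode, on operating systems that translate file contents, the result of tellg will most likely not match the number of characters that can be read.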
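
One way to avoid relying on tellg for the size, as the comments above suggest, is to ask the file system instead. The sketch below assumes C++17 (std::filesystem), which did not exist when this question was asked. The helper name read_all_bytes is made up for illustration. The buffer is trimmed to gcount() afterwards in case fewer bytes arrive than the reported size.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical helper (not from the answer): read a file into a vector.
// Returns an empty vector if the size cannot be queried or the open fails.
std::vector<char> read_all_bytes(const std::string& path)
{
    std::vector<char> buffer;

    // Non-throwing overload: errors are reported through `ec`.
    std::error_code ec;
    const std::uintmax_t size = std::filesystem::file_size(path, ec);
    if (ec || size == 0) return buffer;

    std::ifstream file(path, std::ios::binary);
    if (!file) return buffer;

    buffer.resize(static_cast<std::vector<char>::size_type>(size));
    file.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));

    // Keep only the bytes that were actually read.
    buffer.resize(static_cast<std::vector<char>::size_type>(file.gcount()));
    return buffer;
}
```

On POSIX systems without C++17, filling a stat structure and reading st_size, as a comment on Answer 2 suggests, plays the same role.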

【Answer 4】:

A text file should not contain \0.

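#include<iostream>
#include<fstream>
#include<string>

using namespace std;

int main(){
  // FILENAME is a placeholder for the path of the file to read
  fstream f(FILENAME, fstream::in );
  string s;
  getline( f, s, '\0');   // reads up to the first '\0', or to end of file if there is none

  cout << s << endl;
  f.close();
}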

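【Comments】:

The question doesn't mention text files.
-1 This example only reads one line; I wonder whether there is any moderation here.
@piotr This example reads the whole text file; tested.
I think everyone assumed it would be a text file, but that's not necessarily so. As for the code: using ifstream("filename") directly would probably be clearer. You don't need to close the file; that happens automatically. It does read a text file.
@Draco Ater I tested with a binary file; presumably it only reads up to \0. The point is that this example processes every character; I prefer the iterator-based solution, which is probably more efficient.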
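【Answer 5】:

It depends on a lot of things, such as the size of the file and its type (text/binary). Some time ago I benchmarked the following function against a version using streambuf iterators; it was about twice as fast: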

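#include <fstream>
#include <string>
#include <vector>

// Read up to buff.size() characters into buff; return the number actually read.
unsigned int FileRead( std::istream & is, std::vector <char> & buff ) {
    is.read( &buff[0], buff.size() );
    return is.gcount();
}

// Append the whole stream to s, one buffer-sized chunk at a time.
void FileRead( std::ifstream & ifs, std::string & s ) {
    const unsigned int BUFSIZE = 64 * 1024; // reasonably sized buffer
    std::vector <char> buffer( BUFSIZE );

    while( unsigned int n = FileRead( ifs, buffer ) ) {
        s.append( &buffer[0], n );
    }
}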

【Comments】:

【Answer 6】:

Probably not the most efficient, but it reads the data in one line:

#include<iostream>
#include<vector>
#include<iterator>

int main(int argc,char *argv[]){
  // read standard input into vector (istream_iterator skips whitespace);
  // the extra parentheses guard against the Most Vexing Parse:
  std::vector<char>v((std::istream_iterator<char>(std::cin)),
                     (std::istream_iterator<char>()));
  std::cout << "read " << v.size() << " chars\n";
}
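
One thing to be aware of with this answer: std::istream_iterator<char> reads through operator>>, which skips whitespace by default, so spaces and newlines never reach the vector. The sketch below contrasts it with std::istreambuf_iterator<char>, which reads raw characters. The function names read_formatted and read_unformatted are made up for illustration.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Formatted extraction: whitespace is skipped (default skipws flag).
std::vector<char> read_formatted(std::istream& in)
{
    std::istream_iterator<char> first(in), last;
    return std::vector<char>(first, last);
}

// Unformatted extraction: every character is kept.
std::vector<char> read_unformatted(std::istream& in)
{
    std::istreambuf_iterator<char> first(in), last;
    return std::vector<char>(first, last);
}
```

For the input "a b\nc", read_formatted yields 3 characters (a, b, c) and read_unformatted yields all 5.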

【Comments】:

【Answer 7】:

Here is an iterator-based approach.

ifstream file("file", ios::binary);
string fileStr;

// A default-constructed istreambuf_iterator marks the end of the stream
istreambuf_iterator<char> inputIt(file), emptyInputIt;
back_insert_iterator<string> stringInsert(fileStr);

copy(inputIt, emptyInputIt, stringInsert);

【Comments】:
