std::accumulate C++20 version

【Question title】: std::accumulate C++20 version
【Posted】: 2020-10-05 22:18:44
【Question】:

I am trying to understand this code, but I cannot see why this version

for (; first != last; ++first) 
    init = std::move(init) + *first;

is faster than this one

for (; first != last; ++first)
    init += *first;

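I took both of them from std::accumulate. The assembly for the first version is longer than for the second. Even though the first version creates an rvalue reference to init, it still creates a temporary by adding *first and then assigns it to init, which is the same process as in the second case, where a temporary is created and then assigned to init. So why is using std::move better than "appending the value" with the += operator?

Edit

I was looking at the code for the C++20 version of accumulate, and it says that before C++20 accumulate looked like this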

template<class InputIt, class T>
T accumulate(InputIt first, InputIt last, T init)
{
    for (; first != last; ++first) {
        init = init + *first;
    }
    return init;
}

and since C++20 it has become

template<class InputIt, class T>
constexpr // since C++20
T accumulate(InputIt first, InputIt last, T init)
{
    for (; first != last; ++first) {
        init = std::move(init) + *first; // std::move since C++20
    }
    return init;
}

I just want to know whether using std::move brings any real improvement.

EDIT2

OK, here is my sample code:

#include <utility>
#include <string>
#include <chrono>
#include <iostream>

using ck = std::chrono::high_resolution_clock;

std::string
test_no_move(std::string str) {

    std::string b = "t";
    int count = 0;

    while (++count < 100000)
        str = str + b;              // Without std::move: str is copied each time

    return str;
}

std::string
test_with_move(std::string str) {

    std::string b = "t";
    int count = 0;

    while (++count < 100000)
        str = std::move(str) + b;   // With std::move: str's buffer can be reused

    return str;

}

int main()
{
    std::string result;
    auto start = ck::now();
    result = test_no_move("test");
    auto finish = ck::now();

    std::cout << "Test without std::move " << std::chrono::duration_cast<std::chrono::microseconds>(finish - start).count() << std::endl;

    start = ck::now();
    result = test_with_move("test");
    finish = ck::now();

    std::cout << "Test with std::move " << std::chrono::duration_cast<std::chrono::microseconds>(finish - start).count() << std::endl;

    return 0;
}

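If you run it, you will find that the std::move version really is faster than the other one, but if you try it with a built-in type, the std::move version comes out slower than the other.

So my question is: since this situation is probably the same as in std::accumulate, why do they say the C++20 version of accumulate with std::move is faster than the version without it? Why do I get such an improvement when using std::move with something like a string, but not with something like an int? And why any of this, if in both cases the program creates a temporary string str + b (or std::move(str) + b) and then moves it into str? I mean, it is the same operation. Why is the second one faster?

Thank you for your patience. I hope I have made myself clear this time.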

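【Comments】:

It depends on the type of init. If the type does not implement any overload taking an rvalue reference, the result is the same, so please provide a complete example.
The second version was never in the C++ standard. std::accumulate has always worked with operator+() or the BinaryOperation template parameter.
std::accumulate is a template, so several steps are needed before you can look at the assembly. Can you add a minimal reproducible example?
Post your complete benchmark code, along with how you compile and run it.
Then please add a minimal reproducible example. How do you call these methods? Which compiler and compiler options did you use?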

Tags: c++ c++11 std c++20 accumulate

【Solution 1】:

For types with non-trivial move semantics, it can be faster. Consider accumulating a std::vector<std::string> of sufficiently long strings:

std::vector<std::string> strings(100, std::string(100, ' '));

std::string init;
init.reserve(10000);
auto r = accumulate(strings.begin(), strings.end(), std::move(init));

For accumulate without std::move,

std::string operator+(const std::string&, const std::string&);

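will be used. On every iteration it allocates heap storage for the resulting string, only for that storage to be thrown away on the next iteration.

For accumulate with std::move,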

std::string operator+(std::string&&, const std::string&);

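will be used. In contrast to the previous case, the buffer of the first argument can be reused. If the initial string has enough capacity, no additional memory is allocated during the accumulation.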

Simple demo

without std::move
n_allocs = 199

with std::move
n_allocs = 0

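For built-in types like int, a move is just a copy; there is nothing to move. With an optimized build you will most likely get exactly the same assembly. If your benchmark shows any speedup or slowdown, most likely you are not measuring correctly (no optimization, noise, code being optimized away, and so on).

【Comments】: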