Problem with std::async and std::lock_guard - varying incremented values in C++ (answers)

【Question title】: Problem with std::async and std::lock_guard - varying incremented values in C++
【Posted】: 2021-11-08 08:09:07
【Question】:

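I have been experimenting with std::async. I wrote a (very inefficient) function that counts all primes below a given limit.

Without std::async, the function always gives the expected result, as judged by a ctest unit test. With std::async but without std::lock_guard / mutex, there is (as expected) a large deviation between the expected and the computed value. With std::lock_guard / mutex, the result is still not reproducible: in roughly 50% of test runs it is correct, and in the other 50% the value is off by 1-2 (e.g. 9151 instead of 9152).

I am debating whether the problem is that my isPrime(int) function introduces another bottleneck. Alternatively, I suspect that the program (sometimes) terminates early, before the last thread has finished its work.

In either case, I don't understand why std::lock_guard does not seem to protect the variable count.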

// primeAsync.cc

#include <future>
#include <mutex>
#include <vector>

#include "primeAsync.h"
#include "prime.h"

// mutex for thread safety
static std::mutex s_PrimeMutex;

// handle to store futures
static std::vector<std::future<void>> s_Futures;

static void countPrimesHelper(int* count, const long long number);

// passes ctest every other time, count is off by 1-2 (e.g. 9151 instead of 9152)
int countPrimesAsync(const long long limit) {
    int count = 0;
    for(long i = 2; i < limit; ++i) {
        s_Futures.push_back(std::async(std::launch::async, countPrimesHelper, &count, i));
    }
    return count;
}

// helper function for countPrimesAsync, std::async
static void countPrimesHelper(int* count, const long long number) {
    if(isPrime(number)) {
        const std::lock_guard<std::mutex> lock(s_PrimeMutex);
        ++(*count);
    }
}

// prime.cc

#include <iostream>
#include <vector>

// always passes ctest
bool isPrime(const long long n) {
    int mod;
    for(long long m = n - 1; m > 1; --m) {
        mod = n % m;
        if(mod == 0) {
            return false;
        }
    }
    return true;
}

// always passes ctest
int countPrimes(const long long limit) {
    int count = 0;
    for(long i = 2; i < limit; ++i) {
        if(isPrime(i)) {
            count++;
        }
    }
    return count;
}

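【Comments】:

Where do you wait for all the futures in the vector to finish?
ThreadSanitizer detects a data race here. You may have better luck with symbols locally.
Your count variable is a local variable in a function that launches a bunch of threads and then exits. So the variable that the count pointer points to, which all those threads increment, no longer exists. You're lucky that it works at all. Waiting for all the threads to finish before the count variable ceases to exist is the right idea.

Tags: c++ multithreading mutex stdasync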

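【Solution 1】:

Something else that people haven't mentioned is that atomic variables let you have a value that multiple threads can read and write without a data race, and for a simple counter they are usually faster than a lock guard.

For example: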

#include <atomic>
std::atomic<int> count{0}; // start explicitly at zero

static void countPrimesHelper(std::atomic<int>* count, const long long number) {
    if(isPrime(number)) {
        (*count) += 1;
    }
}

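This is free of data races, even with a million threads accessing count. It works by making the increment a single operation that cannot be observed in a half-finished state. On recent processors it also stops the cores from performing the operation on the same variable in parallel (using a low-level lock instruction that lasts for that one operation).

Here is more information about the atomic header: https://en.cppreference.com/w/cpp/atomic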
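
As the comments point out, the atomic counter on its own does not fix the early return. A minimal sketch that combines the atomic counter with waiting on every future might look like the following. The name countPrimesAtomic and the lambda-based task are assumptions made for this illustration, and isPrime is a simplified stand-in for the one in prime.cc.

```cpp
#include <atomic>
#include <future>
#include <vector>

// Simplified stand-in for isPrime from the question (trial division).
bool isPrime(long long n) {
    if (n < 2) return false;
    for (long long m = 2; m * m <= n; ++m) {
        if (n % m == 0) return false;
    }
    return true;
}

// Hypothetical name: counts the primes in [2, limit), one task per number.
int countPrimesAtomic(long long limit) {
    std::atomic<int> count{0};
    std::vector<std::future<void>> futures;
    for (long long i = 2; i < limit; ++i) {
        futures.push_back(std::async(std::launch::async, [&count, i] {
            if (isPrime(i)) {
                count.fetch_add(1);  // atomic read-modify-write, no mutex needed
            }
        }));
    }
    // Block until every task has finished; count must outlive all of them.
    for (auto& f : futures) {
        f.get();
    }
    return count.load();
}
```

Because count is only read after every f.get() has returned, no task can still be touching it when the function returns. One task per number is kept here only to mirror the question; it is not an efficient way to split the work.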

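【Comments】:

@François Andrieux I agree, I quickly added the atomic solution to countPrimesHelper. I believe the function still does not pass the test; atomic may be faster than a lock guard, but the main problem I am facing is that the futures are not yet ready when the program terminates.
@iridiumcc Yes, this answer is a clear improvement, but it does not address the core issue.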

【Solution 2】:

I changed the function to wait for all futures to complete before returning count:

// now seems to pass ctest every time
int countPrimesAsync(const long long limit) {
    int count = 0;
    for(long i = 2; i < limit; ++i) {
        s_Futures.push_back(std::async(std::launch::async, countPrimesHelper, &count, i));
    }

    // wait for every task to finish before reading count
    for(auto& future : s_Futures) {
        future.wait();
    }

    return count;
}

This solution passes the tests!

Also, I found the following solution for checking the status of a future:

Get the status of a std::future

Not sure which is preferred (if any).
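
For reference, a non-blocking status check is commonly written with wait_for and a zero timeout. The sketch below assumes a helper named isReady, which is a made-up name for this illustration and not part of the standard library.

```cpp
#include <chrono>
#include <future>

// Hypothetical helper: true if the result can be fetched without blocking.
template <typename T>
bool isReady(const std::future<T>& f) {
    // wait_for with a zero timeout returns immediately with the current status.
    return f.valid() &&
           f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

Polling like this is mostly useful when the caller has other work to do in the meantime. For countPrimesAsync, which needs every result before it can return, a blocking wait() or get() is probably the simpler choice. Note that a future created with std::launch::deferred reports future_status::deferred rather than ready until it is waited on.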

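Thanks!

Following François Andrieux's suggestion, this solution also works and passes ctest. It needs no static variable, and I think it is more elegant than the second for loop: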

// now seems to pass ctest every time
int countPrimesAsync(const long long limit) {
    // handle to store futures
    std::vector<std::future<void>> s_Futures;
    int count = 0;
    for(long i = 2; i < limit; ++i) {
        s_Futures.push_back(std::async(std::launch::async, countPrimesHelper, &count, i));
    }
    s_Futures.clear(); // each future's destructor blocks until its task is done
    return count;
}

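【Comments】:

Note that a std::future returned by std::async has the property that its destructor blocks until the associated async function has completed. You could, for example, use s_Futures.clear(). I also recommend that you consider making s_Futures a local variable in countPrimesAsync.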
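
The blocking destructor can be observed with a small experiment. This is only a sketch: finishedAfterClear is a hypothetical name, and the 20 ms sleep is an arbitrary stand-in for real work.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Hypothetical demo: returns how many tasks had finished once clear() returned.
int finishedAfterClear(int taskCount) {
    std::atomic<int> finished{0};
    std::vector<std::future<void>> futures;
    for (int i = 0; i < taskCount; ++i) {
        futures.push_back(std::async(std::launch::async, [&finished] {
            // Arbitrary delay so the tasks are still running when clear() starts.
            std::this_thread::sleep_for(std::chrono::milliseconds(20));
            finished.fetch_add(1);
        }));
    }
    futures.clear();  // destroys every future; each destructor waits for its task
    return finished.load();
}
```

If the destructors did not block, the function would usually return a value smaller than taskCount. This behaviour is specific to futures obtained from std::async; a future obtained from a std::promise or a std::packaged_task does not wait in its destructor.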