How to avoid data races when using std::cout and <iomanip> in multithreaded programs?

【Question title】：How to avoid data races when using std::cout and <iomanip> in multithreaded programs?
【Posted】：2020-01-22 11:34:58
【Question】：

This is my first attempt at writing multithreaded C++ code, and it appears to cause a data race. Here is the complete file. Compile with: g++ -pthread foo.cpp

#include <iostream>
#include <iomanip>
#include <thread>
const int SIZE = 5;

void mult(int x, int y) {
    std::cout.width(3); 
    std::cout << std::right << x * y << "* ";
}

void add(int x, int y) {
    std::cout.width(3); 
    std::cout << std::right << x + y << "+ ";
}

int main() {
    int a = 0;
    for (int i = 0; i < SIZE; i++) {
        for (int j = 0; j < SIZE; j++) {
            std::thread first(mult, i, j);
            std::thread second(add, i, j);
            first.join();
            second.join();
            std::cout << " | ";
        }
        std::cout << "\n";
    }
     return 0;
}

The output is scrambled differently on every run, in a way that cannot be reproduced, for example:

  0*   0+  |   0*   1+  |   2  0+ *  |   0*   3+  |   0*   4+  | 
  0*   1+  |   1*   2+  |   2*   3+  |   3*   4+  |   4*   5+  | 
  0*   2+  |   2*   3+  |   4*   4+  |   6*   5+  |   8*   6+  | 
  0*   3+  |   3  4* +  |   6*   5+  |   9*   6+  |  12*   7+  | 
  0*   4+  |   4*   5+  |   8*   6+  |  12*   7+  |  16*   8+  |

or

  0*   0+  |   0*   1+  |   0*   2+  |   0*   3+  |   0*   4+  | 
  0*   1+  |   1*   2+  |   2*   3+  |   3*   4+  |   4*   5+  | 
  0*   2+  |   2*   3+  |   4*   4+  |   6*   5+  |   8*   6+  | 
  0*   3+  |   3*   4+  |   6*   5+  |   9* 6+  |  12*   7+  | 
  0*   4+  |   4*   5+  |   8*   6+  |  12*   7+  |  16*   8+  |

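Is there a way to fix this? I have learned a lot about the cout object from this, but is the rule that only one thread at a time should be allowed to access cout, especially when using iomanip?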

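Edit: From http://www.cplusplus.com/reference/iomanip/setw/ I understand that using iomanip this way may cause data races. So the question is: should this simply not be attempted? Should each thread that writes to cout be created, do its business, and then be joined (that is, effectively no threading at all), and that is that? If so, that is fine. The main idea of the concurrency would then be to let the program open several fstream objects concurrently so that the user does not have to wait, and a single thread writing to cout would be enough. What I am asking is: is that the standard approach?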

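【Question comments】：

Getting correct, non-interleaved output from multiple threads is a surprisingly complicated topic. I know Herb Sutter has a great video on YouTube that deals with it.
Possible duplicate of Why is my program printing garbage?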
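Do you care whether the multiplication or the addition is printed first in each section? If you do, there is no point at all in doing the IO in separate threads: have the threads compute the results, then print them in the desired order.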
As for the interleaving, I would suggest a single function that contains all the iostream and iomanip calls, protected by a std::mutex via std::lock_guard.

Tags: c++ c++11

【Solution 1】：

You can use std::mutex and std::lock_guard:

#include <iomanip>
#include <iostream>
#include <mutex>
#include <thread>
const int SIZE = 5;

std::mutex iomutex;

void mult(int x, int y) {
    // Complex, time-consuming calculations run multithreaded
    auto res = x * y;
    // lock stops other threads at this point
    std::lock_guard<std::mutex> lock(iomutex);
    // IO is singlethreaded
    std::cout.width(3); 
    std::cout << std::right << res << "* ";
    // lock leaves scope and is unlocked, next thread can start IO
}

void add(int x, int y) {
    // Complex, time-consuming calculations run multithreaded
    auto res = x + y;
    // lock stops other threads at this point
    std::lock_guard<std::mutex> lock(iomutex);
    // IO is singlethreaded
    std::cout.width(3); 
    std::cout << std::right << res << "+ ";
    // lock leaves scope and is unlocked, next thread can start IO
}

int main() {
    for (int i = 0; i < SIZE; i++) {
        for (int j = 0; j < SIZE; j++) {
            std::thread first(mult, i, j);
            std::thread second(add, i, j);
            first.join();
            second.join();
            std::cout << " | ";
        }
        std::cout << "\n";
    }
    return 0;
}

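In this example multithreading is pointless, but in a larger program you would protect only the input/output with the mutex, while the calculations run in parallel.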

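【Comments】：

This does not prevent the second thread from locking the mutex before the first one does. It is unlikely to happen, but still entirely possible.
@Graeme As I understand it, the order of the operations and of the output does not matter. The only thing that matters in this question is that one thread does not start writing while another thread is still writing.
Thanks, my head is exploding; clearly I still have more work to do. What I am really concerned about is using threads with fstream, which seems more useful than using them with cout. Evidently cout is a single static object that lives for the whole program, whereas you can create multiple fstream objects, so this may not be a problem there.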
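If the order of the two cells did matter, a mutex alone would not be enough, as the first comment points out. One possible way to enforce "product first, sum second" is a turn counter guarded by the mutex together with a std::condition_variable. This is only a sketch: the names `turn`, `turn_changed`, `print_in_turn` and `print_pair` are made up for illustration and do not come from the answer.

```cpp
#include <condition_variable>
#include <functional>
#include <iomanip>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex iomutex;
std::condition_variable turn_changed;
int turn = 0;  // hypothetical convention: 0 = product prints next, 1 = sum prints next

// Waits until it is this thread's turn, writes one cell, then passes the turn on.
void print_in_turn(std::ostream &os, int my_turn, int value, char suffix) {
    std::unique_lock<std::mutex> lock(iomutex);
    turn_changed.wait(lock, [my_turn] { return turn == my_turn; });
    os << std::setw(3) << std::right << value << suffix << ' ';
    turn = (turn + 1) % 2;
    turn_changed.notify_all();
}

// The sum thread is deliberately started first to show that start order
// no longer decides the output order.
void print_pair(std::ostream &os, int x, int y) {
    std::thread second(print_in_turn, std::ref(os), 1, x + y, '+');
    std::thread first(print_in_turn, std::ref(os), 0, x * y, '*');
    first.join();
    second.join();
}
```

Calling `print_pair(std::cout, i, j)` inside the two loops from the answer would replace the two `std::thread` objects there. Note that this serializes the output completely, so it only pays off when the computation before the lock is expensive.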
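To illustrate the fstream point: when every thread owns its own stream object, the formatting state (width, alignment) is not shared, so no mutex is needed as long as the threads also write to different files. A minimal sketch under that assumption follows; `write_table`, `write_both` and the file paths are hypothetical names, not part of the question.

```cpp
#include <fstream>
#include <iomanip>
#include <string>
#include <thread>

// Each call opens its own std::ofstream. The stream object is local to the
// calling thread, so its width/alignment state is not shared with any other thread.
void write_table(const std::string &path, char op, int size) {
    std::ofstream out(path);
    for (int i = 0; i < size; i++) {
        for (int j = 0; j < size; j++) {
            const int value = (op == '*') ? i * j : i + j;
            out << std::setw(3) << std::right << value << op << ' ';
        }
        out << '\n';
    }
}

// Hypothetical driver: two threads, two separate files, no lock required.
void write_both(const std::string &mult_path, const std::string &add_path, int size) {
    std::thread first(write_table, mult_path, '*', size);
    std::thread second(write_table, add_path, '+', size);
    first.join();
    second.join();
}
```

For example, `write_both("mult.txt", "add.txt", 5)` would produce one table per file. If two threads wrote to the same file, or to the same stream object, the locking from the answer would be needed again.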

【Solution 2】：

In this case it would be better to do all the output from the main thread only:

#include <functional>  // std::ref
#include <iostream>
#include <iomanip>
#include <thread>
const int SIZE = 5;

void mult(int &res, int x, int y) {
    res = x * y;
}

void add(int &res, int x, int y) {
    res = x + y;
}

int main() {
    for (int i = 0; i < SIZE; i++) {
        for (int j = 0; j < SIZE; j++) {
            int mult_res, add_res;
            std::thread first(mult, std::ref(mult_res), i, j);
            std::thread second(add, std::ref(add_res), i, j);
            first.join();
            second.join();
            std::cout.width(3);
            std::cout << std::right << mult_res << "* ";
            std::cout.width(3);
            std::cout << std::right << add_res << "+ | " ;
        }
        std::cout << "\n";
    }
    return 0;
}

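【Comments】：

This is a bit like calling first.join right after constructing the std::thread, isn't it? Then there is no problem and everything is fine, but the threads are not needed either. So in most cases you only need one thread?
@neutrino_logic, I am assuming that in a real example the calculations would be the most complex part, and the benefit would come from parallelizing them. If you like, you can also format the result inside the thread and return a string for output instead of an integer.
@neutrino_logic In the end it depends on how much variation in the table you can tolerate. This approach puts everything in order. Another approach might be to create one thread per row and have each thread lock a mutex while it outputs its row. That way the rows might come out in any order, but everything else would be fine.
Oh, that makes a lot of sense... although my admittedly limited understanding is that in this implementation the threads run concurrently on a single processor, and that true parallelization comes from running the threads on a GPU or something like that?
@neutrino_logic, true parallelization can happen on a single multi-core CPU; no GPU is required. If a multi-core CPU is available, that is what will happen here. It will also happen on a multiprocessor system with a suitable operating system.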
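The idea of formatting inside the thread and handing back a string could look roughly like this. Each thread writes into its own std::ostringstream, so std::cout is touched only by the caller. The helpers `format_cell` and `make_cell_pair` are hypothetical names introduced for this sketch.

```cpp
#include <functional>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>

// Formats one table cell into a private stream, so no worker thread
// ever touches the formatting state of std::cout.
std::string format_cell(int value, char suffix) {
    std::ostringstream out;
    out << std::setw(3) << std::right << value << suffix << ' ';
    return out.str();
}

void mult(std::string &res, int x, int y) { res = format_cell(x * y, '*'); }
void add(std::string &res, int x, int y) { res = format_cell(x + y, '+'); }

// Computes and formats both cells in worker threads, then returns the joined text.
std::string make_cell_pair(int x, int y) {
    std::string mult_res, add_res;
    std::thread first(mult, std::ref(mult_res), x, y);
    std::thread second(add, std::ref(add_res), x, y);
    first.join();
    second.join();
    return mult_res + add_res + " | ";
}
```

The main thread would then simply write `std::cout << make_cell_pair(i, j);` inside the inner loop, with no width or alignment calls on std::cout at all.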
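A rough sketch of the one-thread-per-row idea, assuming each row is first built in a local string and the mutex is held only while the finished row is written. The names `build_row`, `print_row` and `print_table` are invented for illustration.

```cpp
#include <functional>
#include <iomanip>
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

std::mutex iomutex;

// Builds one complete row in a stream that belongs to the calling thread only.
std::string build_row(int i, int size) {
    std::ostringstream row;
    for (int j = 0; j < size; j++) {
        row << std::setw(3) << i * j << "* "
            << std::setw(3) << i + j << "+  | ";
    }
    row << '\n';
    return row.str();
}

// Thread body: compute without the lock, then hold the lock only for the write.
void print_row(std::ostream &os, int i, int size) {
    const std::string row = build_row(i, size);
    std::lock_guard<std::mutex> lock(iomutex);
    os << row;
}

// Starts one thread per row and waits for all of them.
void print_table(std::ostream &os, int size) {
    std::vector<std::thread> workers;
    for (int i = 0; i < size; i++) {
        workers.emplace_back(print_row, std::ref(os), i, size);
    }
    for (auto &t : workers) {
        t.join();
    }
}
```

With `print_table(std::cout, 5)` every row comes out intact, but the rows may appear in any order from run to run, which is exactly the trade-off described in the comment.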