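Your code
Your use of threads looks broadly correct, so the problem is most likely that generate_some_string modifies global state. You can fix this in any of the ways described in the solutions further down.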
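Parallel philosophy
In hindsight the above seems obvious, so it is worth asking why it was not immediately apparent. I think it has to do with how you are implementing the parallelism.
C++11 threads give you a lot of flexibility, but they also require you to construct your parallelism explicitly. Most of the time, this is not what you want. It is easier and less error-prone to tell the compiler how it may parallelize your code and let it handle the low-level details.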
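The solutions in this answer show how to do this with OpenMP: an industry-standard set of compiler directives that is supported by most modern compilers and widely used in high-performance computing.
You'll notice that the code is generally easier to read than what you wrote, and therefore easier to debug.
All of the code in this answer compiles with the following command (modify as appropriate for your compiler):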
g++ -O3 main.cpp -fopenmp
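Solution 0: Use a simpler form of parallelism
First, I'd recommend using OpenMP to achieve parallelism. It is an industry standard, takes away most of the pain of dealing with threads, and lets you express parallelism at a conceptual level.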
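Solution 1: Private memory
You can solve your problem by having each thread write to its own private memory and then merging the private memories together. This avoids mutexes entirely, which may give you faster code and may sidestep the problem you are having altogether.
Note that each thread generates multiple computationally intensive strings, but this work is automatically divided among the available threads.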
#include <vector>
#include <string>
#include <omp.h>
#include <cmath>
#include <thread>
#include <chrono>
#include <iostream>

const int STRINGS_PER_LENGTH = 10;
const int MAX_STRING_LENGTH = 50;

using namespace std::chrono_literals;

//Computationally intensive string generation. Note that this function
//CANNOT have a global state, or the threads will maul it.
std::string GenerateSomeString(int length){
  double sum=0;
  for(int i=0;i<length;i++){
    std::this_thread::sleep_for(2ms);
    sum+=std::sqrt(i);
  }
  return std::to_string(sum);
}

int main(){
  //Build a vector that contains vectors of strings. Each thread will have its
  //own vector of strings
  std::vector< std::vector<std::string> > vecs(omp_get_max_threads());

  //Loop over lengths
  for(int length=10;length<MAX_STRING_LENGTH;length++){
    //Progress so the user does not get impatient
    std::cout<<length<<std::endl;

    //Parallelize across all cores
    #pragma omp parallel for
    for(int i=0;i<STRINGS_PER_LENGTH;i++){
      //Each thread independently generates its string and puts it into its own
      //private memory space
      vecs[omp_get_thread_num()].push_back(GenerateSomeString(length));
    }
  }

  //Merge all the threads' results together
  std::vector<std::string> S;
  for(auto &v: vecs)
    S.insert(S.end(),v.begin(),v.end());

  //Throw away the thread private memory
  vecs.clear();
  vecs.shrink_to_fit();
}
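Solution 2: Use a reduction
We can define a custom reduction operator that merges vectors. Using this operator in the parallel part of our code lets us eliminate the vector of vectors and the cleanup afterwards. Instead, as the threads finish their work, OpenMP safely handles combining their results.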
#include <vector>
#include <string>
#include <omp.h>
#include <cmath>
#include <thread>
#include <chrono>
#include <iostream>

using namespace std::chrono_literals;

const int STRINGS_PER_LENGTH = 10;
const int MAX_STRING_LENGTH = 50;

//Computationally intensive string generation. Note that this function
//CANNOT have a global state, or the threads will maul it.
std::string GenerateSomeString(int length){
  double sum=0;
  for(int i=0;i<length;i++){
    std::this_thread::sleep_for(2ms);
    sum+=std::sqrt(i);
  }
  return std::to_string(sum);
}

int main(){
  //Result vector. Inside the parallel loop each thread works on its own
  //private copy; OpenMP merges the copies back into S when the loop ends
  std::vector<std::string> S;

  //Custom reduction: append one thread's vector (omp_in) to the running
  //result (omp_out)
  #pragma omp declare reduction (merge : std::vector<std::string> : omp_out.insert(omp_out.end(), omp_in.begin(), omp_in.end()))

  //Loop over lengths
  for(int length=10;length<MAX_STRING_LENGTH;length++){
    //Progress so the user does not get impatient
    std::cout<<length<<std::endl;

    //Parallelize across all cores, merging each thread's strings into S
    #pragma omp parallel for reduction(merge: S)
    for(int i=0;i<STRINGS_PER_LENGTH;i++){
      //Each thread independently generates its string and appends it to its
      //private copy of S
      S.push_back(GenerateSomeString(length));
    }
  }
}
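Solution 3: Use critical
We can eliminate the reduction entirely by placing the push_back in a critical section, which restricts access to that part of the code to one thread at a time.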
//Compile with g++ -O3 main.cpp -fopenmp
#include <vector>
#include <string>
#include <omp.h>
#include <cmath>
#include <thread>
#include <chrono>
#include <iostream>

using namespace std::chrono_literals;

const int STRINGS_PER_LENGTH = 10;
const int MAX_STRING_LENGTH = 50;

//Computationally intensive string generation. Note that this function
//CANNOT have a global state, or the threads will maul it.
std::string GenerateSomeString(int length){
  double sum=0;
  for(int i=0;i<length;i++){
    std::this_thread::sleep_for(2ms);
    sum+=std::sqrt(i);
  }
  return std::to_string(sum);
}

int main(){
  //Shared vector: threads may only touch it inside the critical section
  std::vector<std::string> S;

  //Loop over lengths
  for(int length=10;length<MAX_STRING_LENGTH;length++){
    //Progress so the user does not get impatient
    std::cout<<length<<std::endl;

    //Parallelize across all cores
    #pragma omp parallel for
    for(int i=0;i<STRINGS_PER_LENGTH;i++){
      //Each thread independently generates its string into a local variable,
      //outside the critical section so the expensive work stays parallel
      const auto temp = GenerateSomeString(length);

      //Only one thread can access this part of the code at a time
      #pragma omp critical
      S.push_back(temp);
    }
  }
}