【Posted】: 2019-09-01 19:04:08
【Question】:
I want to write a C++17 parallel-execution algorithm, but I've run into some trouble. Let's start with the code:
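#include <cstddef>
#include <iterator>
#include <vector>
#if __has_include(<execution>)
#include <execution>
#include <thread>
#include <future>
#endif
// Sequential running mean over [first, last); the range must be non-empty.
template<class RandomAccessIterator>
inline auto mean(RandomAccessIterator first, RandomAccessIterator last)
{
    auto it = first;
    auto mu = *first;
    decltype(mu) i = 2;
    while(++it != last)
    {
        // Incremental update: new mean = old mean + (x - old mean)/i
        mu += (*it - mu)/i;
        i += 1;
    }
    return mu;
}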
#if __has_include(<execution>)
template<class ExecutionPolicy, class RandomAccessIterator>
inline auto mean(ExecutionPolicy&& exec_pol, RandomAccessIterator first, RandomAccessIterator last) {
    using Real = typename std::iterator_traits<RandomAccessIterator>::value_type;
    //static_assert(std::is_execution_policy_v<ExecutionPolicy>, "First argument must be an execution policy.");
    if (exec_pol == std::execution::par) { // this is the line the compiler rejects
        size_t elems = std::distance(first, last);
        if (elems*sizeof(Real) < /*guesstimate*/ 4096) {
            return mean(first, last);
        }
        unsigned threads = std::thread::hardware_concurrency();
        if (threads == 0) {
            threads = 2;
        }
        std::vector<std::future<Real>> futures;
        size_t elems_per_thread = elems/threads;
        auto it = first;
        for (unsigned i = 0; i < threads -1; ++i) {
            futures.push_back(std::async(std::launch::async, &mean<RandomAccessIterator>, it, it + elems_per_thread));
            it += elems_per_thread;
        }
        futures.push_back(std::async(std::launch::async, &mean<RandomAccessIterator>, it, last));
        Real mu = 0;
        // std::future is move-only, so iterate by reference rather than by value.
        for (auto& fut : futures) {
            mu += fut.get();
        }
        // Equal-weight average of the chunk means; exact only when all chunks have the same size.
        mu /= threads;
        return mu;
    }
    else { // should have else-if for various types of execution policies, but let's save that for later.
        return mean(first, last);
    }
}
#endif
OK, so here's the problem:
- I first passed the ExecutionPolicy argument by const&. The static_assert passed, but then I hit a compile error at if (exec_pol == std::execution::par), namely:
error: no match for ‘operator==’ (operand types are ‘const __pstl::execution::v1::parallel_policy’ and ‘const __pstl::execution::v1::parallel_policy’)
117 | if (exec_pol == std::execution::par) {
| ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
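Then I looked at /usr/include/c++/9/pstl/algorithm_impl.h, where they take the ExecutionPolicy as a forwarding reference and std::forward it all over the place, so I figured I should do the same. But that didn't fix anything, so I looked at /usr/include/c++/9/pstl/parallel_backend_tbb.h. In that file they don't even check what the execution policy is! For example, here are a few lines from that file: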
//! Evaluation of brick f[i,j) for each subrange [i,j) of [first,last)
// wrapper over tbb::parallel_for
template <class _ExecutionPolicy, class _Index, class _Fp>
void
__parallel_for(_ExecutionPolicy&&, _Index __first, _Index __last, _Fp __f)
{
    tbb::this_task_arena::isolate([=]() {
        tbb::parallel_for(tbb::blocked_range<_Index>(__first, __last), __parallel_for_body<_Index, _Fp>(__f));
    });
}
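So am I fundamentally misunderstanding how to write parallel algorithms with the C++17 execution policies? If not, how do I check the execution policy and use it correctly?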
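For reference, here is a minimal sketch of one way the check could be written. The standard policies (std::execution::sequenced_policy, parallel_policy, parallel_unsequenced_policy) are distinct tag types with no operator==, so the choice is made on the type at compile time rather than on a value at run time. Because std::execution::par is an lvalue, a forwarding reference deduces ExecutionPolicy as a reference type, so the sketch applies std::decay_t before testing it. The names chunk_sum and policy_mean are hypothetical, and the chunking scheme is only an assumption about what the parallel branch might do.

```cpp
#include <cassert>
#include <cstddef>
#include <execution>
#include <future>
#include <iterator>
#include <numeric>
#include <thread>
#include <type_traits>
#include <vector>

// Hypothetical helper: sequential sum of [first, last).
template <class It>
typename std::iterator_traits<It>::value_type chunk_sum(It first, It last) {
    using Real = typename std::iterator_traits<It>::value_type;
    return std::accumulate(first, last, Real{0});
}

// Hypothetical name. The policy object is never inspected; only its type is.
template <class ExecutionPolicy, class It>
typename std::iterator_traits<It>::value_type
policy_mean(ExecutionPolicy&&, It first, It last) {
    using Policy = std::decay_t<ExecutionPolicy>;  // strip const and reference
    using Real = typename std::iterator_traits<It>::value_type;
    using Diff = typename std::iterator_traits<It>::difference_type;
    static_assert(std::is_execution_policy_v<Policy>,
                  "First argument must be an execution policy.");
    assert(first != last && "range must be non-empty");
    const Diff n = std::distance(first, last);

    if constexpr (std::is_same_v<Policy, std::execution::parallel_policy>) {
        // Parallel branch: split into at most hardware_concurrency() chunks.
        Diff threads = static_cast<Diff>(std::thread::hardware_concurrency());
        if (threads == 0) threads = 2;
        if (threads > n) threads = n;
        const Diff per_thread = n / threads;

        std::vector<std::future<Real>> futures;
        It it = first;
        for (Diff i = 0; i + 1 < threads; ++i) {
            futures.push_back(std::async(std::launch::async,
                [it, per_thread] { return chunk_sum(it, it + per_thread); }));
            it += per_thread;
        }
        Real total = chunk_sum(it, last);  // remainder runs on the calling thread
        for (auto& fut : futures) total += fut.get();  // futures are move-only
        // Summing first and dividing once stays correct for unequal chunk sizes.
        return total / static_cast<Real>(n);
    } else {
        // Every other policy falls back to the sequential computation here.
        return chunk_sum(first, last) / static_cast<Real>(n);
    }
}
```

A call would look like policy_mean(std::execution::par, v.begin(), v.end()). Combining chunk sums rather than chunk means is a deliberate choice in this sketch: it avoids having to weight each chunk by its length when the last chunk is larger than the others.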
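【Comments】:
- Is the mean what you actually want? Then this line:
auto mean = std::reduce(std::execution::par, v.begin(), v.end()) / v.size();
will do what you want. Using the standard algorithms with the various execution policies can and will make your life easier.
- @MichaëlRoy: I don't care about the mean at all. It's just a simple example for everyone to understand.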
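The one-liner suggested in the first comment can be wrapped so that the caller chooses the policy. This is only a sketch: reduce_mean is a hypothetical name, the container is assumed to be a non-empty std::vector<double>, and an explicit 0.0 initial value is passed so that the accumulation happens in double.

```cpp
#include <cassert>
#include <execution>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical wrapper: forwards the caller's policy straight to std::reduce.
template <class ExecutionPolicy>
double reduce_mean(ExecutionPolicy&& policy, const std::vector<double>& v) {
    assert(!v.empty() && "mean of an empty range is undefined");
    const double sum =
        std::reduce(std::forward<ExecutionPolicy>(policy), v.begin(), v.end(), 0.0);
    return sum / static_cast<double>(v.size());
}
```

Calling reduce_mean(std::execution::par, v) leaves the partitioning to the library. With GCC's libstdc++ the parallel policies are implemented on top of TBB, so a program that uses them will typically also need to be linked against TBB (for example with -ltbb).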
Tags: parallel-processing c++17 tbb