How to compute the sum of the values of elements in a vector using cblas functions?

【Question title】: How to compute the sum of the values of elements in a vector using cblas functions?
【Posted】: 2016-08-01 18:47:13
【Question】:

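I need to sum all the elements of a matrix in caffe.

As far as I can tell, caffe's wrapper around the cblas functions ('math_functions.hpp' & 'math_functions.cpp') implements caffe_cpu_asum with cblas_sasum, which computes the sum of the absolute values of the elements in a vector.

I am new to cblas. I looked for a suitable function that drops the absolute value, but cblas does not seem to have one.

Any suggestions?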

【Comments】:

Tags: neural-network caffe blas conv-neural-network cblas

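【Solution 1】:

There is a way to do this using cblas functions, though it is a bit awkward.

Define an "all ones" vector and take the dot product between that vector and your matrix. The result is the sum.

Let myBlob be the caffe Blob whose elements you want to sum: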

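// All-ones vector with one entry per element of myBlob.
vector<Dtype> mult_data( myBlob.count(), Dtype(1) );
// Dot product with the ones vector gives the signed sum of all elements.
Dtype sum = caffe_cpu_dot( myBlob.count(), &mult_data[0], myBlob.cpu_data() );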

This trick is used in the implementation of the "Reduction" layer.
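To see why the dot product with an all-ones vector gives the signed sum, and how that differs from an asum-style result, here is a minimal standalone sketch. It does not use caffe or cblas: `dot` is a hypothetical stand-in for `caffe_cpu_dot` (or `cblas_sdot` with both strides set to 1), and `sum_via_dot` and `abs_sum` are illustrative names, not part of either library.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical stand-in for caffe_cpu_dot / cblas_sdot: plain dot product.
float dot(int n, const float* x, const float* y) {
  assert(n >= 0);
  return std::inner_product(x, x + n, y, 0.0f);
}

// Signed sum of all elements, computed as the dot product with all ones.
float sum_via_dot(const std::vector<float>& values) {
  const std::vector<float> ones(values.size(), 1.0f);
  return dot(static_cast<int>(values.size()), ones.data(), values.data());
}

// What an asum-style routine returns instead: the sum of absolute values.
float abs_sum(const std::vector<float>& values) {
  float total = 0.0f;
  for (float v : values) total += std::fabs(v);
  return total;
}
```

For the values 1, -2 and 3.5, `sum_via_dot` returns 2.5 while `abs_sum` returns 6.5, which is the difference the question is asking about.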

To make this answer work on the GPU as well, allocate a Blob for mult_data instead of a std::vector (because you need its gpu_data()):

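// Ones vector held in a Blob, sized to match myBlob.
vector<int> sum_mult_shape(1, myBlob.count());
Blob<Dtype> sum_multiplier_(sum_mult_shape);
// A new Blob is not filled with ones, so set them explicitly.
caffe_set(sum_multiplier_.count(), Dtype(1), sum_multiplier_.mutable_cpu_data());
const Dtype* mult_data = sum_multiplier_.cpu_data();
Dtype sum = caffe_cpu_dot( myBlob.count(), mult_data, myBlob.cpu_data() );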

For the GPU (in a '.cu' source file):

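// Same ones-filled Blob, sized to match myBlob.
vector<int> sum_mult_shape(1, myBlob.count());
Blob<Dtype> sum_multiplier_(sum_mult_shape);
caffe_set(sum_multiplier_.count(), Dtype(1), sum_multiplier_.mutable_cpu_data());
// gpu_data() syncs the ones to the device before the dot product.
const Dtype* mult_data = sum_multiplier_.gpu_data();
Dtype sum;
caffe_gpu_dot( myBlob.count(), mult_data, myBlob.gpu_data(), &sum );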

【Comments】:

Thank you very much.

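【Solution 2】:

Summing all the elements of an array is simple enough to do with a single for-loop. You only need the right compiler options so that the loop is vectorized with SIMD instructions.

For a Blob in caffe, use .cpu_data() to get a raw pointer to the array, then run the for-loop over it.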
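The loop described above can be sketched as follows. This is a minimal standalone version that takes a raw pointer and an element count, so it does not depend on caffe; `sum_elements` is a hypothetical helper name, not a caffe function.

```cpp
#include <cassert>
#include <cstddef>

// Sums `count` contiguous values starting at `data`.
template <typename Dtype>
Dtype sum_elements(const Dtype* data, std::size_t count) {
  assert(data != nullptr || count == 0);
  Dtype total = Dtype(0);
  for (std::size_t i = 0; i < count; ++i) {
    total += data[i];  // signed sum: no absolute value is taken
  }
  return total;
}
```

With a Blob named myBlob, as in the first answer, the call would presumably look like `sum_elements(myBlob.cpu_data(), myBlob.count())`. Note that floating-point addition is not associative, so GCC and Clang generally vectorize a reduction like this only when they are allowed to reorder the additions, for example with `-O3 -ffast-math`. The vectorized result can then differ slightly from the strictly sequential sum.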

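【Comments】:

Thanks for your reply. Many cblas functions do perform very simple operations efficiently, but I have to do this in caffe without a plain for-loop, to avoid poor performance. By the way, how can I iterate over all the values in the data_ member of a Blob in caffe?
A for-loop performs well here. You can vectorize it with the right compiler options.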