Function template parametrized by other function with different number of arguments: answers

【Question title】: Function template parametrized by other function with different number of arguments
【Posted】: 2015-06-11 12:15:20
【Question】:

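I can make a function template parametrized by another function, but I don't know how to do it when I want to parametrize it by functions that take different numbers of arguments.

Take a look at this code: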

#include <stdio.h>
#include <math.h>

template < double FUNC( double a ) >
void seq_op( int n, double * as ){
    for (int i=0; i<n; i++){  printf( " %f \n", FUNC( as[i] )  ); }
} 

template < double FUNC( double a, double b ) >
void seq_op_2( int n, double * as, double * bs ){
    for (int i=0; i<n; i++){  printf( " %f \n", FUNC( as[i], bs[i] )  ); }
} 

double a_plus_1  ( double a ){ return a + 1.0; }
double a_sq      ( double a ){ return a*a;     }

double a_plus_b ( double a, double b ){ return a + b; }
double a_times_b( double a, double b ){ return a * b; }


double as[4] = {1,2,3,4};
double bs[4] = {2,2,2,2};

// FUNCTION ======  main
int main(){
    printf( "seq_op   <a_plus_1>  ( 4, as );\n");      seq_op   <a_plus_1>  ( 4, as );
    printf( "seq_op   <a_sq>      ( 4, as );\n");      seq_op   <a_sq>      ( 4, as );
    printf( "seq_op_2 <a_plus_b>  ( 4, as, bs );\n");  seq_op_2 <a_plus_b>  ( 4, as, bs );
    printf( "seq_op_2 <a_times_b> ( 4, as, bs );\n");  seq_op_2 <a_times_b> ( 4, as, bs );
}

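Is there a way to make one common template for both cases?

Why would I need such a silly thing? A more practical example is these two functions, which differ in only one line: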

#define i3D( ix, iy, iz )  ( (iz)*nxy + (iy)*nx + (ix) )

void getLenardJonesFF( int natom, double * Rs_, double * C6, double * C12 ){
    Vec3d * Rs = (Vec3d*) Rs_;
    int nx  = FF::n.x;
    int ny  = FF::n.y;
    int nz  = FF::n.z;
    int nxy = ny * nx;
    Vec3d rProbe;  rProbe.set( 0.0, 0.0, 0.0 ); // we may shift here
    for ( int ia=0; ia<nx; ia++ ){ 
        printf( " ia %i \n", ia );
        rProbe.add( FF::dCell.a );  
        for ( int ib=0; ib<ny; ib++ ){ 
            rProbe.add( FF::dCell.b );
            for ( int ic=0; ic<nz; ic++ ){
                rProbe.add( FF::dCell.c );
                Vec3d f; f.set(0.0,0.0,0.0);
                for(int iatom=0; iatom<natom; iatom++){
                    // only this line differs
                    f.add( forceLJ( Rs[iatom] - rProbe, C6[iatom], C12[iatom] ) );
                }
                FF::grid[ i3D( ia, ib, ic ) ].add( f );
            } 
            rProbe.add_mul( FF::dCell.c, -nz );
        } 
        rProbe.add_mul( FF::dCell.b, -ny );
    }
}

void getCoulombFF( int natom, double * Rs_, double * kQQs ){
    Vec3d * Rs = (Vec3d*) Rs_;
    int nx  = FF::n.x;
    int ny  = FF::n.y;
    int nz  = FF::n.z;
    int nxy = ny * nx;
    Vec3d rProbe;  rProbe.set( 0.0, 0.0, 0.0 ); // we may shift here
    for ( int ia=0; ia<nx; ia++ ){ 
        printf( " ia %i \n", ia );
        rProbe.add( FF::dCell.a );  
        for ( int ib=0; ib<ny; ib++ ){ 
            rProbe.add( FF::dCell.b );
            for ( int ic=0; ic<nz; ic++ ){
                rProbe.add( FF::dCell.c );
                Vec3d f; f.set(0.0,0.0,0.0);
                for(int iatom=0; iatom<natom; iatom++){
                    // only this line differs
                    f.add( forceCoulomb( Rs[iatom] - rProbe, kQQs[iatom] ) );
                }
                FF::grid[ i3D( ia, ib, ic ) ].add( f );
            } 
            rProbe.add_mul( FF::dCell.c, -nz );
        } 
        rProbe.add_mul( FF::dCell.b, -ny );
    }
}

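【Comments】:

I think the most you can get is overloading, so just name both functions seq_op, but they would still have different implementations.
I try to keep such implementations together rather than having two (or more) complex functions that differ only slightly. This can be done with default arguments (IMHO the best option in your case), bool or enum flags, function pointers, etc.
Is it just me, or were lambdas added in C++11 to solve exactly this kind of problem?
Melebius > For performance reasons, I don't want to put any unnecessary switch or if into the innermost loop. Even with function pointers, I found that they are not optimized at compile time as well as templates or macros. It could probably be solved with macros, but I find templates a nicer coding style and more readable.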
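
The lambda and template ideas from the comments above can be combined. The following is only a minimal sketch, not code from the thread: it assumes C++11, and the single `seq_op` shown here is a hypothetical replacement for the question's `seq_op` and `seq_op_2`. The callable type `F` and the number of input arrays are both deduced at compile time, so there is no run-time branch in the loop. The returned sum is an addition made purely so the result is easy to check.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical unified version of the question's seq_op / seq_op_2.
// F is any callable (function pointer, lambda, functor); Ts... is one
// element type per input array, so the arity is fixed at compile time.
template <typename F, typename... Ts>
double seq_op(int n, F func, const Ts*... arrays) {
    assert(n >= 0);
    double sum = 0.0;  // returned only to make the result easy to verify
    for (int i = 0; i < n; ++i) {
        // The pack expansion becomes func(as[i]) or func(as[i], bs[i]).
        double r = func(arrays[i]...);
        std::printf(" %f \n", r);
        sum += r;
    }
    return sum;
}

// Same helper functions as in the question.
inline double a_plus_1(double a) { return a + 1.0; }
inline double a_plus_b(double a, double b) { return a + b; }
```

With this sketch, `seq_op(4, a_plus_1, as)` and `seq_op(4, a_plus_b, as, bs)` go through the same template, and a lambda such as `[](double a, double b) { return a * b; }` can be passed in place of a named function.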

Tags: c++ templates metaprogramming

【Solution 1】:

You should be able to merge the two functions by using a combination of std::bind() and std::function (see code on coliru):

#include <stdio.h>
#include <functional>

using namespace std::placeholders;


double getLJForceAtoms (int, int, double*, double*, double*)
{
    printf("getLJForceAtoms\n");
    return 0;
}


double getCoulombForceAtoms (int, int, double*, double*)
{
    printf("getCoulombForceAtoms\n");
    return 0;
}


void getFF (int natom, double* Rs_, std::function<double(int, int, double*)> GetForce)
{
    int rProbe = 1;

    double Force = GetForce(rProbe, natom, Rs_);
}


int main ()
{
    double* C6 = nullptr;
    double* C12 = nullptr;
    double *kQQs = nullptr;
    double* Rs_ = nullptr;

    auto getLJForceFunc = std::bind(getLJForceAtoms, _1, _2, _3, C6, C12);
    auto getCoulombForceFunc = std::bind(getCoulombForceAtoms, _1, _2, _3, kQQs);

    getFF(1, Rs_, getLJForceFunc);
    getFF(1, Rs_, getCoulombForceFunc);

    return 0;
}

This prints the expected output:

getLJForceAtoms
getCoulombForceAtoms
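
As a variation on the code above, the extra arguments can be bound with capturing lambdas instead of `std::bind`. This is a sketch under assumptions: the force functions are stand-ins that return fixed values so the two paths can be told apart, `getFF` is changed to return the force, and `runLJ` / `runCoulomb` are hypothetical helper names that do not appear in the answer.

```cpp
#include <cassert>
#include <functional>

// Stand-ins for the answer's force functions (fixed return values only).
double getLJForceAtoms(int, int, double*, double*, double*) { return 1.0; }
double getCoulombForceAtoms(int, int, double*, double*) { return 2.0; }

// Same shape as the answer's getFF, but returning the computed force.
double getFF(int natom, double* Rs_,
             const std::function<double(int, int, double*)>& GetForce) {
    assert(static_cast<bool>(GetForce));  // calling an empty std::function throws
    int rProbe = 1;
    return GetForce(rProbe, natom, Rs_);
}

// Hypothetical helpers: each lambda captures the extra parameter arrays,
// which is what std::bind did with C6, C12 and kQQs.
double runLJ(int natom, double* Rs_, double* C6, double* C12) {
    return getFF(natom, Rs_, [=](int probe, int n, double* Rs) {
        return getLJForceAtoms(probe, n, Rs, C6, C12);
    });
}

double runCoulomb(int natom, double* Rs_, double* kQQs) {
    return getFF(natom, Rs_, [=](int probe, int n, double* Rs) {
        return getCoulombForceAtoms(probe, n, Rs, kQQs);
    });
}
```

The call still goes through `std::function`, so the dispatch happens at run time exactly as in the `std::bind` version; only the binding syntax differs.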

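Update: about performance

While it is natural to worry about the performance of std::function compared with templates, I would not rule out a possible solution without benchmarking and profiling it first.

I can't compare the performance directly, because I would need your complete source code and input data set for an accurate benchmark, but I can run a very simple test to show you what it might look like. If I make the force function do a little work: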

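// needs <math.h> (or <cmath>)
double getLJForceAtoms (int x, int y, double* r1, double* r2, double* r3)
{
    // fabs is used instead of abs so the integer overload cannot be picked by accident
    return cos(log2(fabs(sin(log(pow(x, 2) + pow(y, 2))))));
}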

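and then have a very simple getFF() function call it 10 million times, I can roughly compare the various design approaches (tested on VS2013, release build, optimize-for-speed flag):

Direct call = 1900 ms
switch = 1900 ms
if (flag) = 1900 ms
Virtual function = 2400 ms
std::function = 2400 ms

So the std::function approach is about 25% slower in this case, while the switch and if approaches run at the same speed as the direct call. Depending on how much work your actual force functions do, you may get worse or better results. Compiler optimizers and CPU branch predictors are good enough these days to do many things that may be surprising or even counterintuitive, which is why real measurements are a must.

I would run a similar benchmark with your exact code and data set to see what difference, if any, the various designs make. If there really are only the two cases from your question, the "if (flag)" approach may be a good choice.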
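
A comparison of this kind could be set up roughly as follows. This is not the benchmark behind the numbers above: `work`, `Timing`, `time_calls` and `compare` are hypothetical names, `work` is an arbitrary stand-in for a force function, and the timings will depend on compiler, flags and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <functional>

// Arbitrary stand-in for a force function that does a little work.
inline double work(int x, int y) {
    return std::sqrt(double(x) * x + double(y) * y + 1.0);
}

struct Timing { double sum; double ms; };

// Calls f n times and reports the accumulated result and elapsed time.
// The sum is returned so the optimizer cannot discard the loop.
template <typename F>
Timing time_calls(int n, F f) {
    assert(n > 0);
    auto t0 = std::chrono::steady_clock::now();
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += f(i, i + 1);
    auto t1 = std::chrono::steady_clock::now();
    return { sum, std::chrono::duration<double, std::milli>(t1 - t0).count() };
}

struct Comparison { Timing direct; Timing viaFunction; };

// Runs the same workload through a lambda (resolved at compile time)
// and through std::function (type-erased, dispatched at run time).
Comparison compare(int n) {
    std::function<double(int, int)> wrapped = work;
    Comparison c;
    c.direct = time_calls(n, [](int x, int y) { return work(x, y); });
    c.viaFunction = time_calls(n, wrapped);
    return c;
}
```

Calling `compare(10000000)` in an optimized build and printing the two `ms` fields gives a first rough indication; repeating the run several times helps to smooth out noise.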

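【Comments】:

Thanks. Do you have any insight into the performance? The nice thing about templates is that they are resolved at compile time, so the compiler can do a lot of optimization (e.g. when my inner functions getLJForceAtoms and getCoulombForceAtoms are short inline functions). This creates some wrapper object, which means several pointer dereferences, and I guess that is not ideal for a performance-critical function (?). I would rather consider using some kind of macro.
Yes, that is a good point. My solution uses a run-time lookup whereas the template solution would use a compile-time lookup, so there will definitely be some performance difference, but probably not as big as you think. It depends heavily on how fast or slow all the various pieces of code are, especially getLJForceAtoms/getCoulombForceAtoms. I would suggest profiling/benchmarking several solutions to see whether there is any significant difference.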
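
For the compile-time lookup mentioned in this exchange, one possible shape is a function template that takes a small functor carrying its own parameter arrays. The sketch below is an assumption-laden illustration, not code from the thread: it uses plain `double` positions instead of the question's `Vec3d`, the formulas inside `LJForce` and `CoulombForce` are placeholders rather than real physics, and `sumForces` is a hypothetical name for the shared inner loop.

```cpp
#include <cassert>

// Placeholder functor: holds the per-atom arrays that only this force needs.
struct LJForce {
    const double* C6;
    const double* C12;
    double operator()(double r, int i) const { return C12[i] * r - C6[i]; }
};

// Placeholder functor with a different set of parameters.
struct CoulombForce {
    const double* kQQs;
    double operator()(double r, int i) const { return kQQs[i] * r; }
};

// Shared loop body: the "one line that differs" is the call to force(),
// whose target is known at compile time and can therefore be inlined.
template <typename Force>
double sumForces(int natom, const double* Rs, double rProbe, Force force) {
    assert(natom >= 0);
    double f = 0.0;
    for (int i = 0; i < natom; ++i) {
        f += force(Rs[i] - rProbe, i);
    }
    return f;
}
```

Because each functor bundles its own arrays, the loop signature stays the same no matter how many parameters a particular force needs, for example `sumForces(natom, Rs, rProbe, LJForce{C6, C12})` versus `sumForces(natom, Rs, rProbe, CoulombForce{kQQs})`.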