【Question title】: Fast value lookup in vectors: MATLAB code comparison
【Posted】: 2016-09-03 11:57:46
【Question description】:

I am using MATLAB to look up values in two vectors, OCT_EXP and OCT_LOG, from two input values u and v. The output val is defined by the following condition:

if (( u == 0 )||( v == 0 ))
    val = 0;
else
    val = OCT_EXP( OCT_LOG(u) + OCT_LOG(v) + 1);
end

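I tried three approaches: a plain (non-vectorized) version, a vectorized version, and a MEX version. I expected the MEX version to be the fastest, followed by the vectorized one. However, when I measured the running time, the first (non-vectorized) version was the fastest and the vectorized version was the slowest. What is going on in my code? Thanks, everyone.

I want to speed the function up because it will be called many times: 300,000 times.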
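For context, the lookup above has the shape of a multiplication done through logarithm and exponent tables. The sketch below assumes (this is inferred from the listed values, not stated in the post) that OCT_EXP and OCT_LOG are the exponent and logarithm tables of GF(2^8) with field polynomial 285 (0x11D). It regenerates the tables, compares their leading entries with the ones listed in the question, and checks the lookup against a bitwise shift-and-xor product on a sample of input pairs.

```matlab
% Assumption: GF(2^8) exp/log tables for field polynomial 285 (0x11D).
OCT_EXP = zeros(1, 510);
x = 1;
for k = 1:255
    OCT_EXP(k) = x;            % OCT_EXP(k) is the generator 2 raised to k-1
    x = 2 * x;
    if x >= 256
        x = bitxor(x, 285);    % reduce modulo the field polynomial
    end
end
OCT_EXP(256:510) = OCT_EXP(1:255);   % doubled so a sum of two logs never wraps
OCT_LOG = zeros(1, 255);
OCT_LOG(OCT_EXP(1:255)) = 0:254;     % inverse mapping: value -> exponent

% Compare with the leading entries listed in the question
head_ok = isequal(OCT_EXP(1:16), ...
              [1 2 4 8 16 32 64 128 29 58 116 232 205 135 19 38]) ...
          && isequal(OCT_LOG(1:8), [0 1 25 2 50 26 198 3]);

% Compare the table lookup with a shift-and-xor product on sampled pairs
mult_ok = true;
for u = 1:255
    for v = 1:7:255
        a = u; b = v; p = 0;
        while b > 0
            if bitand(b, 1)
                p = bitxor(p, a);      % add (xor) the current multiple of u
            end
            a = 2 * a;
            if a >= 256
                a = bitxor(a, 285);
            end
            b = floor(b / 2);
        end
        mult_ok = mult_ok && (p == OCT_EXP(OCT_LOG(u) + OCT_LOG(v) + 1));
    end
end
```

The `+ 1` in the lookup only compensates for MATLAB's 1-based indexing; the sum of the two logarithms is at most 508, which is why the exponent table holds 510 entries.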

First approach:

function val = gfmult_no_vec( u, v )

OCT_EXP = [ 1, 2, 4, 8, 16, 32, 64, 128, 29, 58, 116, 232, 205, 135, 19, 38,...
   76, 152, 45, 90, 180, 117, 234, 201, 143, 3, 6, 12, 24, 48, 96, 192, 157,...
   39, 78, 156, 37, 74, 148, 53, 106, 212, 181, 119, 238, 193, 159, 35,...
   70, 140, 5, 10, 20, 40, 80, 160, 93, 186, 105, 210, 185, 111, 222,...
   161, 95, 190, 97, 194, 153, 47, 94, 188, 101, 202, 137, 15, 30, 60,...
   120, 240, 253, 231, 211, 187, 107, 214, 177, 127, 254, 225, 223, 163,...
   91, 182, 113, 226, 217, 175, 67, 134, 17, 34, 68, 136, 13, 26, 52,...
   104, 208, 189, 103, 206, 129, 31, 62, 124, 248, 237, 199, 147, 59,...
   118, 236, 197, 151, 51, 102, 204, 133, 23, 46, 92, 184, 109, 218,...
   169, 79, 158, 33, 66, 132, 21, 42, 84, 168, 77, 154, 41, 82, 164, 85,...
   170, 73, 146, 57, 114, 228, 213, 183, 115, 230, 209, 191, 99, 198,...
   145, 63, 126, 252, 229, 215, 179, 123, 246, 241, 255, 227, 219, 171,...
   75, 150, 49, 98, 196, 149, 55, 110, 220, 165, 87, 174, 65, 130, 25,...
   50, 100, 200, 141, 7, 14, 28, 56, 112, 224, 221, 167, 83, 166, 81,...
   162, 89, 178, 121, 242, 249, 239, 195, 155, 43, 86, 172, 69, 138, 9,...
   18, 36, 72, 144, 61, 122, 244, 245, 247, 243, 251, 235, 203, 139, 11,...
   22, 44, 88, 176, 125, 250, 233, 207, 131, 27, 54, 108, 216, 173, 71,...
   142, 1, 2, 4, 8, 16, 32, 64, 128, 29, 58, 116, 232, 205, 135, 19, 38,...
   76, 152, 45, 90, 180, 117, 234, 201, 143, 3, 6, 12, 24, 48, 96, 192,...
   157, 39, 78, 156, 37, 74, 148, 53, 106, 212, 181, 119, 238, 193, 159,...
   35, 70, 140, 5, 10, 20, 40, 80, 160, 93, 186, 105, 210, 185, 111,...
   222, 161, 95, 190, 97, 194, 153, 47, 94, 188, 101, 202, 137, 15, 30,...
   60, 120, 240, 253, 231, 211, 187, 107, 214, 177, 127, 254, 225, 223,...
   163, 91, 182, 113, 226, 217, 175, 67, 134, 17, 34, 68, 136, 13, 26,...
   52, 104, 208, 189, 103, 206, 129, 31, 62, 124, 248, 237, 199, 147,...
   59, 118, 236, 197, 151, 51, 102, 204, 133, 23, 46, 92, 184, 109, 218,...
   169, 79, 158, 33, 66, 132, 21, 42, 84, 168, 77, 154, 41, 82, 164, 85,...
   170, 73, 146, 57, 114, 228, 213, 183, 115, 230, 209, 191, 99, 198,...
   145, 63, 126, 252, 229, 215, 179, 123, 246, 241, 255, 227, 219, 171,...
   75, 150, 49, 98, 196, 149, 55, 110, 220, 165, 87, 174, 65, 130, 25,...
   50, 100, 200, 141, 7, 14, 28, 56, 112, 224, 221, 167, 83, 166, 81,...
   162, 89, 178, 121, 242, 249, 239, 195, 155, 43, 86, 172, 69, 138, 9,...
   18, 36, 72, 144, 61, 122, 244, 245, 247, 243, 251, 235, 203, 139, 11,...
   22, 44, 88, 176, 125, 250, 233, 207, 131, 27, 54, 108, 216, 173, 71,...
   142 ];

OCT_LOG = [ 0, 1, 25, 2, 50, 26, 198, 3, 223, 51, 238, 27, 104, 199, 75, 4,...
   100, 224, 14, 52, 141, 239, 129, 28, 193, 105, 248, 200, 8, 76, 113, 5,...
   138, 101, 47, 225, 36, 15, 33, 53, 147, 142, 218, 240, 18, 130, 69,...
   29, 181, 194, 125, 106, 39, 249, 185, 201, 154, 9, 120, 77, 228, 114,...
   166, 6, 191, 139, 98, 102, 221, 48, 253, 226, 152, 37, 179, 16, 145,...
   34, 136, 54, 208, 148, 206, 143, 150, 219, 189, 241, 210, 19, 92,...
   131, 56, 70, 64, 30, 66, 182, 163, 195, 72, 126, 110, 107, 58, 40,...
   84, 250, 133, 186, 61, 202, 94, 155, 159, 10, 21, 121, 43, 78, 212,...
   229, 172, 115, 243, 167, 87, 7, 112, 192, 247, 140, 128, 99, 13, 103,...
   74, 222, 237, 49, 197, 254, 24, 227, 165, 153, 119, 38, 184, 180,...
   124, 17, 68, 146, 217, 35, 32, 137, 46, 55, 63, 209, 91, 149, 188,...
   207, 205, 144, 135, 151, 178, 220, 252, 190, 97, 242, 86, 211, 171,...
   20, 42, 93, 158, 132, 60, 57, 83, 71, 109, 65, 162, 31, 45, 67, 216,...
   183, 123, 164, 118, 196, 23, 73, 236, 127, 12, 111, 246, 108, 161,...
   59, 82, 41, 157, 85, 170, 251, 96, 134, 177, 187, 204, 62, 90, 203,...
   89, 95, 176, 156, 169, 160, 81, 11, 245, 22, 235, 122, 117, 44, 215,...
   79, 174, 213, 233, 230, 231, 173, 232, 116, 214, 244, 234, 168, 80,...
   88, 175 ];

    if (( u == 0 )||( v == 0 ))
        val = 0;
    else
        val = OCT_EXP( OCT_LOG(u) + OCT_LOG(v) + 1);
    end
end

Second approach: vectorized

function val = gfmult_vec( u, v )

OCT_EXP = [ 1, 2, 4, 8, 16, 32, 64, 128, 29, 58, 116, 232, 205, 135, 19, 38,...
   76, 152, 45, 90, 180, 117, 234, 201, 143, 3, 6, 12, 24, 48, 96, 192, 157,...
   39, 78, 156, 37, 74, 148, 53, 106, 212, 181, 119, 238, 193, 159, 35,...
   70, 140, 5, 10, 20, 40, 80, 160, 93, 186, 105, 210, 185, 111, 222,...
   161, 95, 190, 97, 194, 153, 47, 94, 188, 101, 202, 137, 15, 30, 60,...
   120, 240, 253, 231, 211, 187, 107, 214, 177, 127, 254, 225, 223, 163,...
   91, 182, 113, 226, 217, 175, 67, 134, 17, 34, 68, 136, 13, 26, 52,...
   104, 208, 189, 103, 206, 129, 31, 62, 124, 248, 237, 199, 147, 59,...
   118, 236, 197, 151, 51, 102, 204, 133, 23, 46, 92, 184, 109, 218,...
   169, 79, 158, 33, 66, 132, 21, 42, 84, 168, 77, 154, 41, 82, 164, 85,...
   170, 73, 146, 57, 114, 228, 213, 183, 115, 230, 209, 191, 99, 198,...
   145, 63, 126, 252, 229, 215, 179, 123, 246, 241, 255, 227, 219, 171,...
   75, 150, 49, 98, 196, 149, 55, 110, 220, 165, 87, 174, 65, 130, 25,...
   50, 100, 200, 141, 7, 14, 28, 56, 112, 224, 221, 167, 83, 166, 81,...
   162, 89, 178, 121, 242, 249, 239, 195, 155, 43, 86, 172, 69, 138, 9,...
   18, 36, 72, 144, 61, 122, 244, 245, 247, 243, 251, 235, 203, 139, 11,...
   22, 44, 88, 176, 125, 250, 233, 207, 131, 27, 54, 108, 216, 173, 71,...
   142, 1, 2, 4, 8, 16, 32, 64, 128, 29, 58, 116, 232, 205, 135, 19, 38,...
   76, 152, 45, 90, 180, 117, 234, 201, 143, 3, 6, 12, 24, 48, 96, 192,...
   157, 39, 78, 156, 37, 74, 148, 53, 106, 212, 181, 119, 238, 193, 159,...
   35, 70, 140, 5, 10, 20, 40, 80, 160, 93, 186, 105, 210, 185, 111,...
   222, 161, 95, 190, 97, 194, 153, 47, 94, 188, 101, 202, 137, 15, 30,...
   60, 120, 240, 253, 231, 211, 187, 107, 214, 177, 127, 254, 225, 223,...
   163, 91, 182, 113, 226, 217, 175, 67, 134, 17, 34, 68, 136, 13, 26,...
   52, 104, 208, 189, 103, 206, 129, 31, 62, 124, 248, 237, 199, 147,...
   59, 118, 236, 197, 151, 51, 102, 204, 133, 23, 46, 92, 184, 109, 218,...
   169, 79, 158, 33, 66, 132, 21, 42, 84, 168, 77, 154, 41, 82, 164, 85,...
   170, 73, 146, 57, 114, 228, 213, 183, 115, 230, 209, 191, 99, 198,...
   145, 63, 126, 252, 229, 215, 179, 123, 246, 241, 255, 227, 219, 171,...
   75, 150, 49, 98, 196, 149, 55, 110, 220, 165, 87, 174, 65, 130, 25,...
   50, 100, 200, 141, 7, 14, 28, 56, 112, 224, 221, 167, 83, 166, 81,...
   162, 89, 178, 121, 242, 249, 239, 195, 155, 43, 86, 172, 69, 138, 9,...
   18, 36, 72, 144, 61, 122, 244, 245, 247, 243, 251, 235, 203, 139, 11,...
   22, 44, 88, 176, 125, 250, 233, 207, 131, 27, 54, 108, 216, 173, 71,...
   142 ];

OCT_LOG = [ 0, 1, 25, 2, 50, 26, 198, 3, 223, 51, 238, 27, 104, 199, 75, 4,...
   100, 224, 14, 52, 141, 239, 129, 28, 193, 105, 248, 200, 8, 76, 113, 5,...
   138, 101, 47, 225, 36, 15, 33, 53, 147, 142, 218, 240, 18, 130, 69,...
   29, 181, 194, 125, 106, 39, 249, 185, 201, 154, 9, 120, 77, 228, 114,... 
   166, 6, 191, 139, 98, 102, 221, 48, 253, 226, 152, 37, 179, 16, 145,...
   34, 136, 54, 208, 148, 206, 143, 150, 219, 189, 241, 210, 19, 92,...
   131, 56, 70, 64, 30, 66, 182, 163, 195, 72, 126, 110, 107, 58, 40,...
   84, 250, 133, 186, 61, 202, 94, 155, 159, 10, 21, 121, 43, 78, 212,...
   229, 172, 115, 243, 167, 87, 7, 112, 192, 247, 140, 128, 99, 13, 103,...
   74, 222, 237, 49, 197, 254, 24, 227, 165, 153, 119, 38, 184, 180,...
   124, 17, 68, 146, 217, 35, 32, 137, 46, 55, 63, 209, 91, 149, 188,...
   207, 205, 144, 135, 151, 178, 220, 252, 190, 97, 242, 86, 211, 171,...
   20, 42, 93, 158, 132, 60, 57, 83, 71, 109, 65, 162, 31, 45, 67, 216,...
   183, 123, 164, 118, 196, 23, 73, 236, 127, 12, 111, 246, 108, 161,...
   59, 82, 41, 157, 85, 170, 251, 96, 134, 177, 187, 204, 62, 90, 203,...
   89, 95, 176, 156, 169, 160, 81, 11, 245, 22, 235, 122, 117, 44, 215,...
   79, 174, 213, 233, 230, 231, 173, 232, 116, 214, 244, 234, 168, 80,...
   88, 175 ];

    uv0 =  (~(( u == 0 )|( v == 0 )));
    val = zeros(size(u));
    val(uv0) = OCT_EXP( OCT_LOG(u(uv0)) + OCT_LOG(v(uv0)) + 1);
end

Last approach: MEX code

#include "mex.h"
double look_up(double u, double v)
{
   double OCT_EXP [510] = { 1, 2, 4, 8, 16, 32, 64, 128, 29, 58, 116, 232, 205, 135, 19, 38,
   76, 152, 45, 90, 180, 117, 234, 201, 143, 3, 6, 12, 24, 48, 96, 192, 157,
   39, 78, 156, 37, 74, 148, 53, 106, 212, 181, 119, 238, 193, 159, 35,
   70, 140, 5, 10, 20, 40, 80, 160, 93, 186, 105, 210, 185, 111, 222,
   161, 95, 190, 97, 194, 153, 47, 94, 188, 101, 202, 137, 15, 30, 60,
   120, 240, 253, 231, 211, 187, 107, 214, 177, 127, 254, 225, 223, 163,
   91, 182, 113, 226, 217, 175, 67, 134, 17, 34, 68, 136, 13, 26, 52,
   104, 208, 189, 103, 206, 129, 31, 62, 124, 248, 237, 199, 147, 59,
   118, 236, 197, 151, 51, 102, 204, 133, 23, 46, 92, 184, 109, 218,
   169, 79, 158, 33, 66, 132, 21, 42, 84, 168, 77, 154, 41, 82, 164, 85,
   170, 73, 146, 57, 114, 228, 213, 183, 115, 230, 209, 191, 99, 198,
   145, 63, 126, 252, 229, 215, 179, 123, 246, 241, 255, 227, 219, 171,
   75, 150, 49, 98, 196, 149, 55, 110, 220, 165, 87, 174, 65, 130, 25,
   50, 100, 200, 141, 7, 14, 28, 56, 112, 224, 221, 167, 83, 166, 81,
   162, 89, 178, 121, 242, 249, 239, 195, 155, 43, 86, 172, 69, 138, 9,
   18, 36, 72, 144, 61, 122, 244, 245, 247, 243, 251, 235, 203, 139, 11,
   22, 44, 88, 176, 125, 250, 233, 207, 131, 27, 54, 108, 216, 173, 71,
   142, 1, 2, 4, 8, 16, 32, 64, 128, 29, 58, 116, 232, 205, 135, 19, 38,
   76, 152, 45, 90, 180, 117, 234, 201, 143, 3, 6, 12, 24, 48, 96, 192,
   157, 39, 78, 156, 37, 74, 148, 53, 106, 212, 181, 119, 238, 193, 159,
   35, 70, 140, 5, 10, 20, 40, 80, 160, 93, 186, 105, 210, 185, 111,
   222, 161, 95, 190, 97, 194, 153, 47, 94, 188, 101, 202, 137, 15, 30,
   60, 120, 240, 253, 231, 211, 187, 107, 214, 177, 127, 254, 225, 223,
   163, 91, 182, 113, 226, 217, 175, 67, 134, 17, 34, 68, 136, 13, 26,
   52, 104, 208, 189, 103, 206, 129, 31, 62, 124, 248, 237, 199, 147,
   59, 118, 236, 197, 151, 51, 102, 204, 133, 23, 46, 92, 184, 109, 218,
   169, 79, 158, 33, 66, 132, 21, 42, 84, 168, 77, 154, 41, 82, 164, 85,
   170, 73, 146, 57, 114, 228, 213, 183, 115, 230, 209, 191, 99, 198,
   145, 63, 126, 252, 229, 215, 179, 123, 246, 241, 255, 227, 219, 171,
   75, 150, 49, 98, 196, 149, 55, 110, 220, 165, 87, 174, 65, 130, 25,
   50, 100, 200, 141, 7, 14, 28, 56, 112, 224, 221, 167, 83, 166, 81,
   162, 89, 178, 121, 242, 249, 239, 195, 155, 43, 86, 172, 69, 138, 9,
   18, 36, 72, 144, 61, 122, 244, 245, 247, 243, 251, 235, 203, 139, 11,
   22, 44, 88, 176, 125, 250, 233, 207, 131, 27, 54, 108, 216, 173, 71,
   142 };

   double OCT_LOG[255] = { 0, 1, 25, 2, 50, 26, 198, 3, 223, 51, 238, 27, 104, 199, 75, 4,
   100, 224, 14, 52, 141, 239, 129, 28, 193, 105, 248, 200, 8, 76, 113, 5,
   138, 101, 47, 225, 36, 15, 33, 53, 147, 142, 218, 240, 18, 130, 69,
   29, 181, 194, 125, 106, 39, 249, 185, 201, 154, 9, 120, 77, 228, 114, 
   166, 6, 191, 139, 98, 102, 221, 48, 253, 226, 152, 37, 179, 16, 145,
   34, 136, 54, 208, 148, 206, 143, 150, 219, 189, 241, 210, 19, 92,
   131, 56, 70, 64, 30, 66, 182, 163, 195, 72, 126, 110, 107, 58, 40,
   84, 250, 133, 186, 61, 202, 94, 155, 159, 10, 21, 121, 43, 78, 212,
   229, 172, 115, 243, 167, 87, 7, 112, 192, 247, 140, 128, 99, 13, 103,
   74, 222, 237, 49, 197, 254, 24, 227, 165, 153, 119, 38, 184, 180,
   124, 17, 68, 146, 217, 35, 32, 137, 46, 55, 63, 209, 91, 149, 188,
   207, 205, 144, 135, 151, 178, 220, 252, 190, 97, 242, 86, 211, 171,
   20, 42, 93, 158, 132, 60, 57, 83, 71, 109, 65, 162, 31, 45, 67, 216,
   183, 123, 164, 118, 196, 23, 73, 236, 127, 12, 111, 246, 108, 161,
   59, 82, 41, 157, 85, 170, 251, 96, 134, 177, 187, 204, 62, 90, 203,
   89, 95, 176, 156, 169, 160, 81, 11, 245, 22, 235, 122, 117, 44, 215,
   79, 174, 213, 233, 230, 231, 173, 232, 116, 214, 244, 234, 168, 80,
   88, 175 };

    if (( u == 0 )||( v == 0 ))
        return 0;
    else
    {
        /* C-style casts so the file compiles as C as well as C++ */
        int index = (int)OCT_LOG[(int)u - 1] + (int)OCT_LOG[(int)v - 1] + 1;
        return OCT_EXP [index-1];
    }

}


void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double *u, *v, *uv;
    int mrows, ncols;
    plhs[0] = mxCreateDoubleMatrix(1,1, mxREAL);
    /* Assign pointers to each input and output. */
     u = mxGetPr(prhs[0]);    
     v = mxGetPr(prhs[1]);
     uv = mxGetPr(plhs[0]);
    *uv = look_up(*u, *v);
}

Timing code

function main
    function test1()
        for i=1:200
            for j=1:200
                gfmult_no_vec(i,j);
            end
        end
    end

    function test2()
        for i=1:200
            for j=1:200
                gfmult_vec(i,j);
            end
        end
    end

    function test3()
        for i=1:200
            for j=1:200
                gfmult_mex(i,j);
            end
        end
    end
f1=@()test1();
t1=timeit(f1)

f2=@()test2();
t2=timeit(f2)

f3=@()test3();
t3=timeit(f3)
end

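Reported times: t1 = 0.1934, t2 = 1.1739, t3 = 0.3584

【Discussion】:

  • You should not expect MEX to be the best; a fully vectorized approach is always faster. However, none of the methods in your code is fully vectorized, because you use loops in all of them.
  • None of these is vectorized. The function is called with 2 scalars as arguments, which means u and v will be scalars. As a result, you do a lot of unnecessary work in the second version.
  • Also, u and v will never be zero (they come from 1:200), so why test for it?
  • Thank you all. How can I improve my code?
  • I only use the for loops for testing, because this function will be called many times. We could control the zeros with random numbers, but it doesn't matter.

Tags: performance matlab vectorization mex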


【Solution 1】:

This solution takes 0.0006 seconds on my computer (and it is vectorized):

u = randi(5,200,1)-1; % some arbitrary data including zeros
v = randi(5,200,1)-1; % some arbitrary data including zeros
[U,V] = ndgrid(u(u~=0),v(v~=0)); % make all possible combinations of u and v
val = zeros(length(u),length(v)); % initialize the output size, in case the last value in u or v is zero.
f = @(u,v) OCT_EXP(OCT_LOG(u)+OCT_LOG(v)+1);
val((u~=0),(v~=0)) = f(U,V);

Now val(i,j) = OCT_EXP(OCT_LOG(u(i))+OCT_LOG(v(j))+1) if neither u(i) nor v(j) is zero, and val(i,j) = 0 otherwise.
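One way to gain confidence in the vectorized form is to compare it with the element-by-element rule from the question. The sketch below is only a check under stated assumptions: the tables are regenerated as GF(2^8) tables for polynomial 285 (an inference from the listed values; in practice the literal tables from the question would be used), and the inputs are a deterministic stand-in for the randi call so that the run is reproducible.

```matlab
% Assumption: GF(2^8) exp/log tables for field polynomial 285 (0x11D).
OCT_EXP = zeros(1, 510);
x = 1;
for k = 1:255
    OCT_EXP(k) = x;
    x = 2 * x;
    if x >= 256
        x = bitxor(x, 285);
    end
end
OCT_EXP(256:510) = OCT_EXP(1:255);
OCT_LOG = zeros(1, 255);
OCT_LOG(OCT_EXP(1:255)) = 0:254;

% Deterministic inputs in 0..4, standing in for randi(5,200,1)-1
u = mod((0:199)', 5);
v = mod(3 * (0:199)', 5);

% Vectorized form, as in the answer
[U, V] = ndgrid(u(u ~= 0), v(v ~= 0));
val = zeros(length(u), length(v));
val(u ~= 0, v ~= 0) = OCT_EXP(OCT_LOG(U) + OCT_LOG(V) + 1);

% Reference: the scalar rule applied to every pair (u(i), v(j))
ref = zeros(length(u), length(v));
for i = 1:length(u)
    for j = 1:length(v)
        if u(i) ~= 0 && v(j) ~= 0
            ref(i, j) = OCT_EXP(OCT_LOG(u(i)) + OCT_LOG(v(j)) + 1);
        end
    end
end

matches = isequal(val, ref);   % true when both forms agree everywhere
```

Rows and columns of val that correspond to a zero in u or v are never written by the masked assignment, so they keep the zeros from the initialization.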


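If you want gfmult to take scalar inputs, then your first approach seems to be the fastest. However, I would define OCT_EXP and OCT_LOG outside the function and pass them in, rather than allocating them over and over on every call: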

function val = gfmult(OCT_EXP,OCT_LOG,u,v)
if (u==0)||(v==0)
    val = 0;
else
    val = OCT_EXP(OCT_LOG(u)+OCT_LOG(v)+1);
end
end

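On my computer this reduces the running time from 0.21444 seconds (with your version) to 0.158 seconds for 100K iterations. That is not a big improvement (0.05644 seconds), but it may matter if you have millions of calls.

【Discussion】:

  • Thanks EBH. Actually, I only want to improve the code inside the `gfmult` function, which takes two scalar values u, v. I use the two for loops only to measure the time. Your code looks like it makes the inputs of the gfmult function vectorized.
  • I'm not sure I understand what you mean. If u and v are both scalars, then there is nothing to vectorize... But if you are going to run this on many u's and v's, why treat them as scalars?
  • Sorry, "vectorized" may have been the wrong word. Let's consider only the function gfmult. I will call it many times (~100,000 times), so I used two loops to represent that. If we can make a small improvement inside the function for a single call, the gain after 100,000 calls will be large. I only want to improve the inside of gfmult, because it is called from different places and I cannot change all of them to a vectorized form. So I want to keep the function's inputs and outputs and improve its internals. Is that possible?
  • @user3051460 I think you have hit rock bottom, but see my edit for one last point.
  • Using persistent OCT_EXP OCT_LOG makes this function 9x slower than simply passing OCT_EXP and OCT_LOG. The fastest solution is to pass them wherever they are needed. If you want to keep things tidier, you can combine them in a struct (e.g. OCT.EXP = OCT_EXP; OCT.LOG = OCT_LOG), so you only need to pass OCT.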
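The struct idea from the last comment could look roughly like the sketch below. The names gfmult_s and the field layout are hypothetical, and the tables are regenerated as GF(2^8) tables for polynomial 285 only to keep the sketch self-contained (an inference from the listed values; the literal tables from the question would normally be used). The helper is written as an anonymous function just so everything runs as one script; it illustrates the calling convention and makes no claim about speed.

```matlab
% Assumption: GF(2^8) exp/log tables for field polynomial 285 (0x11D).
OCT_EXP = zeros(1, 510);
x = 1;
for k = 1:255
    OCT_EXP(k) = x;
    x = 2 * x;
    if x >= 256
        x = bitxor(x, 285);
    end
end
OCT_EXP(256:510) = OCT_EXP(1:255);
OCT_LOG = zeros(1, 255);
OCT_LOG(OCT_EXP(1:255)) = 0:254;

% Build the tables once and pack them into a single struct
OCT.EXP = OCT_EXP;
OCT.LOG = OCT_LOG;

% Hypothetical helper: same rule as gfmult, with the struct as first argument.
% max(.,1) keeps the index valid when an input is 0; the logical mask then
% forces the result to 0 in that case, so no if/else is needed.
gfmult_s = @(OCT, u, v) (u ~= 0 & v ~= 0) .* ...
    OCT.EXP(OCT.LOG(max(u, 1)) + OCT.LOG(max(v, 1)) + 1);

% Every call site passes the same struct instead of two separate tables
p1 = gfmult_s(OCT, 2, 128);
p2 = gfmult_s(OCT, 0, 5);
p3 = gfmult_s(OCT, 3, 3);
```

In a real code base the function from the answer, saved in its own file and changed to take OCT as its first argument, would be the more natural form; timing it with timeit against the two-argument version would show whether the struct adds measurable overhead on a given machine.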