【Question title】: Performance of the code for Convolution in spatial domain
【Posted】: 2018-08-27 10:10:12
【Question description】:

See: Image convolution in spatial domain

The code below implements linear convolution in the spatial domain.

    public static double[,] ConvolutionSpatial(double[,] paddedImage, double[,] mask, double offset)
    {
        double min = 0.0;
        double max = 1.0;

        double factor = GetFactor(mask);

        int paddedImageWidth = paddedImage.GetLength(0);
        int paddedImageHeight = paddedImage.GetLength(1);

        int maskWidth = mask.GetLength(0);
        int maskHeight = mask.GetLength(1);

        int imageWidth = paddedImageWidth - maskWidth;
        int imageHeight = paddedImageHeight - maskHeight;

        double[,] convolve = new double[imageWidth, imageHeight];

        for (int y = 0; y < imageHeight; y++)
        {
            for (int x = 0; x < imageWidth; x++)
            {
                double sum = Sum(paddedImage, mask, x, y);

                convolve[x, y] = Math.Min(Math.Max((sum / factor) + offset, min), max);
            }
        }

        return convolve;
    }

    public static double Sum(double[,] paddedImage1, double[,] mask1, int startX, int startY)
    {
        double sum = 0;

        int maskWidth = mask1.GetLength(0);
        int maskHeight = mask1.GetLength(1);

        for (int y = startY; y < (startY + maskHeight); y++)
        {
            for (int x = startX; x < (startX + maskWidth); x++)
            {
                double img = paddedImage1[x, y];
                // The mask is indexed in reverse on both axes, i.e. the kernel is flipped (true convolution).
                double msk = mask1[maskWidth - x + startX - 1, maskHeight - y + startY - 1];
                sum = sum + (img * msk);
            }
        }

        return sum;
    }

    public static double GetFactor(double[,] kernel)
    {
        double sum = 0.0;

        int width = kernel.GetLength(0);
        int height = kernel.GetLength(1);

        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                sum += kernel[x, y];
            }
        }

        return (sum == 0) ? 1 : sum;
    }

The performance is as follows:

 image-size     kernel-size    time-elapsed
 -------------------------------------------
 100x100        3x3                    13ms
 512x512        3x3                   291ms
 1018x1280      3x3                  1687ms
 100x100        100x100              4983ms
 512x512        512x512          35624394ms
 1018x1280      1018x1280    [practically unusable]

I have two questions:

  1. Does this look like decent performance?
  2. If not, how can I improve the performance?

【Comments】:

  • You will need to use a Fourier transform; that answer might help. Oh, that is your own question, so you are on the right track.
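
A rough operation count shows why the comment above points toward the Fourier transform. This is only an estimate. The symbols are assumptions that do not appear in the thread: the output image is W×H pixels, the mask is w×h, the padded array handed to an FFT is P×Q, and each inner-loop iteration of `Sum` counts as one multiply-add.

```latex
\begin{align*}
N_{\text{spatial}} &= W \cdot H \cdot w \cdot h \\
512 \times 512 \text{ image},\ 3 \times 3 \text{ mask}: \quad N_{\text{spatial}} &= 512^2 \cdot 9 = 2\,359\,296 \\
100 \times 100 \text{ image},\ 100 \times 100 \text{ mask}: \quad N_{\text{spatial}} &= 100^4 = 10^8 \\
512 \times 512 \text{ image},\ 512 \times 512 \text{ mask}: \quad N_{\text{spatial}} &= 512^4 = 2^{36} \approx 6.9 \times 10^{10} \\
N_{\text{FFT}} &= O\!\left(P \, Q \, \log(P \, Q)\right)
\end{align*}
```

The direct method grows with the mask area, while the FFT-based cost depends only on the padded size. The FFT approach therefore tends to win for large masks, and the direct loop usually stays competitive for small ones such as 3x3.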

Tags: c# performance optimization


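【Solution 1】:
  1. It depends on your final requirements.
  2. The obvious things to do are: replace the multidimensional arrays [,] with jagged arrays [][] and swap the order of the nested loops: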

    for (int x = 0; ...; x++)
    {
        for (int y = 0; ...; y++)
        {
           ...
        }
    }
    

    instead of

    for (int y = 0; ...; y++)
    {
        for (int x = 0; ...; x++)
        {
           ...
        }
    }
    

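In the first case the CPU cache lines are used more efficiently: the left index selects a row whose elements are contiguous in memory, so the inner loop over y walks that row sequentially. In the second case the inner loop jumps to a different row on every iteration, which tends to cause a cache miss each time.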
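
The difference between the two traversal orders can be sketched on a plain sum over a jagged array. This is a minimal illustration, not code from the answer: the class `LoopOrderSketch` and its methods are hypothetical names, and the array is assumed to be rectangular.

```csharp
using System;

public static class LoopOrderSketch
{
    // Builds a width x height jagged array where cell [x][y] holds x + y.
    public static double[][] Create(int width, int height)
    {
        var data = new double[width][];
        for (int x = 0; x < width; x++)
        {
            data[x] = new double[height];
            for (int y = 0; y < height; y++)
            {
                data[x][y] = x + y;
            }
        }
        return data;
    }

    // x outer, y inner: the inner loop walks one contiguous row.
    public static double SumRowWise(double[][] data)
    {
        double sum = 0.0;
        for (int x = 0; x < data.Length; x++)
        {
            double[] row = data[x];
            for (int y = 0; y < row.Length; y++)
            {
                sum += row[y];
            }
        }
        return sum;
    }

    // y outer, x inner: every inner iteration touches a different row.
    // Assumes all rows have the same length as data[0].
    public static double SumColumnWise(double[][] data)
    {
        double sum = 0.0;
        int height = data.Length == 0 ? 0 : data[0].Length;
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < data.Length; x++)
            {
                sum += data[x][y];
            }
        }
        return sum;
    }
}
```

Both methods return the same value; only the memory access pattern differs. How large the timing gap is depends on the array size and the machine, so it is worth measuring rather than assuming.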

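Benchmarking the code with jagged arrays and swapped loops shows roughly a 2x speedup for the 1018x1280 3x3 convolution compared with the original implementation: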

BenchmarkDotNet=v0.11.1, OS=Windows 10.0.17134.167 (1803/April2018Update/Redstone4)
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
Frequency=3515624 Hz, Resolution=284.4445 ns, Timer=TSC
  [Host]    : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3131.0
RyuJitX64 : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3131.0

Job=RyuJitX64  Jit=RyuJit  Platform=X64  

       Method |     Mean |     Error |    StdDev |
------------- |---------:|----------:|----------:|
 BenchmarkOld | 61.82 ms | 0.3979 ms | 0.3527 ms |
 BenchmarkNew | 26.98 ms | 0.1050 ms | 0.0982 ms |

The code is as follows:

    public static double[][] ConvolutionSpatial(double[][] paddedImage, double[][] mask, double offset)
    {
        double min = 0.0;
        double max = 1.0;

        double factor = GetFactor(mask);

        int paddedImageWidth = paddedImage.Length;
        int paddedImageHeight = paddedImage[0].Length;

        int maskWidth = mask.Length;
        int maskHeight = mask[0].Length;

        int imageWidth = paddedImageWidth - maskWidth;
        int imageHeight = paddedImageHeight - maskHeight;

        double[][] convolve = new double[imageWidth][];

        for (int x = 0; x < imageWidth; x++)
        {
            convolve[x] = new double[imageHeight];
            for (int y = 0; y < imageHeight; y++)
            {
                double sum = Sum(paddedImage, mask, x, y);
                convolve[x][y] = Math.Min(Math.Max((sum / factor) + offset, min), max);
            }
        }

        return convolve;
    }

    public static double Sum(double[][] paddedImage1, double[][] mask1, int startX, int startY)
    {
        double sum = 0;

        int maskWidth =  mask1.Length;

        for (int x = startX; x < (startX + maskWidth); x++)
        {
            var maskHeight = mask1[maskWidth - x + startX - 1].Length;
            for (int y = startY; y < (startY + maskHeight); y++)
            {
                double img = paddedImage1[x][y];
                double msk = mask1[maskWidth - x + startX - 1][maskHeight - y + startY - 1];
                sum = sum + (img * msk);
            }
        }

        return sum;
    }

    public static double GetFactor(double[][] kernel)
    {
        double sum = 0.0;

        int width = kernel.Length;

        for (int x = 0; x < width; x++)
        {
            var height = kernel[x].Length;
            for (int y = 0; y < height; y++)
            {
                sum += kernel[x][y];
            }
        }

        return (sum == 0) ? 1 : sum;
    }

I also think it can be improved further by applying SIMD operations.
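
One possible shape for the SIMD idea is sketched below. It is not the answer author's code. It uses `System.Numerics.Vector<double>` for the inner row product, and it assumes the mask row has already been reversed (so that both arrays are read front to back) and that `imageStart + flippedMaskRow.Length` does not exceed the image row length. `SimdSketch` and `DotRow` are hypothetical names. On .NET Framework this type is typically supplied by the System.Numerics.Vectors package.

```csharp
using System;
using System.Numerics;

public static class SimdSketch
{
    // Multiplies imageRow[imageStart .. imageStart + n) element-wise with
    // flippedMaskRow[0 .. n) and returns the sum, where n = flippedMaskRow.Length.
    // The mask row is assumed to be pre-reversed by the caller.
    public static double DotRow(double[] imageRow, int imageStart, double[] flippedMaskRow)
    {
        int length = flippedMaskRow.Length;
        int lanes = Vector<double>.Count; // number of doubles per hardware vector
        double sum = 0.0;
        int i = 0;

        // Vectorized part: process 'lanes' elements per iteration.
        for (; i <= length - lanes; i += lanes)
        {
            var img = new Vector<double>(imageRow, imageStart + i);
            var msk = new Vector<double>(flippedMaskRow, i);
            sum += Vector.Dot(img, msk);
        }

        // Scalar tail for the elements that do not fill a whole vector.
        for (; i < length; i++)
        {
            sum += imageRow[imageStart + i] * flippedMaskRow[i];
        }

        return sum;
    }
}
```

For a 3x3 mask the row is shorter than a typical vector, so only the scalar tail runs and no speedup should be expected; the approach only starts to matter for wider masks. Because the additions happen in a different order, results can differ from the scalar version in the last few bits.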

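【Discussion】:

  • 1. It depends on your final requirements. --- I have only one requirement: make the code run faster.
  • @anonymous There is definitely room for that; see the revised results above. But improving performance is a practically endless subject.
  • improving performance is quite infinite subject. --- I agree, and I am aware of that.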