How can I remove noise from this video sequence?

【Question title】: How can I remove noise from this video sequence?
【Posted】: 2012-08-22 22:32:57
【Question】:

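Hello, I am trying to do some image processing. I use a Microsoft Kinect to detect people in a room. I get the depth data, do some background subtraction, and end up with a video sequence like this when a person enters the scene and walks around: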

http://www.screenr.com/h7f8

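I posted a video so that you can see how the noise behaves. Different colors represent different depth levels, and white represents empty space. As you can see, it is quite noisy, especially the red noise.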

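I need to get rid of everything except the human, as far as possible. When I apply erosion/dilation (with a very large window size) I can remove a lot of the noise, but I wonder whether there are other methods I could use. The red noise in the video is especially hard to remove with erosion/dilation.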

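Some notes:

1) A better background subtraction could be done if we knew when there are no humans in the scene. However, the background subtraction we use is fully automatic: it works even when there are humans in the scene and even when the camera is moved, so this is the best background subtraction we can get right now.

2) The algorithm will run in real time on an embedded system, so the more efficient and simple it is, the better. It does not have to be perfect. That said, complicated signal processing techniques are also welcome (maybe we could use them in another project that does not need embedded, real-time processing).

3) I do not need actual code. Just ideas.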

【Comments】:

It might help to know more about the background subtraction, i.e. why is there noise left in the image?
Which SDK/driver are you using (e.g. MS Kinect SDK, OpenNI, libfreenect, etc.)?

Tags: image-processing signal-processing kinect noise

【Answer 1】:

Assuming you are using the Kinect SDK, this is very simple. I would follow this video for the depth basics, and then do something like this:

    // Requires: using Microsoft.Kinect; using System.Windows.Media; (for Colors)
    private byte[] GenerateColoredBytes(DepthImageFrame depthFrame)
    {

        // Get the raw data from the Kinect: one packed depth value per pixel
        short[] rawDepthData = new short[depthFrame.PixelDataLength];
        depthFrame.CopyPixelDataTo(rawDepthData);

        // Output buffer for the image displayed on screen;
        // it holds the color information for every pixel:
        // Height x Width x 4 bytes (Blue, Green, Red, empty byte)
        byte[] pixels = new byte[depthFrame.Height * depthFrame.Width * 4];

        // Bgr32  - Blue, Green, Red, empty byte
        // Bgra32 - Blue, Green, Red, transparency
        // You must set transparency for Bgra, as .NET defaults a byte to 0 = fully transparent

        // Hardcoded offsets of Blue, Green, Red (BGR) within each pixel
        const int BlueIndex = 0;
        const int GreenIndex = 1;
        const int RedIndex = 2;


        // Loop through all distances and
        // pick a color based on distance
        for (int depthIndex = 0, colorIndex = 0;
            depthIndex < rawDepthData.Length && colorIndex < pixels.Length;
            depthIndex++, colorIndex += 4)
        {
            // Get the player index (requires skeleton tracking to be enabled)
            int player = rawDepthData[depthIndex] & DepthImageFrame.PlayerIndexBitmask;

            // Get the depth value in millimeters
            int depth = rawDepthData[depthIndex] >> DepthImageFrame.PlayerIndexBitmaskWidth;

            // All three depth bands are painted white on purpose; the bands are
            // kept so that each one can be given its own color if desired.

            // .9M or 2.95'
            if (depth <= 900)
            {
                // We are very close
                pixels[colorIndex + BlueIndex] = Colors.White.B;
                pixels[colorIndex + GreenIndex] = Colors.White.G;
                pixels[colorIndex + RedIndex] = Colors.White.R;
            }
            // .9M - 2M or 2.95' - 6.56'
            else if (depth < 2000)
            {
                // We are a bit further away
                pixels[colorIndex + BlueIndex] = Colors.White.B;
                pixels[colorIndex + GreenIndex] = Colors.White.G;
                pixels[colorIndex + RedIndex] = Colors.White.R;
            }
            // 2M+ or 6.56'+
            // (plain else, so that a depth of exactly 2000 is not left uncolored)
            else
            {
                // We are the farthest
                pixels[colorIndex + BlueIndex] = Colors.White.B;
                pixels[colorIndex + GreenIndex] = Colors.White.G;
                pixels[colorIndex + RedIndex] = Colors.White.R;
            }


            //// Equal coloring for monochromatic histogram
            //byte intensity = CalculateIntensityFromDepth(depth);
            //pixels[colorIndex + BlueIndex] = intensity;
            //pixels[colorIndex + GreenIndex] = intensity;
            //pixels[colorIndex + RedIndex] = intensity;


            // Color all players "gold"
            if (player > 0)
            {
                pixels[colorIndex + BlueIndex] = Colors.Gold.B;
                pixels[colorIndex + GreenIndex] = Colors.Gold.G;
                pixels[colorIndex + RedIndex] = Colors.Gold.R;
            }

        }


        return pixels;
    }

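This turns everything except the humans white, and the humans gold. Hope this helps!

EDIT

I know you did not necessarily want code, just ideas, so I would say: find an algorithm that finds the depth, one that finds the number of humans, and color everything except the humans white. I have provided all of this, but I did not know whether you knew what was going on. I also have an image of the final program.

Note: I added the second depth frame for perspective.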
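To make the bit layout used in the C# code above concrete, here is a minimal Python sketch of the same idea on plain integers. It assumes the Kinect SDK v1 packing (player index in the low 3 bits, depth in millimeters in the remaining bits); the constant values and the helper names `unpack`, `color_players` and `count_players` are illustrative assumptions, not part of the SDK.

```python
# Assumed Kinect SDK v1 packing of each 16-bit depth sample:
# low 3 bits = player index (0 = no player), remaining bits = depth in mm.
PLAYER_INDEX_BITMASK = 0x7       # assumed value of DepthImageFrame.PlayerIndexBitmask
PLAYER_INDEX_BITMASK_WIDTH = 3   # assumed value of DepthImageFrame.PlayerIndexBitmaskWidth

WHITE = (255, 255, 255)  # RGB
GOLD = (255, 215, 0)     # RGB, same color as WPF's Colors.Gold


def unpack(raw):
    """Split one packed sample into (depth_mm, player_index)."""
    raw &= 0xFFFF  # treat the sample as an unsigned 16-bit value
    return raw >> PLAYER_INDEX_BITMASK_WIDTH, raw & PLAYER_INDEX_BITMASK


def color_players(raw_depth):
    """Return one RGB tuple per sample: gold for players, white otherwise."""
    return [GOLD if unpack(raw)[1] > 0 else WHITE for raw in raw_depth]


def count_players(raw_depth):
    """Number of distinct non-zero player indices present in the frame."""
    return len({unpack(raw)[1] for raw in raw_depth} - {0})
```

For example, a sample built as `(1500 << 3) | 2` unpacks to a depth of 1500 mm belonging to player 2, so `color_players` paints it gold while a background sample such as `1200 << 3` stays white.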

【Discussion】:

【Answer 2】:

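I may be wrong (I would need the unprocessed video to tell), but I would tend to say that you are trying to get rid of illumination changes.

This is what makes people detection really difficult in "real" environments.

You can check out this other SO question for some links.

I used to detect humans in real time in the same configuration as yours, but with monocular vision. In my case, a really good descriptor was LBP (local binary patterns), which is mainly used for texture classification. It is quite simple to put into practice (there are implementations all over the web).

The LBPs were basically used to define an area of interest where movement is detected, so that I could process only part of the image and get rid of all that noise.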
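As a rough illustration of the descriptor, here is a minimal Python sketch of the basic 3x3 LBP code on a grayscale image stored as a list of rows. The `changed_region` helper is only one hypothetical way to turn LBP codes from two frames into an area of interest; the answer does not say how its region was actually derived, and all names here are assumptions.

```python
# Neighbour offsets (dy, dx), clockwise starting from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x):
    """8-bit LBP code of interior pixel (y, x): bit i is set if neighbour i >= centre."""
    centre = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(NEIGHBOURS):
        if img[y + dy][x + dx] >= centre:
            code |= 1 << bit
    return code


def lbp_image(img):
    """LBP codes for all interior pixels; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = lbp_code(img, y, x)
    return out


def changed_region(prev_codes, curr_codes):
    """Hypothetical helper: bounding box (top, left, bottom, right) of all
    pixels whose LBP code differs between two frames, or None if none differ."""
    ys, xs = [], []
    for y, (prev_row, curr_row) in enumerate(zip(prev_codes, curr_codes)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            if p != c:
                ys.append(y)
                xs.append(x)
    if not ys:
        return None
    return min(ys), min(xs), max(ys), max(xs)
```

Because each code only records whether a neighbour is at least as bright as the centre, adding a constant brightness offset to the whole image leaves the codes unchanged, which is why the descriptor is attractive when illumination changes are the problem.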

This paper, for example, uses LBP for grayscale correction of images.

Hope that brings some new ideas.

【Discussion】:

【Answer 3】:

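Just my two cents:

If you do not mind using the SDK for this, you can very easily keep only the person pixels using the PlayerIndexBitmask, as Outlaw Lemur shows.

Now, you may not want to rely on the drivers for that and would rather do it at the image processing level. An approach that we tried in a project, and that worked pretty well, was contour-based. We started with a background subtraction, then detected the largest contour in the image, assuming that this was the person (since the noise that remained was usually very small blobs), filled that contour and kept it. You could also use some kind of median filtering as a first pass.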
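The pipeline described above can be sketched in a few lines of Python on a binary foreground mask (a list of rows of 0/1, as it might come out of background subtraction). Note that this sketch substitutes 4-connected component labelling for contour detection and filling, so holes inside the kept blob are not filled; the function names are illustrative assumptions.

```python
from collections import deque


def median3x3(mask):
    """3x3 median of a 0/1 mask: a pixel becomes 1 if at least 5 of the
    9 cells around it are 1. Cells outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ones = sum(
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            )
            out[y][x] = 1 if ones >= 5 else 0
    return out


def largest_blob(mask):
    """Keep only the largest 4-connected foreground region of a 0/1 mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first flood fill starting from this unvisited pixel.
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) > len(best):
                best = blob
    out = [[0] * w for _ in range(h)]
    for y, x in best:
        out[y][x] = 1
    return out
```

A typical use would be `person = largest_blob(median3x3(mask))`: the median pass removes isolated specks cheaply, and the blob pass discards whatever small regions survive. Both passes touch each pixel a constant number of times, which matters for the embedded, real-time constraint in the question.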

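Of course, this is not perfect, nor suitable in every case, and there are probably a lot of better methods. I am just throwing it out there in case it helps you come up with ideas.

【Discussion】:

【Answer 4】: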

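Have a look at eyesweb.

It is a design platform that supports Kinect devices, and you can apply noise filters to the outputs. It is a very useful and simple tool for designing multimodal systems.

【Discussion】: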