【Question title】: Census transform in Python OpenCV
【Posted】: 2018-12-03 00:52:24
【Question】:

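I have started using OpenCV and Python in a project related to stereo vision. I found a page of documentation about the Census Transform in C++ with OpenCV.

Does anyone know whether the Python implementation has a similar function?

(for example, cv2.nameofthefunction)

Thanks, everyone!

Edit: PM 2Ring's excellent solution (thanks again) works with OpenCV after one small change. Instead of using Image.open, I wrote: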

img = cv2.imread('img.png')
# some minor changes I needed, such as selecting some ROIs and storing them in img2[j]
# then a for loop in which I wrote
src_img = img2[j]
h, w = src_img.shape  # requires a single-channel (grayscale) ROI

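Compared with PIL's `size` attribute, NumPy's `shape` returns the dimensions in the opposite order: (h, w) instead of (w, h). I then pasted in the rest of PM 2Ring's code, and it worked very well.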
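
The ordering difference can be checked without loading a file. This is a minimal sketch that assumes only NumPy: `cv2.imread` returns arrays of this kind, so a grayscale image that is `w` pixels wide and `h` pixels high has `shape == (h, w)`, whereas PIL's `Image.size` reports `(w, h)`. The array contents and variable names are made up for the illustration.

```python
import numpy as np

# Stand-in for a grayscale image as returned by cv2.imread(..., cv2.IMREAD_GRAYSCALE):
# 4 rows (height) by 6 columns (width). The pixel values are arbitrary.
h, w = 4, 6
img = np.zeros((h, w), dtype=np.uint8)

# NumPy reports (rows, columns), i.e. (height, width).
rows, cols = img.shape

# PIL's Image.size would describe the same image as (width, height).
pil_style_size = (cols, rows)

# A colour image loaded with the default flag has a third axis for the channels,
# so `h, w = img.shape` only unpacks cleanly for single-channel arrays.
colour = np.zeros((h, w, 3), dtype=np.uint8)
height, width = colour.shape[:2]
```

Slicing with `shape[:2]` is one way to read the height and width regardless of the number of channels.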

【Comments】:

    Tags: python opencv


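    【Answer 1】:

    I don't use OpenCV, and I don't know whether an implementation of the Census Transform already exists. However, it is easy to implement with Numpy.

    Here is a simple demo that uses PIL to load the image and to convert the array data back into an image.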

    #!/usr/bin/env python
    
    ''' The Census Transform
    
        Scan an 8 bit greyscale image with a 3x3 window
        At each scan position create an 8 bit number by comparing the value
        of the centre pixel in the 3x3 window with that of its 8 neighbours.
        The bit is set to 1 if the outer pixel >= the centre pixel
    
        See http://stackoverflow.com/questions/38265364/census-transform-in-python-opencv
    
        Written by PM 2Ring 2016.07.09
    '''
    
    import numpy as np
    from PIL import Image
    
    iname = 'Glasses0S.png'
    oname = 'Glasses0S_census.png'
    
    #Get the source image
    src_img = Image.open(iname)
    src_img.show()
    
    w, h = src_img.size
    print('image size: %d x %d = %d' % (w, h, w * h))
    print('image mode:', src_img.mode)
    
    #Convert image to Numpy array
    src_bytes = np.asarray(src_img)
    
    #Initialize output array
    census = np.zeros((h-2, w-2), dtype='uint8')
    
    #centre pixels, which are offset by (1, 1)
    cp = src_bytes[1:h-1, 1:w-1]
    
    #offsets of non-central pixels 
    offsets = [(u, v) for v in range(3) for u in range(3) if not u == 1 == v]
    
    #Do the pixel comparisons
    for u,v in offsets:
        census = (census << 1) | (src_bytes[v:v+h-2, u:u+w-2] >= cp)
    
    #Convert transformed data to image
    out_img = Image.fromarray(census)
    out_img.show()
    out_img.save(oname)
    

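    [Source image]

    [Output image]

    The original full-colour Glasses image was created by Gilles Tran using POV-Ray and is in the public domain. It can be found on Wikipedia.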
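
To see what the shifted-slice loop computes, here is a small self-contained sketch that runs the same comparisons on a made-up 3x3 array and checks the result against a plain per-pixel loop. The helper names `census_3x3` and `census_naive` are mine, not part of the answer, and the sample values are arbitrary.

```python
import numpy as np

def census_3x3(src):
    """Vectorised 3x3 census transform; the output is 2 pixels smaller in each dimension."""
    h, w = src.shape
    census = np.zeros((h - 2, w - 2), dtype=np.uint8)
    centre = src[1:h - 1, 1:w - 1]
    # Visit the 8 neighbours row by row, skipping the centre at (1, 1).
    offsets = [(u, v) for v in range(3) for u in range(3) if not u == 1 == v]
    for u, v in offsets:
        neighbour = src[v:v + h - 2, u:u + w - 2]
        census = (census << 1) | (neighbour >= centre)
    return census

def census_naive(src):
    """The same transform written as explicit loops, for checking."""
    h, w = src.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            bits = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    bits = (bits << 1) | int(src[y + dy, x + dx] >= src[y, x])
            out[y - 1, x - 1] = bits
    return out

# Made-up data: the centre pixel 100 is compared with its 8 neighbours.
tiny = np.array([[50, 70, 80],
                 [90, 100, 110],
                 [60, 120, 150]], dtype=np.uint8)
print(np.binary_repr(census_3x3(tiny)[0, 0], width=8))  # 00001011
```

Only 110, 120 and 150 are greater than or equal to 100, so only the bits for the right, bottom and bottom-right neighbours are set.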

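    【Comments】:

    • Dear PM, I'm not very familiar with either Python or Stack Overflow... Answer accepted! I ran some tests (because I have to use OpenCV in the rest of my script) and found a way to incorporate your solution.
    • @marcoresk Great! I just did a quick Google search and found that OpenCV has some Hamming distance functions, so it should be easy to get the Hamming distance between the Census Transforms of a stereo pair. Of course, that is also easy to do with Numpy.
    • That will be my next (serious) attempt, but it even works with OpenCV's stereo algorithm (SGBM), as if the census images were a real stereo pair!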
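
A minimal NumPy sketch of that Hamming distance, assuming two 8-bit census images of the same shape (such as those produced by the 3x3 transform above): XOR the codes and count the set bits per pixel. The function name `hamming_distance` and the sample values are illustrative only.

```python
import numpy as np

def hamming_distance(census_a, census_b):
    """Per-pixel count of differing bits between two uint8 census images."""
    diff = np.bitwise_xor(census_a, census_b)
    # unpackbits expands each byte into 8 bits along the new last axis.
    bits = np.unpackbits(diff[..., np.newaxis], axis=-1)
    return bits.sum(axis=-1)

a = np.array([[0b00001011, 0b11110000]], dtype=np.uint8)
b = np.array([[0b00010011, 0b11110000]], dtype=np.uint8)
print(hamming_distance(a, b))  # [[2 0]]
```

On the OpenCV side, `cv2.norm(a, b, cv2.NORM_HAMMING)` should give the total over the whole array rather than a per-pixel map, which is why the per-pixel version is written with NumPy here.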
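    【Answer 2】:

    Python 3 code using numpy and OpenCV. It adds the ability to handle different window sizes and to generate the cost difference between stereo images. Below is an example of the transform applied to the Middlebury 2014 stereo image set (Playroom-perfect).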

    import math
    
    import cv2
    import numpy as np
    
    def transform(image, window_size=3):
        """
        Take a gray scale image and, for each pixel, compare it with every other pixel of the window centred on it,
        generating a bit value of length window_size ** 2 - 1. window_size of 3 produces bit length of 8, and 5 produces 24.
    
        The image gets a border of zero padded pixels half the window size wide.
    
        Bits are set to one if the pixel under consideration is greater than or equal to the center, otherwise zero.
    
        :param image: numpy.ndarray(shape=(MxN), dtype=numpy.uint8)
        :param window_size: int odd-valued, at most 7 so that the bits fit into 64 bits
        :return: numpy.ndarray(shape=(MxN), dtype=numpy.uint64)
        >>> image = np.array([ [50, 70, 80], [90, 100, 110], [60, 120, 150] ], dtype=np.uint8)
        >>> np.binary_repr(transform(image)[1, 1])
        '1011'
        >>> image = np.array([ [60, 75, 85], [115, 110, 105], [70, 130, 170] ], dtype=np.uint8)
        >>> np.binary_repr(transform(image)[1, 1])
        '10011'
        """
        half_window_size = window_size // 2
    
        image = cv2.copyMakeBorder(image, top=half_window_size, left=half_window_size, right=half_window_size, bottom=half_window_size, borderType=cv2.BORDER_CONSTANT, value=0)
        rows, cols = image.shape
        # uint64 holds up to 64 comparison bits; uint8 would overflow for windows larger than 3x3
        census = np.zeros((rows - half_window_size * 2, cols - half_window_size * 2), dtype=np.uint64)
        center_pixels = image[half_window_size:rows - half_window_size, half_window_size:cols - half_window_size]
    
        # every (row, col) offset inside the window except the center itself
        offsets = [(row, col) for row in range(window_size) for col in range(window_size) if not row == half_window_size == col]
        for (row, col) in offsets:
            census = (census << np.uint64(1)) | (image[row:row + rows - half_window_size * 2, col:col + cols - half_window_size * 2] >= center_pixels)
        return census
    
    def column_cost(left_col, right_col):
        """
        Column-wise Hamming edit distance
        Also see https://www.youtube.com/watch?v=kxsvG4sSuvA&feature=youtu.be&t=1032
        :param left_col: numpy.ndarray(shape(Mx1), dtype=numpy.uint64)
        :param right_col: numpy.ndarray(shape(Mx1), dtype=numpy.uint64)
        :return: numpy.ndarray(shape(Mx1), dtype=numpy.uint)
        >>> image = np.array([ [50, 70, 80], [90, 100, 110], [60, 120, 150] ], dtype=np.uint8)
        >>> left = transform(image)
        >>> image = np.array([ [60, 75, 85], [115, 110, 105], [70, 130, 170] ], dtype=np.uint8)
        >>> right = transform(image)
        >>> int(column_cost(left, right)[1, 1])
        2
        """
        xor = np.ascontiguousarray(np.bitwise_xor(left_col, right_col))
        # view each code as its raw bytes, because np.unpackbits only accepts uint8 input
        xor_bytes = xor.view(np.uint8).reshape(xor.shape + (-1,))
        return np.sum(np.unpackbits(xor_bytes, axis=-1), axis=-1)
    
    def cost(left, right, window_size=3, disparity=0):
        """
        Compute cost difference between left and right grayscale images. Disparity value can be used to assist with evaluating stereo
        correspondence.
        :param left: numpy.ndarray(shape=(MxN), dtype=numpy.uint8)
        :param right: numpy.ndarray(shape=(MxN), dtype=numpy.uint8)
        :param window_size: int odd-valued
        :param disparity: int
        :return: numpy.ndarray(shape=(MxN), dtype=numpy.uint8) of Hamming costs; columns left of `disparity` stay zero
        """
        ct_left = transform(left, window_size=window_size)
        ct_right = transform(right, window_size=window_size)
        rows, cols = ct_left.shape
        # Hamming distances are at most 64 here, so uint8 is enough
        C = np.zeros((rows, cols), dtype=np.uint8)
        for col in range(disparity, cols):
            C[:, col] = column_cost(
                ct_left[:, col:col + 1],
                ct_right[:, col - disparity:col - disparity + 1]
            ).reshape(ct_left.shape[0])
        return C
    
    def norm(image):
        # cast to float64 first: the OpenCV 4 bindings generally reject 64-bit integer arrays
        return cv2.normalize(image.astype(np.float64), dst=None, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX).astype(np.uint8)
    
    if __name__ == "__main__":
        # Image set from http://vision.middlebury.edu/stereo/data/scenes2014/
        resize_pct = 0.1
        ndisp = 330 # from calib.txt
        ndisp *= resize_pct
        # load as grayscale
        left = cv2.imread('Playroom-perfect-left.png', cv2.IMREAD_GRAYSCALE)
        right = cv2.imread('Playroom-perfect-right.png', cv2.IMREAD_GRAYSCALE)
        left = cv2.resize(left, dsize=(0, 0), fx=resize_pct, fy=resize_pct)
        right = cv2.resize(right, dsize=(0, 0), fx=resize_pct, fy=resize_pct)
    
        window_size = 5
        ct_left = norm(transform(left, window_size))
        ct_right = norm(transform(right, window_size))
    
        ct_costs = []
        for exponent in range(0, 6):
            disparity = int(ndisp / math.pow(2, exponent))
            print(math.pow(2, exponent), disparity)
            ct_costs.append(norm(cost(left, right, window_size, disparity)))
    
        cv2.imshow('left/right grayscale/census', np.vstack([np.hstack([left, right]), np.hstack([ct_left, ct_right])]))
        cv2.imshow('costs', np.vstack(ct_costs))
        cv2.waitKey(0)
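
One way to use per-disparity cost maps like these is a winner-takes-all search: compute the Hamming cost for every candidate disparity and keep, for each pixel, the disparity with the lowest cost. The sketch below is an illustration under that assumption and is not part of the answer. It handles 8-bit census codes only, and `hamming`, `wta_disparity` and the synthetic data are made up for the example.

```python
import numpy as np

def hamming(a, b):
    """Per-pixel Hamming distance between two uint8 arrays of equal shape."""
    diff = np.bitwise_xor(a, b)
    return np.unpackbits(diff[..., np.newaxis], axis=-1).sum(axis=-1)

def wta_disparity(ct_left, ct_right, max_disparity):
    """Pick, for each pixel, the disparity in [0, max_disparity] with the lowest cost."""
    rows, cols = ct_left.shape
    # Columns with no counterpart in the right image keep a cost above the 8-bit maximum of 8.
    costs = np.full((max_disparity + 1, rows, cols), 9, dtype=np.uint8)
    for d in range(max_disparity + 1):
        # The left pixel at column x is compared with the right pixel at column x - d.
        costs[d, :, d:] = hamming(ct_left[:, d:], ct_right[:, :cols - d])
    return costs.argmin(axis=0)

# Synthetic check: the left codes are the right codes shifted 3 columns to the right.
ct_right = np.arange(80, dtype=np.uint8).reshape(4, 20)
ct_left = np.zeros_like(ct_right)
ct_left[:, 3:] = ct_right[:, :-3]
disparity = wta_disparity(ct_left, ct_right, max_disparity=5)
print(np.unique(disparity[:, 3:]))  # [3]
```

Real images produce far noisier cost maps than this synthetic shift, which is one reason stereo matchers such as SGBM aggregate or smooth the costs before choosing a disparity.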
    
