How can I use scipy's affine_transform to do an arbitrary affine transformation on a color image?

【Question title】: How can I use scipy's affine_transform to do an arbitrary affine transformation on a color image?
【Posted】: 2017-06-21 10:55:15
【Question】:

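My goal is to transform an image so that three source points are mapped to three target points in an empty array. I have already found the correct affine matrix, but I cannot apply the affine transformation to a color image.

More specifically, I am struggling to use the scipy.ndimage.interpolation.affine_transform method correctly. As a related question and its answers point out, affine_transform can be somewhat unintuitive (especially regarding the offset calculation). There, user timday shows how to apply a rotation and a shear to an image and position it in another array, while user geodata gives more background information.

My problem is to generalize the approach shown there (1) to color images and (2) to an arbitrary transformation that I computed myself.

Here is my code (it should run as is on your computer):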

import numpy as np
from scipy import ndimage
import matplotlib.pyplot as plt


def calcAffineMatrix(sourcePoints, targetPoints):
    # For three source- and three target points, find the affine transformation
    # Function works correctly, not part of the question
    A = []
    b = []
    for sp, trg in zip(sourcePoints, targetPoints):
        A.append([sp[0], 0, sp[1], 0, 1, 0])
        A.append([0, sp[0], 0, sp[1], 0, 1])
        b.append(trg[0])
        b.append(trg[1])
    result, resids, rank, s = np.linalg.lstsq(np.array(A), np.array(b))

    a0, a1, a2, a3, a4, a5 = result
    # Ignoring offset here, later use timday's suggested offset calculation
    affineTrafo = np.array([[a0, a1, 0], [a2, a3, 0], [0, 0, 1]], 'd')

    # Testing the correctness of transformation matrix
    for i, _ in enumerate(sourcePoints):
        src = sourcePoints[i]
        src.append(1.)
        trg = targetPoints[i]
        trg.append(1.)
        at = affineTrafo.copy()
        at[2, 0:2] = [a4, a5]
        assert(np.array_equal(np.round(np.array(src).dot(at)), np.array(trg)))
    return affineTrafo


# Prepare source image
sourcePoints = [[162., 112.], [130., 112.], [162., 240.]]
targetPoints = [[180., 102.], [101., 101.], [190., 200.]]
image = np.empty((300, 300, 3), dtype='uint8')
image[:] = 255
# Mark border for better visibility
image[0:2, :] = 0
image[-3:-1, :] = 0
image[:, 0:2] = 0
image[:, -3:-1] = 0
# Mark source points in red
for sp in sourcePoints:
    sp = [int(u) for u in sp]
    image[sp[1] - 5:sp[1] + 5, sp[0] - 5:sp[0] + 5, :] = np.array([255, 0, 0])

# Show image
plt.subplot(3, 1, 1)
plt.imshow(image)

# Prepare array in which the image is placed
array = np.empty((400, 300, 3), dtype='uint8')
array[:] = 255
a2 = array.copy()
# Mark target points in blue
for tp in targetPoints:
    tp = [int(u) for u in tp]
    a2[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]

# Show array
plt.subplot(3, 1, 2)
plt.imshow(a2)

# Next 5 program lines are actually relevant for question:

# Calculate affine matrix
affineTrafo = calcAffineMatrix(sourcePoints, targetPoints)

# This follows the c_in-c_out method proposed in linked stackoverflow issue
# extended for color channel (no translation here)
c_in = np.array([sourcePoints[0][0], sourcePoints[0][1], 0])
c_out = np.array([targetPoints[0][0], targetPoints[0][1], 0])
offset = (c_in - np.dot(c_out, affineTrafo))

# Affine transform!
ndimage.interpolation.affine_transform(image, affineTrafo, order=2, offset=offset,
                                       output=array, output_shape=array.shape,
                                       cval=255)
# Mark blue target points in array, expected to be above red source points
for tp in targetPoints:
    tp = [int(u) for u in tp]
    array[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]

plt.subplot(3, 1, 3)
plt.imshow(array)

plt.show()

Other approaches I tried include using the inverse, the transpose, or both of affineTrafo:

affineTrafo = np.linalg.inv(affineTrafo)
affineTrafo = affineTrafo.T
affineTrafo = np.linalg.inv(affineTrafo.T)
affineTrafo = np.linalg.inv(affineTrafo).T

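In his answer, geodata shows how to compute the matrix that affine_transform needs for a scaling followed by a rotation:

If one wants a scaling S first and then a rotation R, it holds that T=R*S and therefore T.inv=S.inv*R.inv (note the reversed order).

I tried to replicate that using a matrix decomposition (decomposing my affine transformation into a rotation, a shear, and another rotation):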

u, s, v = np.linalg.svd(affineTrafo[:2,:2])
uInv = np.linalg.inv(u)
sInv = np.linalg.inv(np.diag((s)))
vInv = np.linalg.inv(v)
affineTrafo[:2, :2] = uInv.dot(sInv).dot(vInv)

Again, failure.
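A side note on the SVD attempt above, offered as a minimal sketch rather than a diagnosis of the whole problem: np.linalg.svd returns u, s, vh with A = u @ diag(s) @ vh, so inverting the product reverses the factor order. The snippet above multiplies the inverted factors in the original order, which in general does not give the inverse. The matrix A below is an arbitrary example chosen for illustration, not one taken from the question.

```python
import numpy as np

# Arbitrary invertible 2x2 matrix, used only for illustration.
A = np.array([[2.0, 0.5],
              [-0.3, 1.5]])

# NumPy returns vh (already transposed), so A == u @ np.diag(s) @ vh.
u, s, vh = np.linalg.svd(A)

# Inverse of a product: invert each factor and reverse the order.
reversed_order = np.linalg.inv(vh) @ np.diag(1.0 / s) @ np.linalg.inv(u)

# Same inverted factors, but kept in the original order (as in the attempt above).
original_order = np.linalg.inv(u) @ np.diag(1.0 / s) @ np.linalg.inv(vh)

is_inverse_reversed = np.allclose(reversed_order, np.linalg.inv(A))
is_inverse_original = np.allclose(original_order, np.linalg.inv(A))
print(is_inverse_reversed, is_inverse_original)
```

Only the reversed-order product reproduces np.linalg.inv(A) for this example.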

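In all of my results, this is not (only) an offset problem. The pictures make it clear that the relative positions of the source and target points do not correspond.

I searched the web and stackoverflow but did not find an answer to my problem. Please help me! :)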

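【Comments】:

A related answer of mine may help you understand what this offset is and how to calculate it.
@AlexanderReynolds Thanks, I have read your answer, but the problem starts before the offset. Did you try running the code? You will see that the transformation is entirely wrong, not just the offset. The blue and red points should overlap, but they do not even have the correct relative positioning.
Yes, but I can't tell what is going on. The documentation is very lacking. It is not clear whether the locations are computed by pre- or post-multiplication (and who knows whether the transformation or its inverse is used), when the offset is applied, or how the warped points relate to the coordinates of the target image. I can tell you that you are computing c_in and c_out incorrectly: you will not get the right pixel locations with a 0 at the end (they should be homogeneous points, as my answer says, with a 1 at the end). That is not the main problem, though.
Although this problem can be solved with scipy, I strongly recommend using OpenCV for a task like this. It is two lines of code in OpenCV; using the variable names from your program: affineTrafo = cv2.getAffineTransform(src_pts, trg_pts); array = cv2.warpAffine(image, affineTrafo, array.shape).
Yeah, sorry man, I have been going through this for about an hour and I can't figure out how it works. I have implemented these routines completely from scratch before; I would do the warping and interpolation manually.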
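On the point raised in the comments about pre- versus post-multiplication and forward versus inverse: the scipy documentation states that for each output index o, affine_transform samples the input at np.dot(matrix, o) + offset, with indices in array-axis order (row, column, ...). The following is a sketch under those documented semantics, not the asker's eventual solution. It assumes a forward 2x3 matrix in (x, y) order (the same layout cv2.warpAffine takes), inverts it, swaps the axes to (row, column), and warps each color channel separately. The helper name warp_with_scipy and the use of order=1 are choices made for this illustration.

```python
import numpy as np
from scipy import ndimage


def warp_with_scipy(image, M_xy, output_shape, cval=255):
    """Warp a (rows, cols, channels) image with a forward 2x3 matrix in (x, y) order.

    Hypothetical helper for illustration; it is not part of scipy.
    """
    # Promote the forward map [x', y'] = M_xy @ [x, y, 1] to a 3x3 matrix.
    forward = np.vstack([np.asarray(M_xy, dtype=float), [0.0, 0.0, 1.0]])
    # affine_transform samples input[matrix @ o + offset] for each output
    # index o, so it needs the output -> input map: the inverse.
    inverse_xy = np.linalg.inv(forward)
    # Array indices are (row, col) = (y, x), so swap the two spatial axes.
    swap = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    inverse_rc = swap @ inverse_xy @ swap
    matrix, offset = inverse_rc[:2, :2], inverse_rc[:2, 2]

    out = np.empty(tuple(output_shape) + (image.shape[2],), dtype=image.dtype)
    for c in range(image.shape[2]):  # warp each color channel on its own
        out[..., c] = ndimage.affine_transform(
            image[..., c], matrix, offset=offset,
            output_shape=tuple(output_shape), order=1, cval=cval)
    return out


# Usage with the points from the question, given as (x, y).
source_points = np.array([[162., 112.], [130., 112.], [162., 240.]])
target_points = np.array([[180., 102.], [101., 101.], [190., 200.]])
image = np.full((300, 300, 3), 255, dtype=np.uint8)
for x, y in source_points.astype(int):
    image[y - 5:y + 5, x - 5:x + 5] = [255, 0, 0]  # red source markers

# Three non-collinear point pairs determine the forward matrix exactly.
src_h = np.hstack([source_points, np.ones((3, 1))])
M_xy = np.linalg.solve(src_h, target_points).T  # shape (2, 3)
warped = warp_with_scipy(image, M_xy, output_shape=(400, 300))
```

Looping over channels avoids having to extend the matrix with an identity row and column for the color axis; passing a 3x3 matrix for the full 3-D array would be an alternative design.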

Tags: python numpy image-processing scipy affinetransform

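【Solution 1】:

Thanks to AlexanderReynolds' hint to use another library, I finally got it working. This is of course a workaround; I could not get it to work with scipy's affine_transform, so I used OpenCV's cv2.warpAffine instead. In case this helps anyone else, here is my code: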

import numpy as np
import matplotlib.pyplot as plt
import cv2

# Prepare source image
sourcePoints = [[162., 112.], [130., 112.], [162., 240.]]
targetPoints = [[180., 102.], [101., 101.], [190., 200.]]
image = np.empty((300, 300, 3), dtype='uint8')
image[:] = 255
# Mark border for better visibility
image[0:2, :] = 0
image[-3:-1, :] = 0
image[:, 0:2] = 0
image[:, -3:-1] = 0
# Mark source points in red
for sp in sourcePoints:
    sp = [int(u) for u in sp]
    image[sp[1] - 5:sp[1] + 5, sp[0] - 5:sp[0] + 5, :] = np.array([255, 0, 0])

# Show image
plt.subplot(3, 1, 1)
plt.imshow(image)

# Prepare array in which the image is placed
array = np.empty((400, 300, 3), dtype='uint8')
array[:] = 255
a2 = array.copy()
# Mark target points in blue
for tp in targetPoints:
    tp = [int(u) for u in tp]
    a2[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]

# Show array
plt.subplot(3, 1, 2)
plt.imshow(a2)

# Calculate affine matrix and transform image
M = cv2.getAffineTransform(np.float32(sourcePoints), np.float32(targetPoints))
# warpAffine expects dsize as (width, height), i.e. (columns, rows)
array = cv2.warpAffine(image, M, (array.shape[1], array.shape[0]),
                       borderValue=[255, 255, 255])

# Mark blue target points in array, expected to be above red source points
for tp in targetPoints:
    tp = [int(u) for u in tp]
    array[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]

plt.subplot(3, 1, 3)
plt.imshow(array)

plt.show()

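Comments:

Interesting how it worked almost immediately after changing the library. After spending more than a day trying to get it to work with scipy, this is a lesson for me to change libraries sooner.
If someone wants to find a (least-squares) approximation of an affine transformation based on more than three points, this is how you get a matrix that works with cv2.warpAffine:

Code: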

def calcAffineMatrix(sourcePoints, targetPoints):
    # For three or more source and target points, find the affine transformation
    A = []
    b = []
    for sp, trg in zip(sourcePoints, targetPoints):
        A.append([sp[0], 0, sp[1], 0, 1, 0])
        A.append([0, sp[0], 0, sp[1], 0, 1])
        b.append(trg[0])
        b.append(trg[1])
    result, resids, rank, s = np.linalg.lstsq(np.array(A), np.array(b))

    a0, a1, a2, a3, a4, a5 = result
    affineTrafo = np.float32([[a0, a2, a4], [a1, a3, a5]])
    return affineTrafo
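
As an alternative formulation of the same least-squares fit, the system can be set up in homogeneous form and solved for both output coordinates at once. This is a sketch, not the answer's code: the function name fit_affine_2x3 and the synthetic check against a made-up matrix true_M are assumptions introduced for illustration.

```python
import numpy as np


def fit_affine_2x3(source_points, target_points):
    """Least-squares 2x3 affine matrix mapping (x, y) sources to targets.

    Hypothetical helper; returns the layout that cv2.warpAffine expects.
    """
    src = np.asarray(source_points, dtype=float)
    trg = np.asarray(target_points, dtype=float)
    # Homogeneous sources [x, y, 1]; solve src_h @ params = trg.
    src_h = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(src_h, trg, rcond=None)  # shape (3, 2)
    return params.T.astype(np.float32)                    # shape (2, 3)


# Synthetic check: five points generated from a made-up affine map.
true_M = np.array([[1.2, -0.3, 10.0],
                   [0.4, 0.9, -5.0]])
src = np.array([[0., 0.], [100., 0.], [0., 100.], [100., 100.], [50., 20.]])
trg = src @ true_M[:, :2].T + true_M[:, 2]
M = fit_affine_2x3(src, trg)
```

With noise-free points the fit should recover true_M up to float32 precision; with noisy points it returns the least-squares best fit.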

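【Discussion】:

Just two quick notes in case you were not aware. In OpenCV, image color channels are in BGR order rather than the usual RGB. That is not an issue for image warping or anything like it, but it may trip you up at some point (for example, if you read an image with OpenCV but display it with Matplotlib, you need to convert from BGR to RGB). You can also find the full (3,3) homography in OpenCV with cv2.getPerspectiveTransform(), or with cv2.findHomography() if you have four or more points (it finds the best homography over all the points).
Thanks, very helpful! :) I am actually considering implementing the moving least squares method of Schaefer et al. 2006, which should give me a more realistic image deformation.
That would be nice, but it will certainly be slow compared to the methods built into OpenCV. The common approach these days is to generate feature matches with SIFT, ORB, etc. and then throw them into findHomography, which can use RANSAC to find the best possible homography from all the feature matches. Either way, it is always fun to turn a method from a paper directly into code.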
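
To illustrate the BGR note above with a minimal sketch: reversing the last axis of an image array swaps BGR and RGB, which is the same reordering that cv2.cvtColor(img, cv2.COLOR_BGR2RGB) performs in OpenCV. The tiny array below is made up for the example. The images in this thread are built directly in NumPy in RGB order, so they need no conversion before plt.imshow.

```python
import numpy as np

# A 2x2 image that is pure red when interpreted as BGR:
# channel 0 = blue, channel 1 = green, channel 2 = red.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 2] = 255

# Reverse the channel axis to get RGB order for Matplotlib.
rgb = bgr[..., ::-1]

first_pixel = rgb[0, 0].tolist()
print(first_pixel)
```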