Plane fitting to 4 (or more) XYZ points: answers

【Question title】: Plane fitting to 4 (or more) XYZ points
【Posted】: 2021-11-13 09:44:34
【Question description】:

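I have 4 points that lie very nearly in one plane; they are atoms of a 1,4-dihydropyridine ring.

I need to calculate the distance from C3 and N1 to the plane formed by C1-C2-C4-C5. Calculating the distance is fine, but fitting the plane is quite difficult for me.

The 1,4-DHP ring: [image]

The 1,4-DHP ring, another view: [image]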

from numpy import zeros, sqrt

# coordinates (XYZ) of C1, C2, C4 and C5
x = [0.274791784, -1.001679346, -1.851320839, 0.365840754]
y = [-1.155674199, -1.215133985, 0.053119249, 1.162878076]
z = [1.216239624, 0.764265677, 0.956099579, 1.198231236]

# plane equation Ax + By + Cz + D = 0
# non-fitted plane
abcd = [0.506645455682, -0.185724560275, -1.43998120646, 1.37626378129]

# creating distance variable
distance =  zeros(4, float)

# calculating distance from point to plane
for i in range(4):
    distance[i] = (x[i]*abcd[0]+y[i]*abcd[1]+z[i]*abcd[2]+abcd[3])/sqrt(abcd[0]**2 + abcd[1]**2 + abcd[2]**2)
    
print(distance)

# calculating squares
squares = distance**2

print(squares)

How do I minimize sum(squares)? I have tried least squares, but it is too hard for me.

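【Question discussion】:

Try asking on math.stackexchange? It doesn't look like you need coding help at the moment :)
In this case I'm not sure that mentioning the "1,4-dihydropyridine ring" helps. Have you googled "plane fitting python"? The fifth result looks promising...
I wrote a similar answer here that may be useful (ignore the last part about weights).
@MrE The information at that link is essential for understanding what my solution does under the hood. Otherwise you are just dealing with a magic black box.
@user1071136 - You are assuming that your Google bubble is the same as the reader's, and that the bubble stays put over time. Neither is true. A link is far more helpful than a vague "you should google 'this' and click the n-th result". To prove my point, the first result for this search on DuckDuckGo is currently this very question on StackOverflow.

Tags: python geometry least-squares plane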

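【Solution 1】:

That sounds about right, but you should replace the nonlinear optimization with an SVD. The code below builds the moment of inertia tensor M and then uses the SVD to get the normal to the plane. This should be a close approximation to the least-squares fit, and it is much faster and more predictable. It returns the point-cloud center and the normal.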

def planeFit(points):
    """
    p, n = planeFit(points)

    Given an array, points, of shape (d,...)
    representing points in d-dimensional space,
    fit a d-dimensional plane to the points.
    Return a point, p, on the plane (the point-cloud centroid),
    and the normal, n.
    """
    import numpy as np
    from numpy.linalg import svd
    points = np.reshape(points, (np.shape(points)[0], -1)) # Collapse trailing dimensions
    assert points.shape[0] <= points.shape[1], "There are only {} points in {} dimensions.".format(points.shape[1], points.shape[0])
    ctr = points.mean(axis=1)
    x = points - ctr[:,np.newaxis]
    M = np.dot(x, x.T) # Could also use np.cov(x) here.
    return ctr, svd(M)[0][:,-1]

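For example, construct a 2D cloud centered at (10, 100) that is thin in the x direction and 100 times larger in the y direction:

>>> pts = np.diag((.1, 10)).dot(np.random.randn(2,1000)) + np.reshape((10, 100),(2,-1))

The fitted plane passes very close to (10, 100), and its normal lies very nearly along the x axis.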

>>> planeFit(pts)

    (array([ 10.00382471,  99.48404676]),
     array([  9.99999881e-01,   4.88824145e-04]))
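
As a rough illustration of how the same idea applies to the question's data, here is a minimal sketch that fits a plane to the four ring atoms and reports their signed orthogonal distances. It applies the SVD directly to the centered points instead of to M, and it assumes one row per atom (the transpose of the layout planeFit expects); the variable names are illustrative only.

```python
import numpy as np

# C1, C2, C4 and C5 from the question, one row per atom (x, y, z).
pts = np.array([
    [ 0.274791784, -1.155674199, 1.216239624],
    [-1.001679346, -1.215133985, 0.764265677],
    [-1.851320839,  0.053119249, 0.956099579],
    [ 0.365840754,  1.162878076, 1.198231236],
])

centroid = pts.mean(axis=0)

# The right singular vector with the smallest singular value is the
# direction of least spread, which serves as the plane normal.
_, s, vt = np.linalg.svd(pts - centroid)
normal = vt[-1]

# Signed orthogonal distance of every atom to the fitted plane.
signed_dist = (pts - centroid) @ normal
sum_sq = float((signed_dist ** 2).sum())

print("centroid:", centroid)
print("normal:", normal)
print("signed distances:", signed_dist)
print("sum of squared distances:", sum_sq)
```

The sum of squared distances equals the square of the smallest singular value, so it can be read off without recomputing the distances.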

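【Discussion】:

But Hooked's answer is accurate enough; the measurements are in ångströms (I don't need precision to the hundredths) and I don't have that many points, so the speed is fine. Still, this looks like a very interesting solution.
Using scipy.optimize.leastsq is great, but (assuming I haven't introduced a bug) this is the right way to do least squares. en.wikipedia.org/wiki/Total_least_squares
Also: if you really do want to solve the full nonlinear problem, an SVD-based fit is a quick way to get a very good starting point.
Why would it be less accurate? This is actually more accurate than the nonlinear optimization.
I'm rusty on this now. What I had in mind was that total least squares is the sum of squared errors, whereas the SVD solution gives you (I think) the sum of squared r sin(theta). Of course, for small theta, r sin(theta) is very close to the Euclidean error, but for large errors not so much.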

【Solution 2】:

Least squares should fit a plane easily. The equation of a plane is ax + by + c = z. So set up matrices like this with all your data:

    x_0   y_0   1  
A = x_1   y_1   1  
          ... 
    x_n   y_n   1

and

    a  
x = b  
    c

and

    z_0   
B = z_1   
    ...   
    z_n

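In other words: Ax = B. Now solve for x, which holds your coefficients. But since you have more than 3 points, the system is over-determined, so you need to use the left pseudo-inverse. So the answer is: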

a 
b = (A^T A)^-1 A^T B
c

Here is some simple Python code with an example:

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np

N_POINTS = 10
TARGET_X_SLOPE = 2
TARGET_y_SLOPE = 3
TARGET_OFFSET  = 5
EXTENTS = 5
NOISE = 5

# create random data
xs = [np.random.uniform(-EXTENTS, EXTENTS) for i in range(N_POINTS)]
ys = [np.random.uniform(-EXTENTS, EXTENTS) for i in range(N_POINTS)]
zs = []
for i in range(N_POINTS):
    zs.append(xs[i]*TARGET_X_SLOPE + \
              ys[i]*TARGET_y_SLOPE + \
              TARGET_OFFSET + np.random.normal(scale=NOISE))

# plot raw data
plt.figure()
ax = plt.subplot(111, projection='3d')
ax.scatter(xs, ys, zs, color='b')

# do fit
tmp_A = []
tmp_b = []
for i in range(len(xs)):
    tmp_A.append([xs[i], ys[i], 1])
    tmp_b.append(zs[i])
b = np.array(tmp_b)  # shape (N,)
A = np.array(tmp_A)  # shape (N, 3)
# left pseudo-inverse: (A^T A)^-1 A^T b
fit = np.linalg.inv(A.T @ A) @ A.T @ b
errors = b - A @ fit
residual = np.linalg.norm(errors)

print("solution: %f x + %f y + %f = z" % (fit[0], fit[1], fit[2]))
print("errors:")
print(errors)
print("residual: {}".format(residual))

# plot plane
xlim = ax.get_xlim()
ylim = ax.get_ylim()
X,Y = np.meshgrid(np.arange(xlim[0], xlim[1]),
                  np.arange(ylim[0], ylim[1]))
Z = np.zeros(X.shape)
for r in range(X.shape[0]):
    for c in range(X.shape[1]):
        Z[r,c] = fit[0] * X[r,c] + fit[1] * Y[r,c] + fit[2]
ax.plot_wireframe(X,Y,Z, color='k')

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()

The solution for your points:

0.143509 x + 0.057196 y + 1.129595 = z
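
For readers who want to check that result, a minimal sketch follows. It swaps the explicit pseudo-inverse for np.linalg.lstsq, which solves the same least-squares problem and tends to behave better when A is ill-conditioned; the variable names are illustrative, and the last step (converting vertical residuals into orthogonal distances) is an addition not present in the answer.

```python
import numpy as np

# C1, C2, C4 and C5 from the question.
x = np.array([0.274791784, -1.001679346, -1.851320839, 0.365840754])
y = np.array([-1.155674199, -1.215133985, 0.053119249, 1.162878076])
z = np.array([1.216239624, 0.764265677, 0.956099579, 1.198231236])

# Design matrix with rows [x_i, y_i, 1], so that A @ [a, b, c] approximates z.
A = np.column_stack([x, y, np.ones_like(x)])
coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
a, b, c = coeffs
print("%f x + %f y + %f = z" % (a, b, c))

# This fit minimizes vertical (z-direction) residuals ...
vertical = z - A @ coeffs
# ... which differ from orthogonal point-to-plane distances by a constant factor.
orthogonal = vertical / np.sqrt(a ** 2 + b ** 2 + 1.0)
print("orthogonal distances:", orthogonal)
```

Because the residuals are measured along z only, this fit is not identical to one that minimizes orthogonal distances, although for a nearly horizontal plane such as this one the two are close.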

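【Discussion】:

Thanks for the answer. When I asked this question, I didn't know how def functions work, and I couldn't find anything that already used them for my task.
@Ben Wait, maybe I haven't looked at this code long enough, but this solution isn't iterative, right? Again, this is only a first impression, but it doesn't sound very "machine-learning", for lack of a better term. Does that mean this solution just takes the first guess as gospel?
This method is not robust to singular matrices. Go try the others.
@frank This solution is not iterative and it is not machine learning. It is just ordinary least squares, which is straightforward math that minimizes the error of the model. And no, this does not work for singular matrices; I'm not sure what would. I think that would mean there is something fundamentally wrong with the formulation of the problem.
This doesn't work in Python 3 / newer numpy. An edit would be appreciated, since I can't do it myself.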

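【Solution 3】:

The fact that you are fitting a plane is only slightly relevant here. What you are trying to do is minimize a particular function starting from a guess. Use scipy.optimize for that. Note that there is no guarantee that this is the globally optimal solution, only a locally optimal one. A different initial condition may converge to a different result; this works well if you start close to the local minimum you are looking for.

I've taken the liberty of cleaning up your code by taking advantage of numpy's broadcasting: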

import numpy as np

# coordinates (XYZ) of C1, C2, C4 and C5
XYZ = np.array([
        [0.274791784, -1.001679346, -1.851320839, 0.365840754],
        [-1.155674199, -1.215133985, 0.053119249, 1.162878076],
        [1.216239624, 0.764265677, 0.956099579, 1.198231236]])

# Initial guess of the plane
p0 = [0.506645455682, -0.185724560275, -1.43998120646, 1.37626378129]

def f_min(X,p):
    plane_xyz = p[0:3]
    distance = (plane_xyz*X.T).sum(axis=1) + p[3]
    return distance / np.linalg.norm(plane_xyz)

def residuals(params, signal, X):
    return f_min(X, params)

from scipy.optimize import leastsq
sol = leastsq(residuals, p0, args=(None, XYZ))[0]

print("Solution: ", sol)
print("Old Error: ", (f_min(XYZ, p0)**2).sum())
print("New Error: ", (f_min(XYZ, sol)**2).sum())

This gives:

Solution:  [  14.74286241    5.84070802 -101.4155017   114.6745077 ]
Old Error:  0.441513295404
New Error:  0.0453564286112
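
The large numbers in the solution are not a problem: the distance function divides by the length of the normal, so any nonzero multiple of the four parameters describes the same plane. A minimal sketch of rescaling the result to a unit normal, which makes the parameters easier to read, is shown below. The helper names signed_distances and normalize_plane are illustrative and not part of the answer.

```python
import numpy as np

def signed_distances(X, p):
    """Signed point-to-plane distances for X of shape (3, N) and plane p = (A, B, C, D)."""
    p = np.asarray(p, dtype=float)
    return (p[:3] @ X + p[3]) / np.linalg.norm(p[:3])

def normalize_plane(p):
    """Rescale (A, B, C, D) so that the normal (A, B, C) has unit length."""
    p = np.asarray(p, dtype=float)
    return p / np.linalg.norm(p[:3])

# coordinates (XYZ) of C1, C2, C4 and C5, one column per atom
XYZ = np.array([
    [0.274791784, -1.001679346, -1.851320839, 0.365840754],
    [-1.155674199, -1.215133985, 0.053119249, 1.162878076],
    [1.216239624, 0.764265677, 0.956099579, 1.198231236]])

# Solution printed by the answer above.
sol = np.array([14.74286241, 5.84070802, -101.4155017, 114.6745077])
unit = normalize_plane(sol)

print("unit-normal plane:", unit)
print("distances:", signed_distances(XYZ, unit))
```

With a unit normal, the fourth parameter is (up to sign) the distance of the plane from the origin.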

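【Discussion】:

Even with more than 4 points, this code doesn't change... does it? You just add the coordinates to the first array....
@Hooked You say locally optimal, but is there a way to guarantee a globally optimal solution regardless of the initial conditions? I don't know enough linear algebra myself.
@frank Usually the answer is no: for an arbitrary cost function there is no guarantee that you are at a global minimum rather than a local one. However, there is an important subset of (linear) problems for which we can guarantee this and find the solution in polynomial time. This is in fact one of the main strengths of linear algebra. Many nonlinear problems can be approximated as linear ones, so you can solve the approximate problem exactly, FWIW.
Shouldn't it be the absolute distance that is minimized?
@NiranjanKotha It depends! The L1 and L2 norms (absolute value vs. mean square) always give the same ranking, but they differ in how they measure how far you are from the target. For an iterative solver this means they have the same minimum (and hence the same "correct" answer) but different gradients. Different gradients let some solvers reach the solution faster. In many cases, but not all, the L2 norm converges faster.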

【Solution 4】:

When dealing with outliers (when you have a large dataset), another quick way to find a solution, besides SVD, is RANSAC:

import numpy as np


def fit_plane(voxels, iterations=50, inlier_thresh=10):  # voxels : x,y,z
    inliers, planes = [], []
    xy1 = np.concatenate([voxels[:, :-1], np.ones((voxels.shape[0], 1))], axis=1)
    z = voxels[:, -1].reshape(-1, 1)
    for _ in range(iterations):
        random_pts = voxels[np.random.choice(voxels.shape[0], voxels.shape[1] * 10, replace=False), :]
        plane_transformation, residual = fit_pts_to_plane(random_pts)
        # count points whose absolute vertical residual is within the threshold
        inliers.append((np.abs(z - np.matmul(xy1, plane_transformation)) <= inlier_thresh).sum())
        planes.append(plane_transformation)
    return planes[np.array(inliers).argmax()]


def fit_pts_to_plane(voxels):  # x y z  (m x 3)
    # https://math.stackexchange.com/questions/99299/best-fitting-plane-given-a-set-of-points
    xy1 = np.concatenate([voxels[:, :-1], np.ones((voxels.shape[0], 1))], axis=1)
    z = voxels[:, -1].reshape(-1, 1)
    fit = np.matmul(np.matmul(np.linalg.inv(np.matmul(xy1.T, xy1)), xy1.T), z)
    errors = z - np.matmul(xy1, fit)
    residual = np.linalg.norm(errors)
    return fit, residual
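
To show how such a routine might be exercised, here is a self-contained sketch on synthetic data. Note that it is a variant rather than the code above: it draws minimal 3-point samples, uses np.linalg.lstsq instead of an explicit inverse, and refits on the winning inlier set. The function names, the seed, the threshold and the synthetic plane z = 2x + 3y + 5 are all assumptions made for the demonstration.

```python
import numpy as np

def fit_z_plane(pts):
    """Ordinary least squares for z = a*x + b*y + c on an (m, 3) array."""
    A = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
    coeffs, *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
    return coeffs

def vertical_residuals(pts, coeffs):
    """z minus the plane height predicted at each (x, y)."""
    return pts[:, 2] - (coeffs[0] * pts[:, 0] + coeffs[1] * pts[:, 1] + coeffs[2])

def ransac_plane(pts, iterations=200, inlier_thresh=0.1, seed=0):
    """Keep the candidate plane with the most inliers, then refit on those inliers."""
    rng = np.random.default_rng(seed)
    best_mask, best_count = None, -1
    for _ in range(iterations):
        # A plane is determined by 3 points, so that is the minimal sample.
        sample = pts[rng.choice(len(pts), size=3, replace=False)]
        candidate = fit_z_plane(sample)
        mask = np.abs(vertical_residuals(pts, candidate)) <= inlier_thresh
        if mask.sum() > best_count:
            best_mask, best_count = mask, mask.sum()
    return fit_z_plane(pts[best_mask]), best_mask

# Synthetic cloud: 200 points near z = 2x + 3y + 5, the first 40 pushed far off the plane.
rng = np.random.default_rng(1)
xy = rng.uniform(-5, 5, size=(200, 2))
z = 2 * xy[:, 0] + 3 * xy[:, 1] + 5 + rng.normal(scale=0.01, size=200)
z[:40] += rng.uniform(5, 20, size=40)
cloud = np.column_stack([xy, z])

coeffs, inliers = ransac_plane(cloud)
print("recovered a, b, c:", coeffs)
print("inliers kept:", int(inliers.sum()), "of", len(cloud))
```

The threshold has to be chosen relative to the noise level of the data; a value that is too large lets outliers back into the fit.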

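【Discussion】:

Voxels should give very close results, but it looks like more coding is needed!
I'm not sure what you mean, but voxels is equivalent to the XYZ in the original question; you don't need to preprocess them in any way.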

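【Solution 5】:

This returns the 3D plane coefficients along with the RMSE of the fit.

The plane is given in homogeneous-coordinate form, which means that its dot product with the homogeneous coordinates of a point yields the distance between the two.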

import numpy as np


def fit_plane(points):
    assert points.shape[1] == 3
    centroid = points.mean(axis=0)
    x = points - centroid[None, :]
    U, S, Vt = np.linalg.svd(x.T @ x)
    normal = U[:, -1]
    origin_distance = normal @ centroid
    rmse = np.sqrt(S[-1] / len(points))
    return np.hstack([normal, -origin_distance]), rmse

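A small note: the SVD can also be applied directly to the points instead of to the outer-product matrix, but I found this to be slower with NumPy's SVD implementation.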

U, S, Vt = np.linalg.svd(x.T, full_matrices=False)
rmse = S[-1] / np.sqrt(len(points))
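
The homogeneous form makes the distance query from the question a single dot product. A minimal sketch, assuming a plane fitted to the four ring atoms: the helper names are illustrative, and the query point is a made-up placeholder, since the question does not list the C3 or N1 coordinates.

```python
import numpy as np

def fit_plane_homogeneous(points):
    """Return (nx, ny, nz, d) with unit normal so that plane @ (x, y, z, 1) is a signed distance."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    # SVD applied directly to the centered points; the last row of vt is the normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return np.append(normal, -normal @ centroid)

def signed_distance(plane, point):
    """Dot product of the plane with the point's homogeneous coordinates."""
    return float(plane @ np.append(point, 1.0))

# C1, C2, C4 and C5 from the question, one row per atom.
ring = np.array([
    [ 0.274791784, -1.155674199, 1.216239624],
    [-1.001679346, -1.215133985, 0.764265677],
    [-1.851320839,  0.053119249, 0.956099579],
    [ 0.365840754,  1.162878076, 1.198231236],
])
plane = fit_plane_homogeneous(ring)

# HYPOTHETICAL query point; substitute the real C3 or N1 coordinates here.
query = np.array([-0.8, 1.2, 1.5])
print("distance to plane:", abs(signed_distance(plane, query)))
```

The sign of the result depends on which of the two possible normal directions the SVD happens to return, so take the absolute value unless the side of the plane matters.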

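【Discussion】:

【Solution 6】:

Here is one way. If your points are P[1]..P[n], compute the mean M of these points and subtract it from each one, giving points p[1]..p[n]. Then compute C = Sum{ p[i]*p[i]' } (the "covariance" matrix of the points). Next diagonalize C, that is, find an orthogonal U and a diagonal E such that C = U*E*U'. If your points really do lie in a plane, one of the eigenvalues (i.e. the diagonal entries of E) will be very small (with perfect arithmetic it would be 0). In any case, if the j-th eigenvalue is the smallest, let the j-th column of U be N = (A,B,C) and compute D = -M'*N. These parameters define the "best" plane, i.e. the one for which the sum of squared distances from the P[] to the plane is smallest.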
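
The recipe above can be sketched in a few lines of NumPy, using the answer's symbols for variable names. The sketch assumes np.linalg.eigh is an acceptable way to diagonalize the symmetric matrix C; it returns eigenvalues in ascending order, so the smallest one comes first.

```python
import numpy as np

# P[1]..P[n]: C1, C2, C4 and C5 from the question, one row per point.
P = np.array([
    [ 0.274791784, -1.155674199, 1.216239624],
    [-1.001679346, -1.215133985, 0.764265677],
    [-1.851320839,  0.053119249, 0.956099579],
    [ 0.365840754,  1.162878076, 1.198231236],
])

M = P.mean(axis=0)           # mean of the points
p = P - M                    # centered points
C = p.T @ p                  # Sum{ p[i] * p[i]' }, a symmetric 3x3 matrix

E, U = np.linalg.eigh(C)     # C = U @ diag(E) @ U.T, eigenvalues ascending
N = U[:, 0]                  # eigenvector of the smallest eigenvalue: (A, B, C)
D = -M @ N                   # plane equation: N . P + D = 0

distances = P @ N + D        # signed distances of the points to the plane
print("plane (A, B, C, D):", np.append(N, D))
print("sum of squared distances:", (distances ** 2).sum())
```

The smallest eigenvalue is itself the minimized sum of squared distances, which gives a quick check on how planar the points are.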

【Discussion】:

This is a very fast method, but I need least squares.