Constructing np.array with overlapping fields in dtype

【Question title】: Constructing np.array with overlapping fields in dtype
【Posted】: 2016-07-25 17:32:30
【Question】:

I have a dtype defined as follows:

import numpy as np
pose_dtype = np.dtype([('x', np.float64), ('y', np.float64), ('theta', np.float64)])

With that, I can write:

pose = np.array((1, 2, np.pi), dtype=pose_dtype)

I'd like to add an xy field that overlaps x and y, to make the type easier to work with. I can do that like this:

pose_dtype = np.dtype(dict(
    names=['x', 'y', 'theta', 'xy'],
    formats=[np.float64, np.float64, np.float64, (np.float64, 2)],
    offsets=[0, 8, 16, 0]
))

However, I can no longer construct the array the way I did before, and have to resort to:

pose = np.array((1, 2, np.pi, [1, 2]), dtype=pose_dtype)

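This is dangerously repetitive, since the x and y values have to be written out twice.

Is there any way to mark fields as aliases of each other, so that I don't have to deal with this?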

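【Comments】:

How about filling the array by field rather than by record? dtype offsets are a fairly rare topic on SO. I'll have to experiment to look for alternatives.
But do you know that pose[['x','y']] can access multiple fields? No aliases needed.
@MSeifert: But the result of that is a structured type, not an unstructured one - I can't use np.dot on the result

Tags: python numpy structured-array numpy-dtype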

【Answer 1】:

An experiment in filling the array by field rather than by record:

In [207]: pose_dtype = np.dtype(dict(
    names=['x', 'y', 'theta', 'xy'],
    formats=[np.float64, np.float64, np.float64, (np.float64, 2)],
    offsets=[0, 8, 16, 0]
))

In [209]: A=np.zeros((3,),dtype=pose_dtype)
In [210]: A
Out[210]: 
array([(0.0, 0.0, 0.0, [0.0, 0.0]), (0.0, 0.0, 0.0, [0.0, 0.0]),
       (0.0, 0.0, 0.0, [0.0, 0.0])], 
      dtype={'names':['x','y','theta','xy'], 'formats':['<f8','<f8','<f8',('<f8', (2,))], 'offsets':[0,8,16,0], 'itemsize':24})
In [211]: A['x']=[1,2,3]
In [212]: A
Out[212]: 
array([(1.0, 0.0, 0.0, [1.0, 0.0]), (2.0, 0.0, 0.0, [2.0, 0.0]),
       (3.0, 0.0, 0.0, [3.0, 0.0])], 
      dtype={'names':['x','y','theta','xy'], 'formats':['<f8','<f8','<f8',('<f8', (2,))], 'offsets':[0,8,16,0], 'itemsize':24})
In [213]: A['y']=[4,5,6]
In [214]: A
Out[214]: 
array([(1.0, 4.0, 0.0, [1.0, 4.0]), (2.0, 5.0, 0.0, [2.0, 5.0]),
       (3.0, 6.0, 0.0, [3.0, 6.0])], 
      dtype={'names':['x','y','theta','xy'], 'formats':['<f8','<f8','<f8',('<f8', (2,))], 'offsets':[0,8,16,0], 'itemsize':24})
In [215]: A['xy']
Out[215]: 
array([[ 1.,  4.],
       [ 2.,  5.],
       [ 3.,  6.]])
In [216]: A['xy']=np.arange(10,16).reshape(3,2)
In [217]: A
Out[217]: 
array([(10.0, 11.0, 0.0, [10.0, 11.0]), (12.0, 13.0, 0.0, [12.0, 13.0]),
       (14.0, 15.0, 0.0, [14.0, 15.0])], 
      dtype={'names':['x','y','theta','xy'], 'formats':['<f8','<f8','<f8',('<f8', (2,))], 'offsets':[0,8,16,0], 'itemsize':24})

In [219]: A['xy'].dot(A['xy'].T)
Out[219]: 
array([[ 221.,  263.,  305.],
       [ 263.,  313.,  363.],
       [ 305.,  363.,  421.]])
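
The session above fills the array field by field. If the original three-value construction is still wanted, one possible approach (a sketch, not something proposed in this answer) is to build the array with the plain dtype and then reinterpret the same bytes with the aliased dtype through .view. This assumes both dtypes have the same itemsize (24 bytes here); the names base_dtype and alias_dtype are made up for illustration.

```python
import numpy as np

# Plain three-field layout from the question (24 bytes per record).
base_dtype = np.dtype([('x', np.float64), ('y', np.float64), ('theta', np.float64)])

# Same 24-byte layout, plus an 'xy' field that overlaps 'x' and 'y'.
alias_dtype = np.dtype(dict(
    names=['x', 'y', 'theta', 'xy'],
    formats=[np.float64, np.float64, np.float64, (np.float64, 2)],
    offsets=[0, 8, 16, 0],
))

# Construct from three values only, then reinterpret the same memory.
pose = np.array((1, 2, np.pi), dtype=base_dtype).view(alias_dtype)

# The same idea for an array of poses.
poses = np.array([(1, 2, np.pi), (3, 4, 0.0)], dtype=base_dtype).view(alias_dtype)

# 'xy' is an ordinary float64 array sharing memory with 'x' and 'y',
# so writing through it is visible in the scalar fields.
shares = np.shares_memory(poses['xy'], poses['x'])
poses['xy'][0] = [10, 11]

print(pose['xy'])
print(shares, poses['x'][0], poses['y'][0])
```

Because only three values are passed to the constructor, x and y never have to be written twice; the xy field is just another window onto the same bytes.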

Another (not pretty) way to get the 2 fields as a float array:

In [228]: A[['x','y']].view(float).reshape(-1,2)
Out[228]: 
array([[ 10.,  11.],
       [ 12.,  13.],
       [ 14.,  15.]])
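
A note for newer NumPy versions: since NumPy 1.16, multi-field indexing returns a view that keeps the original field offsets and itemsize, so the .view(float) one-liner above may no longer give just the two selected columns. A minimal sketch of an alternative, assuming NumPy 1.16 or later, uses numpy.lib.recfunctions.structured_to_unstructured. The data values are taken from the session above, and the plain three-field dtype is used to keep the example small.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Plain three-field dtype from the question (no 'xy' alias needed here).
pose_dtype = np.dtype([('x', np.float64), ('y', np.float64), ('theta', np.float64)])

A = np.zeros(3, dtype=pose_dtype)
A['x'] = [10, 12, 14]
A['y'] = [11, 13, 15]

# Convert the two selected fields to an ordinary (3, 2) float64 array.
# Whether this is a copy or a view may depend on the NumPy version.
xy = rfn.structured_to_unstructured(A[['x', 'y']])

# The result is unstructured, so np.dot works on it.
gram = xy.dot(xy.T)
print(xy)
print(gram)
```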

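【Comments】:

x2[['x', 'y']].view((np.float64, 2)) does the job of getting those two fields, as it turns out
I hadn't thought of using a compound dtype like that in view. It removes the need for the reshape, although the speed is the same. Your 'xy' field access is faster.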