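Introduction
Broadcasting describes how arithmetic operations are carried out between arrays of different shapes. It is a very powerful feature. The simplest case combines a scalar with an array; a slightly richer case subtracts the column means of a two-dimensional array from every row.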
>>> import numpy as np
>>> arr = np.arange(5)
>>> arr
array([0, 1, 2, 3, 4])
>>> arr*4
array([ 0, 4, 8, 12, 16])
>>> arr = np.random.randn(4,3)
>>> arr.mean(0)
array([-0.43115904, -0.04193137, -0.55604705])
>>> demeaned = arr-arr.mean(0)
>>> demeaned
array([[ 0.28172111, -0.03730437, 0.15310302],
[-0.65851004, -2.0032837 , -0.50386558],
[ 0.90069608, 1.24842358, 1.39446594],
[-0.52390714, 0.79216448, -1.04370338]])
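The subtraction above works because the shape (3,) of arr.mean(0) is compatible with the shape (4, 3) of arr. As a minimal sketch (the array values and variable names below are illustrative and not taken from the session above), the shape compatibility can be inspected with np.broadcast:

```python
import numpy as np

arr = np.arange(12, dtype=np.float64).reshape(4, 3)
col_means = arr.mean(0)                  # shape (3,)

# np.broadcast reports the shape the two operands are broadcast to
result_shape = np.broadcast(arr, col_means).shape

demeaned = arr - col_means               # col_means is stretched across the 4 rows
column_means_after = demeaned.mean(0)    # each column now has mean 0
```

Here result_shape is (4, 3), the shape of the larger operand.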
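Broadcasting along other axes
Under the broadcasting rule, the axis along which the smaller array is broadcast must have length 1.
This leads to a very common need: adding a new axis of length 1 specifically so that broadcasting can happen. The approach is to index with np.newaxis, combined with full slices (:), to insert the new axis.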
>>> arr = np.zeros((4,4))
>>> arr_3d = arr[:,np.newaxis,:]
>>> arr_3d.shape
(4, 1, 4)
>>> arr_1d = np.random.normal(size=3)
>>> arr_1d[:,np.newaxis]
array([[-2.28070615],
[ 1.45676407],
[-0.33501704]])
>>> arr_1d[np.newaxis,:]
array([[-2.28070615, 1.45676407, -0.33501704]])
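To illustrate why the extra axis is needed, here is a small sketch that demeans the rows of a two-dimensional array. The data and variable names are assumptions made for illustration. Subtracting the row means directly fails, because shapes (4, 3) and (4,) do not line up, while reshaping the means to (4, 1) works:

```python
import numpy as np

arr = np.arange(12, dtype=np.float64).reshape(4, 3)
row_means = arr.mean(1)                      # shape (4,)

try:
    arr - row_means                          # (4, 3) vs (4,): trailing lengths 3 and 4 differ
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# Insert a length-1 axis so the means have shape (4, 1) and broadcast across columns
demeaned = arr - row_means[:, np.newaxis]
row_means_after = demeaned.mean(1)           # each row now has mean 0
```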
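If we have a three-dimensional array and want to demean it along axis 2 (subtract the mean taken over that axis), code like the following is all that is needed.
>>> arr = np.random.randn(3,4,5)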
>>> arr
array([[[ 0.14355889, 0.91471291, -1.48453369, -0.42455771,
1.35064972],
[-1.6315615 , -0.28652536, -0.22573702, -0.02694076,
-0.43670708],
[-0.30708186, 0.44396068, -0.20062503, -0.35479536,
-0.67984063],
[ 1.08344408, -0.48893 , 0.16412084, 1.23980658,
-0.6989953 ]],
[[ 0.23317469, 0.91435539, 0.85500396, -0.76785453,
-1.06678174],
[ 0.04440929, 1.47382479, -0.29530241, -0.12223287,
0.58623605],
[ 1.21833979, 1.25049874, 0.60785443, 0.98217354,
-0.31913215],
[-1.48977066, -1.21858972, 0.16373654, -1.13013772,
-0.19680737]],
[[-1.09171845, -0.30966571, 0.74153601, 1.07080261,
-0.89091987],
[ 0.66661656, -0.61741408, -0.64890451, -0.23578853,
0.45262463],
[-0.57686182, -0.62548638, 0.12317945, 0.23243517,
-1.67882355],
[ 0.1218588 , 0.92471459, 0.68252696, 0.4598333 ,
0.29009011]]])
>>> depth_means = arr.mean(2)
>>> depth_means
array([[ 0.09996603, -0.52149434, -0.21967644, 0.25988924],
[ 0.03357955, 0.33738697, 0.74794687, -0.77431379],
[-0.09599308, -0.07657319, -0.50511143, 0.49580475]])
>>> demeaned = arr - depth_means[:,:,np.newaxis]
>>> demeaned.mean(2)
array([[ 0.00000000e+00, -6.66133815e-17, -3.33066907e-17,
-4.44089210e-17],
[ 0.00000000e+00, 6.66133815e-17, -4.44089210e-17,
1.11022302e-16],
[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
-3.33066907e-17]])
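The same idea can be generalized to any axis. The following is a sketch only: demean_axis is a hypothetical helper written for this note, not a NumPy function. It builds the [:, :, np.newaxis]-style index programmatically:

```python
import numpy as np

def demean_axis(arr, axis=0):
    """Subtract the mean along `axis` (hypothetical helper, not part of NumPy)."""
    means = arr.mean(axis)
    # Build an index such as [:, :, np.newaxis] with np.newaxis at position `axis`
    indexer = [slice(None)] * arr.ndim
    indexer[axis] = np.newaxis
    return arr - means[tuple(indexer)]

rng = np.random.default_rng(0)           # seeded generator, chosen only for repeatability
arr = rng.standard_normal((3, 4, 5))
demeaned = demean_axis(arr, axis=2)
```

Calling demean_axis(arr, axis=2) is equivalent to arr - arr.mean(2)[:, :, np.newaxis] from the session above.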
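Setting array values by broadcasting
The same broadcasting rule that governs arithmetic also applies when assigning values through array indexing.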
>>> arr = np.zeros((4,3))
>>> arr[:] = 5
>>> arr
array([[5., 5., 5.],
[5., 5., 5.],
[5., 5., 5.],
[5., 5., 5.]])
>>> col = np.array([1.28,-0.42,0.44,1.6])
>>> arr[:] = col[:,np.newaxis]
>>> arr
array([[ 1.28, 1.28, 1.28],
[-0.42, -0.42, -0.42],
[ 0.44, 0.44, 0.44],
[ 1.6 , 1.6 , 1.6 ]])
>>> arr[:2] = [[-1.37],[0.509]]
>>> arr
array([[-1.37 , -1.37 , -1.37 ],
[ 0.509, 0.509, 0.509],
[ 0.44 , 0.44 , 0.44 ],
[ 1.6 , 1.6 , 1.6 ]])
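As a complementary sketch (the values are made up for illustration), assignment also broadcasts a one-dimensional row down the rows, and a column of shape (4, 1) across a slice of columns:

```python
import numpy as np

arr = np.zeros((4, 3))
row = np.array([1.0, 2.0, 3.0])

arr[:] = row                 # shape (3,) is broadcast down the 4 rows
first_fill = arr.copy()

# A (4, 1) column is broadcast across the first 2 columns only
arr[:, :2] = [[10.0], [20.0], [30.0], [40.0]]
```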
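Advanced ufunc usage
reduce applies a binary ufunc repeatedly along an axis to aggregate an array; for np.add this gives the same result as sum.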
>>> arr = np.arange(10)
>>> np.add.reduce(arr)
45
>>> arr.sum()
45
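arr[::2] selects every other row (rows 0, 2 and 4). Here those rows are sorted in place, and the comparison that follows checks which rows are in increasing order.
>>> arr = np.random.randn(5,5)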
>>> arr[::2].sort(1)
>>> arr
array([[-1.77863628, -1.27039208, -0.90102076, -0.47784934, 1.53919358],
[-0.06052154, -0.47682076, 0.65163751, -0.95954794, -3.60159995],
[-0.50752957, -0.50483839, -0.05569555, 0.34798383, 1.69232614],
[-0.05763389, -0.54338055, 0.93306287, 0.9971338 , -0.87589324],
[-2.07027641, -1.51301557, -0.54804822, 0.06283581, 0.69473061]])
>>> arr[:,:-1] < arr[:,1:]
array([[ True, True, True, True],
[False, True, False, False],
[ True, True, True, True],
[False, True, True, False],
[ True, True, True, True]])
logical_and.reduce is equivalent to the all method.
>>> np.logical_and.reduce(arr[:,:-1] < arr[:,1:],axis=1)
array([ True, False, True, False, True])
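The equivalence can be checked directly. This is a small sketch on made-up integer data rather than the random array used above:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4],
                [4, 1, 3, 2],
                [0, 5, 6, 9]])
# True where each element is smaller than its right-hand neighbour
is_increasing = arr[:, :-1] < arr[:, 1:]

via_reduce = np.logical_and.reduce(is_increasing, axis=1)
via_all = is_increasing.all(axis=1)
```

Both via_reduce and via_all flag rows 0 and 2 as sorted and row 1 as unsorted.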
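outer applies the ufunc to every pair of elements from two arrays (for np.multiply this is the ordinary outer product). The number of dimensions of the output is the sum of the numbers of dimensions of the two inputs.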
>>> arr = np.arange(3).repeat([1,2,2])
>>> arr
array([0, 1, 1, 2, 2])
>>> np.multiply.outer(arr,np.arange(5))
array([[0, 0, 0, 0, 0],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 2, 4, 6, 8],
[0, 2, 4, 6, 8]])
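The statement about dimensions is easier to see with inputs of different rank. In this sketch (the arrays are illustrative), a (3, 4) array and a (5,) array produce a (3, 4, 5) result:

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
y = np.arange(5)

# result[i, j, k] == x[i, j] - y[k]
result = np.subtract.outer(x, y)
```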
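Custom ufuncs
There are two ways to turn a Python function into a ufunc-like callable: np.frompyfunc and np.vectorize.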
>>> def add_elements(x,y):
... return x + y
...
>>> add_them = np.frompyfunc(add_elements,2,1)
>>> add_them(np.arange(8),np.arange(8))
array([0, 2, 4, 6, 8, 10, 12, 14], dtype=object)
>>> add_them = np.vectorize(add_elements,otypes=[np.float64])
>>> add_them(np.arange(8),np.arange(8))
array([ 0., 2., 4., 6., 8., 10., 12., 14.])
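The two approaches differ in the dtype of what they return. The sketch below reuses add_elements from the session above and compares both results with the built-in np.add. Both wrappers still call the Python function once per element, so they are a convenience and not a substitute for a compiled ufunc:

```python
import numpy as np

def add_elements(x, y):
    return x + y

a = np.arange(8)

via_frompyfunc = np.frompyfunc(add_elements, 2, 1)(a, a)                 # always an object array
via_vectorize = np.vectorize(add_elements, otypes=[np.float64])(a, a)    # dtype chosen via otypes
via_builtin = np.add(a, a)                                                # compiled ufunc

# An object array can be converted back to a numeric dtype explicitly
as_int = via_frompyfunc.astype(np.int64)
```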
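Structured and record arrays
A structured array is a special kind of ndarray in which each element can be thought of as a C struct, or as a row in a SQL table with several named fields.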
>>> dtype = [('x',np.float64),('y',np.int32)]
>>> sarr = np.array([(1.5,6),(np.pi,-2)],dtype=dtype)
>>> sarr
array([(1.5 , 6), (3.14159265, -2)],
dtype=[('x', '<f8'), ('y', '<i4')])
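There are many ways to define a structured dtype. The most typical is a list of tuples, each of the form (field_name, field_data_type). Each array element then becomes a tuple-like object whose fields can be accessed like dictionary entries.
In the example below, an element is indexed by field name.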
>>> sarr[0]
(1.5, 6)
>>> sarr[0]['y']
6
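Field names can also index the whole array, not just a single element. This short sketch reuses the dtype and data from above; the variable names x_values and y_values are introduced here for illustration:

```python
import numpy as np

dtype = [('x', np.float64), ('y', np.int32)]
sarr = np.array([(1.5, 6), (np.pi, -2)], dtype=dtype)

x_values = sarr['x']      # float64 array holding the 'x' field of every element
y_values = sarr['y']      # int32 array holding the 'y' field of every element

# Indexing by a single field name returns a view, so this write changes sarr itself
sarr['y'][0] = 7
```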