【Posted】: 2020-10-24 15:22:17
【Question】:
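I'm having serious difficulty understanding how xarray.groupby really works. I'm trying to apply a given function "f" to each group of an xarray DatasetGroupBy collection, such that "f" adds new variables to each group of the original xr.Dataset.
Here is a brief introduction:
My problem is common in geoscience, remote sensing, and similar fields.
The goal is to apply a given function over an array, pixel by pixel (or grid cell by grid cell).
Example
Suppose I want to evaluate the wind speed components (u, v) of the wind field over a given region relative to a new direction. I therefore want rotated versions of the 'u' and 'v' components, namely u_rotated and v_rotated.
Suppose this new direction is rotated 30° counterclockwise relative to each pixel position in the wind field. The new wind components are then (u_30_degrees and v_30_degrees).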
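To make the rotation concrete, here is a minimal sketch for a single vector. It assumes the vector itself is rotated counterclockwise (which is what the rotation matrix in the code of this post does), and it uses the sample values u = 1.0 and v = 0.3 from the test dataset further down. The helper name `rotate_ccw` is hypothetical and only serves this illustration.

```python
import numpy as np

def rotate_ccw(u, v, degrees=30.0):
    """Hypothetical helper: rotate the vector (u, v) counterclockwise by `degrees`."""
    theta = np.radians(degrees)
    c, s = np.cos(theta), np.sin(theta)
    # Standard 2D rotation: u' = u*cos(theta) - v*sin(theta), v' = u*sin(theta) + v*cos(theta)
    return c * u - s * v, s * u + c * v

u_30, v_30 = rotate_ccw(1.0, 0.3, degrees=30.0)
print(u_30, v_30)

# A pure rotation must leave the wind speed (vector magnitude) unchanged.
assert np.isclose(np.hypot(u_30, v_30), np.hypot(1.0, 0.3))
```

The magnitude check is a cheap way to catch sign errors in the rotation formula.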
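My first attempt was to stack the x and y coordinates (or longitude and latitude) into a new dimension called pixel (named 'point' in the code), then group by this new dimension and apply a function that performs the wind vector rotation.
Here is the snippet of my initial attempt: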
import numpy as np
import xarray as xr

# First, let's create some functions for vector rotation:
def rotate_2D_vector_per_given_degrees(array2D, angle=30):
    '''
    Rotate a 2D vector counterclockwise by a given angle.

    Parameters
    ----------
    array2D : 1D numpy array of length 2
    angle : float, optional
        Rotation angle in degrees. The default is 30.

    Returns
    -------
    Rotated_2D_Vector : 1D numpy array of length 2
    '''
    R = get_rotation_matrix(rotation=angle)
    Rotated_2D_Vector = np.dot(R, array2D)
    return Rotated_2D_Vector
def get_rotation_matrix(rotation=90):
    '''
    Description:
        Create a 2D counterclockwise rotation matrix for a given rotation angle (in degrees).
    Parameters:
        rotation: angle in degrees
    Returns:
        2x2 rotation matrix
    '''
    theta = np.radians(rotation)  # convert degrees to radians
    c, s = np.cos(theta), np.sin(theta)
    R = np.array(((c, -s), (s, c)))
    return R
# Then let's create a reproducible dataset for analysis:
u_wind = xr.DataArray(np.ones(shape=(20, 30)),
                      dims=('x', 'y'),
                      coords={'x': np.arange(0, 20),
                              'y': np.arange(0, 30)},
                      name='u')
v_wind = xr.DataArray(np.ones(shape=(20, 30)) * 0.3,
                      dims=('x', 'y'),
                      coords={'x': np.arange(0, 20),
                              'y': np.arange(0, 30)},
                      name='v')
data = xr.merge([u_wind, v_wind])
# Let's create the function that will be applied to each group in the dataset:
def rotate_wind(array, degrees=30):
    # Build a 1-dimensional vector of length 2 holding the u and v wind
    # components, respectively. The best solution I found was to convert the
    # dataset into a single xr.DataArray by stacking the 'u' and 'v'
    # components along a new dimension named 'wind'.
    vector = array.to_array(dim='wind').values
    # Now, rotate the wind vector by the given angle in degrees
    Rotated = rotate_2D_vector_per_given_degrees(vector, degrees)
    # Treat floating-point round-off (e.g. cos(90°) evaluating to 6.1e-17) as exactly 0.
    Rotated = np.where(np.abs(Rotated) < 1e-15, 0, Rotated)
    # Sanity check for each point position:
    print('Coords: ', array['point'].values,
          'Wind Speed: ', vector,
          'Response :', Rotated,
          end='\n\n' + '-' * 20 + '\n')
    # Use the group's own data variables ('u', 'v') rather than the global dataset
    components = list(array.data_vars)
    for dim, value in zip(components, Rotated):
        array['{0}_rotated_{1}'.format(dim, degrees)] = value
    return array
# Finally, let's stack our dataset per grid point, group by this new dimension, and apply the desired function:
stacked = data.stack(point=['x', 'y'])
stacked = stacked.groupby('point').apply(rotate_wind)
# Let's unstack the data to recover the original dataset:
data = stacked.unstack('point')
# Let's check whether the function worked correctly
data.to_dataframe().head(30)
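Although the example above apparently works, I'm still not sure whether its results are correct, or whether the groupby-apply implementation is efficient (clean, non-redundant, fast, etc.).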
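One possible way to cross-check the result, sketched under the assumption that the rotation angle is the same for every pixel: because the rotation is a linear combination of 'u' and 'v', it can be written with ordinary xarray arithmetic on the whole grid, with no stacking or grouping. The helper name `rotate_wind_vectorized` is hypothetical, and the sample data simply repeats the values used above.

```python
import numpy as np
import xarray as xr

# Same sample data as in the post: u = 1.0 and v = 0.3 on a 20 x 30 grid.
coords = {'x': np.arange(20), 'y': np.arange(30)}
data = xr.Dataset(
    {'u': (('x', 'y'), np.ones((20, 30))),
     'v': (('x', 'y'), np.full((20, 30), 0.3))},
    coords=coords,
)

def rotate_wind_vectorized(ds, degrees=30):
    """Hypothetical helper: rotate (u, v) counterclockwise on the whole grid at once."""
    theta = np.radians(degrees)
    c, s = np.cos(theta), np.sin(theta)
    # Arithmetic on DataArrays is applied element-wise and keeps dims and coords.
    return ds.assign({
        'u_rotated_{0}'.format(degrees): c * ds['u'] - s * ds['v'],
        'v_rotated_{0}'.format(degrees): s * ds['u'] + c * ds['v'],
    })

rotated = rotate_wind_vectorized(data, degrees=30)

# Reference: apply the 2x2 rotation matrix to a single pixel's (u, v) vector.
theta = np.radians(30)
R = np.array(((np.cos(theta), -np.sin(theta)),
              (np.sin(theta), np.cos(theta))))
expected_u, expected_v = R @ np.array([1.0, 0.3])

np.testing.assert_allclose(rotated['u_rotated_30'].values, expected_u)
np.testing.assert_allclose(rotated['v_rotated_30'].values, expected_v)
```

If the per-pixel groupby version and this whole-grid version agree, that is reasonable evidence the groupby result is correct. The whole-grid form also avoids calling a Python function once per pixel.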
Any insights are welcome!
Best regards,
【Discussion】:
Tags: python netcdf python-xarray