【Question title】: Unable to use numpy.isin function in xarray DataArray
【Posted】: 2019-07-14 19:08:11
【Question】:

I have an xarray DataArray named da_temp. Its dimensions are time, latitude, and longitude.

The time dimension holds hourly data, with coordinates running from 2009-01-01T00:00:00 to 2009-12-31T23:00:00.

<xarray.DataArray 'T_SFC' (time: 8760, latitude: 106, longitude: 193)>
array([[[13.609406, 13.613037, ..., 16.048492, 16.065308],
        [13.634644, 13.638123, ..., 16.062683, 16.07788 ],
        ...,
        [21.507715, 21.5213  , ..., 20.858679, 21.26879 ],
        [21.344576, 21.364063, ..., 20.826727, 21.127975]],

       [[13.616699, 13.617249, ..., 16.098816, 16.11792 ],
        [13.639832, 13.63916 , ..., 16.12085 , 16.137512],
        ...,
        [23.225473, 23.222776, ..., 21.637125, 22.041592],
        [23.069414, 23.077639, ..., 21.623928, 21.91626 ]],

       ...,

       [[14.325562, 14.323853, ..., 19.051544, 19.004974],
        [14.305298, 14.306183, ..., 19.011719, 18.970123],
        ...,
        [24.70183 , 24.588383, ..., 18.064957, 18.364614],
        [25.497314, 24.317244, ..., 17.94168 , 18.119926]],

       [[14.595551, 14.583862, ..., 18.971344, 18.976532],
        [14.529541, 14.525635, ..., 18.963531, 18.966766],
        ...,
        [25.039097, 24.982517, ..., 19.516224, 19.818233],
        [25.338848, 25.875378, ..., 19.330383, 19.520332]]], dtype=float32)
Coordinates:
  * latitude   (latitude) float32 -39.2 -39.149525 ... -33.950478 -33.9
  * longitude  (longitude) float32 140.8 140.84792 140.89584 ... 149.95209 150.0
  * time       (time) datetime64[ns] 2009-01-01 ... 2009-12-31T23:00:00
Attributes:
    projectionType:  MERCATOR
    level:           SFC
    units:           C
    gridType:        SCALAR
    lonCentre:       145.4
    latIntersect:    0.0

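I want to select only the data for the first six months of the year, i.e. January, February, March, April, May, and June. I tried using np.isin to check whether the month of each time coordinate is in the list [1, 2, 3, 4, 5, 6], but I got an error.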

import numpy as np
import xarray as xr
MTHS = [1, 2, 3, 4, 5, 6]
da_mask = xr.where(np.isin(da_temp.time.dt.month.values, MTHS), da_temp, 0)

ValueError                                Traceback (most recent call last)
<ipython-input-72-7fee7a634922> in <module>
----> 1 da_autumn = xr.where(np.isin(da_temp.time.dt.month.values, autumn_mth_list), da_temp, 0)
      2 da_autumn.values

~\anaconda3\envs\PlannedBurnWindow\lib\site-packages\xarray\core\computation.py in where(cond, x, y)
   1115                        join='exact',
   1116                        dataset_join='exact',
-> 1117                        dask='allowed')

~\anaconda3\envs\PlannedBurnWindow\lib\site-packages\xarray\core\computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, *args)
    967                                      join=join,
    968                                      exclude_dims=exclude_dims,
--> 969                                      keep_attrs=keep_attrs)
    970     elif any(isinstance(a, Variable) for a in args):
    971         return variables_vfunc(*args)

~\anaconda3\envs\PlannedBurnWindow\lib\site-packages\xarray\core\computation.py in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args)
    216 
    217     data_vars = [getattr(a, 'variable', a) for a in args]
--> 218     result_var = func(*data_vars)
    219 
    220     if signature.num_outputs > 1:

~\anaconda3\envs\PlannedBurnWindow\lib\site-packages\xarray\core\computation.py in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, output_sizes, keep_attrs, *args)
    563             raise ValueError('unknown setting for dask array handling in '
    564                              'apply_ufunc: {}'.format(dask))
--> 565     result_data = func(*input_data)
    566 
    567     if signature.num_outputs == 1:

~\anaconda3\envs\PlannedBurnWindow\lib\site-packages\xarray\core\duck_array_ops.py in where(condition, x, y)
    191 def where(condition, x, y):
    192     """Three argument where() with better dtype promotion rules."""
--> 193     return _where(condition, *as_shared_dtype([x, y]))
    194 
    195 

~\anaconda3\envs\PlannedBurnWindow\lib\site-packages\xarray\core\duck_array_ops.py in f(*args, **kwargs)
     42             else:
     43                 wrapped = getattr(eager_module, name)
---> 44             return wrapped(*args, **kwargs)
     45     else:
     46         def f(*args, **kwargs):

ValueError: operands could not be broadcast together with shapes (8760,) (8760,106,193) () 

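【Comments】:

  • Add a couple of unit dimensions to the first operand.
  • @MadPhysicist Could you clarify?
  • According to the error, (8760,) (8760,106,193) () do not broadcast together because dimensions are right-aligned (8760 is matched against 193). (8760, 1, 1) (8760,106,193) () would broadcast just fine.
  • Could you point out how to fix it? The error message seems misleading to me. da_temp.time.dt.month has just one dimension, with 8760 elements.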
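
The right-alignment rule described in the comments can be reproduced with plain NumPy. This is a minimal sketch on small stand-in shapes (4, 2, 3) instead of (8760, 106, 193); the names data, months, mask and mask3d are illustrative and do not come from the question.

```python
import numpy as np

# Small stand-ins for the shapes in the question: (time, latitude, longitude).
data = np.ones((4, 2, 3), dtype=np.float32)
months = np.array([1, 2, 7, 8])              # one month number per time step
mask = np.isin(months, [1, 2, 3, 4, 5, 6])   # shape (4,)

try:
    # Shapes are aligned from the right, so (4,) is compared with the last axis (3).
    np.where(mask, data, 0)
except ValueError as err:
    print("broadcast failed:", err)

# Append two unit dimensions so the mask lines up with the time axis instead.
mask3d = mask[:, np.newaxis, np.newaxis]     # shape (4, 1, 1)
result = np.where(mask3d, data, 0)
print(result.shape)
```

With the extra unit dimensions the mask is stretched across latitude and longitude, which is what the (8760, 1, 1) suggestion amounts to.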

Tags: python numpy python-xarray


【Answer 1】:

If you just want to extract a portion of the time range, this should do the trick.

airtemps = xr.tutorial.open_dataset('air_temperature')

airtemps['air'].sel(time=slice('2013-01-01','2013-06-30'))

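【Comments】:

  • This slices the time dimension and reduces the time coordinate to 4344 entries. I want to keep the values for all dates between 01-01 and 06-30 and assign 0 to the dates between 07-01 and 12-31.
  • @alextc That is different from the question as originally asked. You can manipulate a time slice of the data in a similar way: shp = da_temp['T_SFC'].sel(time=slice('2009-07-01', '2009-12-31')).shape and then da_temp['T_SFC'].sel(time=slice('2009-07-01', '2009-12-31')).values[:,:,:] = np.zeros(shp)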
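
As an alternative to writing through .values, a time slice can also be zeroed with label-based assignment via .loc. This is only a sketch under assumptions: da_demo is a hypothetical daily stand-in on a 2 x 3 grid, not the asker's hourly da_temp, and it is not the commenter's own code.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in: one value per day of 2009 on a tiny 2 x 3 grid.
time = np.arange("2009-01-01", "2010-01-01", dtype="datetime64[D]").astype("datetime64[ns]")
da_demo = xr.DataArray(
    np.ones((time.size, 2, 3), dtype=np.float32),
    coords={"time": time},
    dims=("time", "latitude", "longitude"),
)

# Assign 0 in place to every time step from July through December.
da_demo.loc[dict(time=slice("2009-07-01", "2009-12-31"))] = 0

print(da_demo.sizes["time"])   # the time axis keeps its full length
```

The array keeps all of its time steps; only the values in the selected slice change.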
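【Answer 2】:

The problem is that np.isin returns a NumPy array (i.e. one without labeled dimensions), so xarray cannot automatically determine how to broadcast it appropriately inside where.

I suggest using xarray's built-in isin method instead, which works on (and returns) labeled DataArrays: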

da_mask = xr.where(da_temp.time.dt.month.isin(MTHS), da_temp, 0)
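
To try this without the original data, here is a minimal self-contained sketch. It assumes a hypothetical stand-in, da_demo, with daily rather than hourly steps and a 2 x 3 grid; the coordinate values are only illustrative.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for da_temp: one value per day of 2009 on a tiny grid.
time = np.arange("2009-01-01", "2010-01-01", dtype="datetime64[D]").astype("datetime64[ns]")
da_demo = xr.DataArray(
    np.ones((time.size, 2, 3), dtype=np.float32),
    coords={
        "time": time,
        "latitude": [-39.2, -33.9],
        "longitude": [140.8, 145.4, 150.0],
    },
    dims=("time", "latitude", "longitude"),
)

MTHS = [1, 2, 3, 4, 5, 6]

# isin keeps the "time" label, so xr.where can align the mask by dimension name.
in_first_half = da_demo.time.dt.month.isin(MTHS)
da_mask = xr.where(in_first_half, da_demo, 0)

print(in_first_half.dims)
print(da_mask.sizes)
```

Because the mask carries the time dimension by name, no manual reshaping is needed before the call to xr.where.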

【Comments】:
