Split netcdf4 file based on latitude values in python

[Question title]: Split netcdf4 file based on latitude values in python
[Posted]: 2022-02-10 06:13:31
[Question]:

I have a netCDF file containing three variables:

input.variables

{'longitude': <class 'netCDF4._netCDF4.Variable'>
 float32 longitude(longitude)

'latitude': <class 'netCDF4._netCDF4.Variable'>
 float32 latitude(latitude)

'Values': <class 'netCDF4._netCDF4.Variable'>
 int16 Values(time, latitude, longitude)

{'time':
 Length: 504, dtype: datetime64[ns],
}

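I want to split this netCDF file into the northern and southern hemispheres. I assumed this would be a simple boolean-indexing problem, but I could not get it to work. I tried the following: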

south_mask = input[input.variables['latitude'][:] < 0]
north_mask = input[input.variables['latitude'][:] >= 0]

But this does not work:

TypeError: expected str, bytes or os.PathLike object, not MaskedArray
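
The error arises because indexing a netCDF4 Dataset with square brackets looks up a variable or group by name (a string), so passing a boolean array fails. A boolean mask has to be applied to the data array itself, along its latitude axis. A minimal sketch, using synthetic NumPy arrays as stand-ins for input.variables['latitude'][:] and input.variables['Values'][:] (the grid values and sizes are assumptions for illustration only):

```python
import numpy as np

# Synthetic stand-ins (assumed): 4 time steps, 5 latitudes, 3 longitudes
latitude = np.array([60.0, 30.0, 0.0, -30.0, -60.0], dtype=np.float32)
values = np.arange(4 * 5 * 3, dtype=np.int16).reshape(4, 5, 3)

# Boolean masks over the latitude coordinate
south_mask = latitude < 0
north_mask = latitude >= 0

# Apply each mask on axis 1, the latitude axis of (time, latitude, longitude)
south = values[:, south_mask, :]
north = values[:, north_mask, :]

print(north.shape, south.shape)
```

With real data the arrays read from the file are masked arrays, but the same indexing pattern should apply.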

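I also need the complete netCDF file in this particular project, so I would like to do this in Python rather than with external command-line tools.

Currently I am doing this: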

import numpy as np
import pandas as pd

# Index of the equator along the latitude axis
# (assumes the grid has a point at exactly 0.0)
middle_index = input.variables['latitude'][:].tolist().index(0.)

# create empty df
N_df = pd.DataFrame(columns = ['Time', 'Value'])
S_df = pd.DataFrame(columns = ['Time', 'Value'])

# Index based on values
for i in range(0, 504):
    # Get time
    time = input['time'][i]

    # Then loop and index from beginning to middle index, and middle index to end.
    N = np.average(input['Values'][i, :middle_index, :])
    S = np.average(input['Values'][i, middle_index:, :])

    # then making two dataframes and appending to two
    N_df_tmp = pd.DataFrame({'Time' : [time],
                             'Value': [N]})
    S_df_tmp = pd.DataFrame({'Time' : [time],
                             'Value': [S]})
    # Resulting in two dataframes (DataFrame.append was removed in pandas 2.0).
    S_df = pd.concat([S_df, S_df_tmp])
    N_df = pd.concat([N_df, N_df_tmp])

But I feel there must be a simpler way to split the dataset in two, ideally without a loop.
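
One loop-free possibility is to mask the latitude axis once and average over the latitude and longitude axes for all time steps together. The sketch below uses synthetic arrays in place of the file contents; the array names, the grid, and the dates are assumptions for illustration, and the mean is unweighted, as in the loop version:

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins (assumed): 4 time steps, 5 latitudes, 3 longitudes
time = pd.date_range("2000-01-01", periods=4, freq="D")
latitude = np.array([60.0, 30.0, 0.0, -30.0, -60.0])
values = np.arange(4 * 5 * 3, dtype=float).reshape(4, 5, 3)

north = latitude >= 0
south = latitude < 0

# Average over axes 1 and 2 (latitude, longitude): one value per time step
N_df = pd.DataFrame({"Time": time,
                     "Value": values[:, north, :].mean(axis=(1, 2))})
S_df = pd.DataFrame({"Time": time,
                     "Value": values[:, south, :].mean(axis=(1, 2))})

print(N_df)
print(S_df)
```

Because the mask is built from the latitude values themselves, this does not depend on the grid containing a point at exactly 0 or on the order of the latitude axis.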

[Comments]:

Tags: python netcdf netcdf4

[Solution 1]:

You should take a look at xarray.

import xarray as xr
ds = xr.open_dataset("globe.nc")
northern = ds.sel(latitude=slice(0, 90))
southern = ds.sel(latitude=slice(-90, 0))

# if latitude in decreasing order:
# northern = ds.sel(latitude=slice(90, 0))
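
Label-based slices in sel include both endpoints, so with the slices above a grid point at exactly 0 would appear in both halves, and the slice bounds have to match the order of the latitude axis. A possible alternative is to select by condition instead. The sketch below builds a small in-memory dataset rather than opening "globe.nc"; the coordinate values and sizes are assumptions for illustration:

```python
import numpy as np
import xarray as xr

# Small synthetic dataset (assumed grid) with latitude in decreasing order
lat = np.array([60.0, 30.0, 0.0, -30.0, -60.0])
lon = np.array([0.0, 120.0, 240.0])
ds = xr.Dataset(
    {"Values": (("time", "latitude", "longitude"),
                np.zeros((2, 5, 3), dtype=np.int16))},
    coords={"time": [0, 1], "latitude": lat, "longitude": lon},
)

# Integer positions of the latitudes in each hemisphere
lat_values = ds["latitude"].values
north_idx = np.flatnonzero(lat_values >= 0)
south_idx = np.flatnonzero(lat_values < 0)

# Positional selection works regardless of the latitude ordering
northern = ds.isel(latitude=north_idx)
southern = ds.isel(latitude=south_idx)

print(northern["latitude"].values, southern["latitude"].values)
```

Here the equator is assigned to the northern half only, matching the >= 0 and < 0 conditions in the question, and the int16 data type is left unchanged.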

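If you need to write the results to files, see the to_netcdf() method. That said, command-line tools would be a better fit for this.

[Comments]: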