I have an xarray Dataset ds1 (it could just as well be a DataArray) that looks like this (the code that creates it is further below):
ds1
# <xarray.Dataset>
# Dimensions: (cell: 6, time: 1, band: 12)
# Coordinates:
# latitude (cell) float32 51.25 51.75 52.25 52.75 53.25 53.75
# longitude (cell) float32 7.75 7.75 7.75 7.75 7.75 7.75
# * cell (cell) int64 27410 27411 27412 27413 27414 27415
# * time (time) datetime64[ns] 2022-12-31
# * band (band) int64 0 1 2 3 4 5 6 7 8 9 10 11
# Data variables:
# landuse (cell, time, band) float64 0.2819 0.1255 0.5831 ... 0.7683 0.534
# harvest (cell, time) float64 1.0 1.0 1.0 1.0 1.0 1.0
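This layout is not a longitude/latitude matrix, so it cannot be written to NetCDF as 2D maps, and it does not allow selecting by latitude and longitude (nearest neighbour and so on). I would therefore like to convert it to a grid (a lat-lon matrix) like this: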
ds2
# <xarray.Dataset>
# Dimensions: (longitude: 720, latitude: 280, band: 12, time: 1)
# Coordinates:
# * longitude (longitude) float64 -179.8 -179.3 -178.8 ... 178.8 179.3 179.8
# * latitude (latitude) float64 -55.75 -55.25 -54.75 ... 82.75 83.25 83.75
# * band (band) int64 0 1 2 3 4 5 6 7 8 9 10 11
# * time (time) datetime64[ns] 2022-12-31
# Data variables:
# landuse (longitude, latitude, band, time) float64 nan nan nan ... nan nan
# harvest (longitude, latitude, time) float64 nan nan nan ... nan nan nan
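To make the goal concrete, here is a minimal sketch of the kind of nearest-neighbour selection that a gridded layout allows. The `grid` dataset is a hypothetical stand-in that uses the same `np.linspace` calls as ds2 in the code further below; the query point (7.8, 51.3) is made up for illustration.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for ds2: longitude and latitude are dimension coordinates.
lon = np.linspace(-179.8, 179.8, 720)
lat = np.linspace(-55.75, 83.75, 280)
grid = xr.Dataset(
    {"harvest": (("longitude", "latitude"), np.zeros((720, 280)))},
    coords={"longitude": lon, "latitude": lat},
)

# Because longitude/latitude are indexed dimensions, .sel can look up
# the grid cell closest to an arbitrary point.
point = grid.sel(longitude=7.8, latitude=51.3, method="nearest")
print(float(point.longitude), float(point.latitude))
```

The same call on ds1 would not work, because there latitude and longitude are plain coordinates along `cell` rather than indexed dimensions.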
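The problem is that I cannot find a way to convert it properly.
Here is example code that generates both datasets, followed by my earlier attempt at the conversion: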
import numpy as np
import xarray as xr
# My output data set --------------------------------------------------------- #
# Define coordinates
cell = np.arange(27410, 27416)
lat = np.array([51.25, 51.75, 52.25, 52.75, 53.25, 53.75], dtype=np.float32)
lon = np.full(6, 7.75, dtype=np.float32)
band = np.arange(12)
time = np.array(['2022-12-31'], dtype='datetime64[ns]')
# Create empty dataset
ds1 = xr.Dataset()
ds1 = ds1.assign_coords(latitude=('cell', lat),
                        longitude=('cell', lon),
                        cell=cell,
                        time=time,
                        band=band)
ds1 = ds1.set_coords(['latitude', 'longitude'])
# Create empty data variables
landuse = xr.DataArray(
    np.full((6, 1, 12), np.nan),
    dims=['cell', 'time', 'band']
)
harvest = xr.DataArray(np.full((6, 1), np.nan), dims=['cell', 'time'])
# Assign data variables to dataset
ds1['landuse'] = landuse
ds1['harvest'] = harvest
# Assign values to the data variables
ds1['landuse'][:, :, :] = np.random.rand(6, 1, 12)
ds1['harvest'][:, :] = np.random.randint(2, size=(6, 1))
# Dataset with latitude and longitude matrix --------------------------------- #
# Define dimensions and coordinates
lon = np.linspace(-179.8, 179.8, 720)
lat = np.linspace(-55.75, 83.75, 280)
# Create empty dataset
ds2 = xr.Dataset()
ds2 = ds2.assign_coords(longitude=lon,
                        latitude=lat,
                        band=band,
                        time=time)
landuse = xr.DataArray(np.full((720, 280, 12, 1), np.nan),
                       dims=['longitude', 'latitude', 'band', 'time'])
harvest = xr.DataArray(np.full((720, 280, 1), np.nan),
                       dims=['longitude', 'latitude', 'time'])
ds2['landuse'] = landuse
ds2['harvest'] = harvest
My earlier attempt at converting it:
ds3 = ds2.where(
    # compare latitude with latitude (the second term previously used ds2.longitude)
    (ds1.longitude == ds2.longitude) & (ds1.latitude == ds2.latitude),
    ds1
)
The result looks like this:
ds3
# <xarray.Dataset>
# Dimensions: (longitude: 720, latitude: 280, band: 12, time: 1, cell: 6)
# Coordinates:
# * longitude (longitude) float64 -179.8 -179.3 -178.8 ... 178.8 179.3 179.8
# * latitude (latitude) float64 -55.75 -55.25 -54.75 ... 82.75 83.25 83.75
# * band (band) int64 0 1 2 3 4 5 6 7 8 9 10 11
# * time (time) datetime64[ns] 2022-12-31
# * cell (cell) int64 27410 27411 27412 27413 27414 27415
# Data variables:
# landuse (longitude, latitude, band, time, cell) float64 0.9636 ... 0.3648
# harvest (longitude, latitude, time, cell) float64 1.0 1.0 0.0 ... 0.0 0.0
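A likely reason the `cell` dimension survives: xarray broadcasts by dimension name, so comparing a coordinate that lives on `cell` with one that lives on `longitude` yields a mask over both dimensions, and `where` then carries `cell` into the result. A minimal sketch with made-up values (the arrays `a` and `b` are hypothetical, not taken from ds1 or ds2):

```python
import numpy as np
import xarray as xr

# Coordinate values attached to two different dimensions.
a = xr.DataArray(np.array([7.75, 7.75]), dims="cell")
b = xr.DataArray(np.array([7.25, 7.75, 8.25]), dims="longitude")

# The comparison is evaluated for every (cell, longitude) pair,
# not element by element along one shared axis.
mask = a == b
print(mask.dims)   # ('cell', 'longitude')
print(mask.shape)  # (2, 3)
```

So the comparison behaves like an outer product over the two dimensions rather than a lookup of each cell's position in the grid.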
But ideally it should look like this (without the cell dimension):
# <xarray.Dataset>
# Dimensions: (longitude: 720, latitude: 280, band: 12, time: 1)
# Coordinates:
# * longitude (longitude) float64 -179.8 -179.3 -178.8 ... 178.8 179.3 179.8
# * latitude (latitude) float64 -55.75 -55.25 -54.75 ... 82.75 83.25 83.75
# * band (band) int64 0 1 2 3 4 5 6 7 8 9 10 11
# * time (time) datetime64[ns] 2022-12-31
# Data variables:
# landuse (longitude, latitude, band, time) float64 ...
# harvest (longitude, latitude, time) float64 ...
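I am looking for a proper solution that is also shorter. Ideally, only the cell dimension should be swapped for longitude and latitude (possibly expanded to the full longitude and latitude ranges).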
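One possible approach, offered as a sketch rather than a definitive answer: turn the `cell` dimension into a (longitude, latitude) MultiIndex with `set_index`, `unstack` it into two dimensions, and then `reindex` onto the full grid. The sketch rebuilds a small stand-in for ds1 so that it runs on its own. It assumes a regular 0.5 degree grid whose cell centres fall on .25/.75, built here from -179.75 to 179.75; this is an assumption, chosen because `np.linspace(-179.8, 179.8, 720)` as written in the code above does not contain 7.75 exactly.

```python
import numpy as np
import xarray as xr

# Stand-in for ds1: six cells on one meridian (random values are illustrative).
rng = np.random.default_rng(0)
ds1 = xr.Dataset(
    {
        "landuse": (("cell", "time", "band"), rng.random((6, 1, 12))),
        "harvest": (("cell", "time"), np.ones((6, 1))),
    },
    coords={
        "cell": np.arange(27410, 27416),
        "latitude": ("cell", np.arange(51.25, 54.0, 0.5)),
        "longitude": ("cell", np.full(6, 7.75)),
        "time": np.array(["2022-12-31"], dtype="datetime64[ns]"),
        "band": np.arange(12),
    },
)

# Assumed target grid: 0.5 degree cell centres (see the note above).
lon_grid = np.linspace(-179.75, 179.75, 720)
lat_grid = np.linspace(-55.75, 83.75, 280)

gridded = (
    ds1.drop_vars("cell")                          # drop the integer cell ids
    .set_index(cell=["longitude", "latitude"])     # cell -> (lon, lat) MultiIndex
    .unstack("cell")                               # MultiIndex -> two dimensions
    .reindex(longitude=lon_grid, latitude=lat_grid)  # pad with NaN to full grid
    .transpose("longitude", "latitude", "band", "time")
)
print(gridded)
```

If the cell coordinates do not coincide exactly with the grid values, `reindex` also accepts `method="nearest"` together with a `tolerance`, which may be a better fit; whether that is appropriate depends on how the real grid is defined.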