Managing Information Loss with xarray operations

Operations on a dataset can silently drop important metadata. The pieces most worth tracking are the attributes, the encoding, the nodata value, and the CRS.

Note that write_transform is only needed when the x and y coordinates are not saved with the data. It lets GDAL read the transform without the original coordinates, which is useful if the file was opened with parse_coordinates=False.

[1]:
import rioxarray
import xarray

See the documentation for rioxarray.open_rasterio.

[2]:
rds = rioxarray.open_rasterio(
    "../../test/test_data/input/PLANET_SCOPE_3D.nc",
    variable=["green"],
    mask_and_scale=True,
)

Notice the original data:

[3]:
rds.green.attrs, rds.green.encoding, rds.green.rio.crs, rds.green.rio.nodata
[3]:
({'nodata': 0, 'units': ('DN', 'DN')},
 {'dtype': 'float64',
  'grid_mapping': 'spatial_ref',
  'scale_factor': 1.0,
  'add_offset': 0.0,
  '_FillValue': nan,
  'source': 'netcdf:../../test/test_data/input/PLANET_SCOPE_3D.nc:green'},
 CRS.from_epsg(32722),
 nan)

Notice that the attributes, encoding, and nodata value are lost after the operation, while the CRS is retained:

[4]:
new_ds = rds.green + rds.green
new_ds.attrs, new_ds.encoding, new_ds.rio.crs, new_ds.rio.nodata
[4]:
({}, {}, CRS.from_epsg(32722), None)

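To preserve attributes, xarray provides set_options with keep_attrs=True. However, this does not preserve the encoding. Because the encoded _FillValue is gone, rio.nodata falls back to the nodata attribute (0.0) instead of NaN.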

[5]:
with xarray.set_options(keep_attrs=True):
    new_ds = rds.green + rds.green
new_ds.attrs, new_ds.encoding, new_ds.rio.crs, new_ds.rio.nodata
[5]:
({'nodata': 0, 'units': ('DN', 'DN')}, {}, CRS.from_epsg(32722), 0.0)
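
The same behavior can be reproduced with plain xarray on a small synthetic array. This is a sketch only: the array below is an illustrative stand-in for rds.green, and keep_attrs is set explicitly in both cases so the result does not depend on the default of the installed xarray version.

```python
import numpy as np
import xarray

# Synthetic stand-in for rds.green; values and metadata are illustrative.
band = xarray.DataArray(
    np.arange(4.0).reshape(2, 2),
    dims=("y", "x"),
    attrs={"nodata": 0, "units": "DN"},
)
band.encoding = {"_FillValue": np.nan, "dtype": "float64"}

# Without keep_attrs, arithmetic drops both attrs and encoding.
with xarray.set_options(keep_attrs=False):
    dropped = band + band

# With keep_attrs, attrs survive but the encoding is still dropped.
with xarray.set_options(keep_attrs=True):
    kept = band + band

print(dropped.attrs, dropped.encoding)
print(kept.attrs, kept.encoding)
```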

Another solution is to copy the original CRS, attributes, and encoding onto the result once the operation is complete:

[6]:
new_ds = rds.green + rds.green
new_ds.rio.write_crs(rds.green.rio.crs, inplace=True)
new_ds.rio.update_attrs(rds.green.attrs, inplace=True)
new_ds.rio.update_encoding(rds.green.encoding, inplace=True)
new_ds.attrs, new_ds.encoding, new_ds.rio.crs, new_ds.rio.nodata
[6]:
({'nodata': 0, 'units': ('DN', 'DN')},
 {'grid_mapping': 'spatial_ref',
  'dtype': 'float64',
  'scale_factor': 1.0,
  'add_offset': 0.0,
  '_FillValue': nan,
  'source': 'netcdf:../../test/test_data/input/PLANET_SCOPE_3D.nc:green'},
 CRS.from_epsg(32722),
 nan)
[7]:
new_ds.rio.to_raster("combination_keep_attrs.tif")
[8]:
!rio info combination_keep_attrs.tif
{"bounds": [466266.0, 8084670.0, 466296.0, 8084700.0], "colorinterp": ["gray", "undefined"], "count": 2, "crs": "EPSG:32722", "descriptions": ["green", "green"], "driver": "GTiff", "dtype": "float64", "height": 10, "indexes": [1, 2], "interleave": "pixel", "lnglat": [-51.31732641226951, -17.322997474192466], "mask_flags": [["nodata"], ["nodata"]], "nodata": NaN, "res": [3.0, 3.0], "shape": [10, 10], "tiled": false, "transform": [3.0, 0.0, 466266.0, 0.0, -3.0, 8084700.0, 0.0, 0.0, 1.0], "units": [null, null], "width": 10}