pydata / xarray

N-D labeled arrays and datasets in Python

Home Page: https://xarray.dev

DataTree.identical should consider checking that coordinates are defined at the same level

shoyer opened this issue

What is your issue?

Consider the following two data trees:

import xarray as xr
from xarray import DataTree

one = DataTree.from_dict(
    {
        "/": xr.Dataset(coords={"a": 1}),
        "/b": xr.Dataset(coords={"a": 1}),
    }
)
two = DataTree.from_dict(
    {
        "/": xr.Dataset(coords={"a": 1}),
        "/b": xr.Dataset(),
    }
)

Currently, DataTree.identical() considers one and two to be identical, even though the coordinate a is defined redundantly in the first tree (on both the root and /b).

For most purposes, these DataTree objects are identical, but there are edge cases where they could behave differently, e.g., when writing these trees to netCDF or Zarr.

It would be helpful if .identical() checked for such discrepancies, e.g., to facilitate writing unit tests as in #9214.
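
Until then, a test helper along these lines could catch the discrepancy. This is only a rough sketch, not a proposal for how .identical() should be implemented. The names local_coord_names and coords_defined_at_same_level are hypothetical, and the sketch assumes a recent xarray in which DataTree.to_dataset() accepts inherit=False to return only what is stored on the node itself.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for the annotations below
    import xarray as xr


def local_coord_names(tree: xr.DataTree) -> dict[str, set[str]]:
    """Map each node path to the coordinate names stored on that node.

    Hypothetical helper. Assumes to_dataset(inherit=False) leaves out
    coordinates that come from parent nodes.
    """
    return {
        node.path: {str(name) for name in node.to_dataset(inherit=False).coords}
        for node in tree.subtree
    }


def coords_defined_at_same_level(left: xr.DataTree, right: xr.DataTree) -> bool:
    """Return True if both trees define the same coordinate names on the same nodes.

    Hypothetical helper. It compares names only, not values, so it is meant
    to be used alongside .identical(), not instead of it.
    """
    return local_coord_names(left) == local_coord_names(right)
```

For the two trees above, local_coord_names(one) lists a under both / and /b, while local_coord_names(two) lists it only under /, so coords_defined_at_same_level(one, two) is False.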