google-research / neuralgcm

Hybrid ML + physics model of the Earth's atmosphere

Home Page: https://neuralgcm.readthedocs.io

land-sea mask

yuliang954 opened this issue

Hi NeuralGCM team,

I wonder how the land-sea mask is specified in NeuralGCM. For example, in the 1.4-deg deterministic model, is it based on the NaN values after regridding the 0.25-deg resolution ERA5 SST using xarray_utils.regrid?

I noticed that, after regridding, the very first inference demo produced NaN values in quite different locations than the current demo does. The current demo has more ocean (i.e., non-NaN) area. Maybe you changed xarray_utils.regrid a little?

Thanks!
Yu

Our trained models used regridding as shown in the current online docs/demo notebook.

The original demo notebook used a mistaken setting (horizontal_interpolation.ConservativeRegridder with skipna=False), which produced NaN values along coastlines that then had to be filled with nearest-neighbor values.
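As a rough illustration of why the skipna setting changes where NaNs end up, here is a toy NumPy sketch. It is not the NeuralGCM regridder: `block_mean` is a hypothetical helper that averages fixed-size blocks of a 1-D array, standing in for conservative regridding from a fine grid to a coarse one.

```python
import numpy as np

def block_mean(fine, factor, skipna):
    """Average consecutive blocks of `factor` fine cells into one coarse cell.

    Hypothetical stand-in for conservative regridding; not a NeuralGCM API.
    """
    blocks = fine.reshape(-1, factor)
    if skipna:
        valid = ~np.isnan(blocks)
        total = np.where(valid, blocks, 0.0).sum(axis=1)
        count = valid.sum(axis=1)
        # NaN only where no fine cell in the block had data.
        return np.where(count > 0, total / np.maximum(count, 1), np.nan)
    # Any NaN in a block makes the whole coarse cell NaN.
    return blocks.mean(axis=1)

# Fine-grid SST: three ocean cells followed by three land (NaN) cells.
fine_sst = np.array([290.0, 291.0, 292.0, np.nan, np.nan, np.nan])

coarse_strict = block_mean(fine_sst, 2, skipna=False)
coarse_skip = block_mean(fine_sst, 2, skipna=True)
```

In this toy case the coastal coarse cell (one ocean and one land fine cell) is NaN without NaN skipping but keeps the ocean value with it, which matches the observation that the current demo has more non-NaN area.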

I wrote a detailed guide to regridding here. Please take a look and let me know if you still have questions.

This information is very helpful, Shoyer! I have a question about the guide where you mentioned that "NeuralGCM’s surface model also includes a mask that ignores values over land." After regridding the ERA5 data to the targeted Gaussian grid, do the land values or locations that will be ignored correspond to NaNs?

> I have a question about the guide where you mentioned that "NeuralGCM’s surface model also includes a mask that ignores values over land." After regridding the ERA5 data to the targeted Gaussian grid, do the land values or locations that will be ignored correspond to NaNs?

In principle, the only locations that take the sea surface temperature input into account are those where the land/sea mask is less than 1, so the details of the NaN filling should not matter.

In practice, the model does seem to make slightly different predictions when NaN values are filled in an inconsistent fashion. I don't think we ever quite tracked down why this is the case; possibly there are some inconsistencies between ERA5's land/sea mask and its SST or sea ice fields.
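The "in principle" argument can be sketched with plain NumPy. The arrays below are made-up values, and the sketch assumes that NaNs fall only on cells whose land/sea mask equals 1; under that assumption, two different fill strategies agree on every cell where the mask is less than 1.

```python
import numpy as np

# Made-up 1-D slice: fraction of land in each cell (1 means entirely land).
land_sea_mask = np.array([0.0, 0.0, 0.4, 1.0, 1.0])
# Regridded SST; NaN only on fully-land cells (an assumption of this sketch).
sst = np.array([290.0, 291.0, 292.0, np.nan, np.nan])

# Cells where SST input can matter, per the statement above.
reads_sst = land_sea_mask < 1

# Two different ways of filling the NaNs.
fill_constant = np.where(np.isnan(sst), 273.15, sst)
fill_mean = np.where(np.isnan(sst), np.nanmean(sst), sst)

# The fills differ only on fully-land cells, never where SST is read.
same_on_ocean = np.array_equal(fill_constant[reads_sst], fill_mean[reads_sst])
```

This does not explain the small differences seen in practice; it only shows what the mask implies if the SST field and the mask are consistent with each other.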

I am asking because I am trying to couple NeuralGCM with a statistical SST model, and I need to know the exact ocean grids from which NeuralGCM reads SST information.

So, after regridding the 0.25-degree ERA5 SST data to the targeted Gaussian grid, are the grids with NaN values actually a subset of the land grids? In other words, do the grids with non-NaN values include all ocean grids and some coastal land grids?

If this is the case, I can provide NeuralGCM with SST on all grids with non-NaN values, since NeuralGCM won't read the coastal land grids anyway.

> So, after regridding the 0.25-degree ERA5 SST data to the targeted Gaussian grid, are the grids with NaN values actually a subset of the land grids? In other words, do the grids with non-NaN values include all ocean grids and some coastal land grids?

Yes, this is correct.
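For the coupling use case, a minimal sketch of that check might look like the following. `sst_update_cells` is a hypothetical helper, the arrays are made-up values, and `predicted` stands in for the output of a statistical SST model; none of this is a NeuralGCM API.

```python
import numpy as np

def sst_update_cells(regridded_sst, land_sea_mask):
    """Return a boolean array of cells to supply SST for.

    Raises if any NaN sits on a cell that is not fully land, i.e. if the
    NaN cells are not a subset of the land cells.
    """
    is_nan = np.isnan(regridded_sst)
    if np.any(is_nan & (land_sea_mask < 1)):
        raise ValueError("NaN SST found on a cell that is not fully land")
    return ~is_nan

# Made-up example: ocean, coast, coastal land with a value, interior land.
land_sea_mask = np.array([0.0, 0.3, 1.0, 1.0])
regridded_sst = np.array([289.0, 290.0, 288.5, np.nan])

cells = sst_update_cells(regridded_sst, land_sea_mask)

# Stand-in for a statistical SST model's prediction on the same grid.
predicted = np.full(4, 291.0)

# Overwrite every non-NaN cell; NaN cells are left untouched.
coupled_sst = np.where(cells, predicted, regridded_sst)
```

The third cell is fully land but still receives a value, which is harmless if the model only reads SST where the mask is less than 1.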

Thanks!