land-sea mask

Question

yuliang954 opened this issue 2 months ago

Hi NeuralGCM team,

I wonder how the land-sea mask is specified in NeuralGCM. For example, in the 1.4-deg deterministic model, is it based on the NaN values after regridding the 0.25-deg resolution ERA5 SST using xarray_utils.regrid?

I noticed that, after regridding, the very first inference demo yielded quite different NaN locations than the current demo does. The current demo has more ocean (i.e., non-NaN) area. Maybe you changed xarray_utils.regrid a little?

Thanks!
Yu

yuliang954 commented 2 months ago

Thanks!

Stephan Hoyer · Answer 1 · Sat May 18 2024 01:52:02 GMT+0800 (China Standard Time)

Our trained models used regridding as shown in the current online docs/demo notebook.

The original demo notebook used a mistaken setting (horizontal_interpolation.ConservativeRegridder with skipna=False), which resulted in NaN values along coastlines that then needed to be filled with nearest-neighbor values.
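To illustrate why the skipna setting changes where NaNs end up, here is a minimal NumPy sketch. It is not the NeuralGCM or dinosaur implementation: it assumes a toy equal-weight block average in place of true area-weighted conservative regridding, and the function name block_mean and the 4x4 field are hypothetical.

```python
import numpy as np

def block_mean(field, factor, skipna):
    """Coarsen a 2D field by averaging non-overlapping factor x factor blocks.

    Toy stand-in for conservative regridding (equal weights, no cell areas).
    """
    ny, nx = field.shape
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    if not skipna:
        # Any NaN inside a block makes the whole coarse cell NaN.
        return blocks.mean(axis=(1, 3))
    valid = ~np.isnan(blocks)
    total = np.where(valid, blocks, 0.0).sum(axis=(1, 3))
    count = valid.sum(axis=(1, 3))
    # Average over valid points only; NaN only if the block has no valid point.
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)

# Fine-resolution SST in kelvin; NaN marks land.
sst = np.array([
    [290.0, 290.0, 291.0, np.nan],
    [290.0, 290.0, np.nan, np.nan],
    [289.0, 289.0, np.nan, np.nan],
    [289.0, 289.0, np.nan, np.nan],
])

strict = block_mean(sst, 2, skipna=False)   # coastal cell becomes NaN
lenient = block_mean(sst, 2, skipna=True)   # coastal cell keeps ocean value
```

In this toy case the coastal coarse cell (top right) is NaN with skipna=False but keeps the value of its single ocean point with skipna=True, which is consistent with the current demo showing more non-NaN area.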

I wrote a detailed guide to regridding here. Please take a look and let me know if you still have questions.

yuliang954 · Answer 2 · Sat May 18 2024 02:38:41 GMT+0800 (China Standard Time)

This information is very helpful, Shoyer! I have a question about the guide where you mentioned that "NeuralGCM’s surface model also includes a mask that ignores values over land." After regridding the ERA5 data to the targeted Gaussian grid, do the land values or locations that will be ignored correspond to NaNs?

Stephan Hoyer · Answer 3 · Sat May 18 2024 05:46:38 GMT+0800 (China Standard Time)

> I have a question about the guide where you mentioned that "NeuralGCM’s surface model also includes a mask that ignores values over land." After regridding the ERA5 data to the targeted Gaussian grid, do the land values or locations that will be ignored correspond to NaNs?

In principle, the only locations where the sea surface temperature input is taken into account are those where the land/sea mask is less than 1, so the details of the NaN filling should not matter.

In practice, the model does seem to make slightly different predictions when NaN values are filled in an inconsistent fashion. I don't think we ever quite tracked down why this is the case. Possibly there are some inconsistencies between ERA5's land/sea mask and its SST or sea ice fields.
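The "in principle" argument can be sketched in a few lines of NumPy. This is only an illustration of the masking idea, not how NeuralGCM applies its mask internally; the helper masked_sst_input, the fill values, and the convention that the mask holds the land fraction (1 = all land) are assumptions.

```python
import numpy as np

# Assumed convention: land fraction per cell, 1.0 = entirely land.
land_sea_mask = np.array([0.0, 0.3, 1.0, 1.0])
# Regridded SST in kelvin; NaN where no ocean data was available.
sst = np.array([290.0, 288.0, np.nan, np.nan])

def masked_sst_input(sst_filled, mask):
    """Keep SST only where the mask is below 1; zero it elsewhere (hypothetical)."""
    return np.where(mask < 1, sst_filled, 0.0)

# Two different ways of filling the NaNs over land.
fill_a = np.where(np.isnan(sst), 288.0, sst)    # e.g. a nearest-neighbor value
fill_b = np.where(np.isnan(sst), 273.15, sst)   # e.g. an arbitrary constant

input_a = masked_sst_input(fill_a, land_sea_mask)
input_b = masked_sst_input(fill_b, land_sea_mask)
same_input = np.array_equal(input_a, input_b)
```

Under these assumptions both fills give identical masked input. The slightly different predictions seen in practice mean the real model does not behave exactly like this idealized picture, so filling NaNs the same way as in training remains the safer choice.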

yuliang954 · Answer 4 · Sat May 18 2024 06:16:47 GMT+0800 (China Standard Time)

I am asking because I am trying to couple NeuralGCM with a statistical SST model, and I need to know the exact ocean grids from which NeuralGCM reads SST information.

So, after regridding the 0.25-degree ERA5 SST data to the targeted Gaussian grid, are the grids with NaN values actually a subset of the land grids? In other words, do the grids with non-NaN values include all ocean grids and some coastal land grids?

If this is the case, I can provide NeuralGCM with SST on all grids with non-NaN values, since NeuralGCM won't read the coastal land grids anyway.

Stephan Hoyer · Answer 5 · Sat May 18 2024 06:20:16 GMT+0800 (China Standard Time)

> So, after regridding the 0.25-degree ERA5 SST data to the targeted Gaussian grid, are the grids with NaN values actually a subset of the land grids? In other words, do the grids with non-NaN values include all ocean grids and some coastal land grids?

Yes, this is correct.
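For the coupling use case, the subset relationship can be checked and used roughly as follows. This is a hedged NumPy sketch on toy 1D arrays: the helper names, the sample values, and the convention that "land" means a land/sea mask value of 1 are assumptions, not part of the NeuralGCM API.

```python
import numpy as np

def nan_cells_are_land(sst, land_sea_mask):
    """True if every NaN SST cell is a pure-land cell (mask >= 1, assumed)."""
    return bool(np.all(land_sea_mask[np.isnan(sst)] >= 1))

def insert_model_sst(era5_sst, model_sst):
    """Use model SST wherever regridded ERA5 SST is defined; keep NaN elsewhere."""
    return np.where(np.isnan(era5_sst), np.nan, model_sst)

# Toy regridded ERA5 SST (kelvin) and land fraction on the target grid.
era5_sst = np.array([290.0, 289.0, 288.0, np.nan, np.nan])
land_sea_mask = np.array([0.0, 0.2, 1.0, 1.0, 1.0])  # third cell: coastal land, non-NaN

# Hypothetical output of a statistical SST model on the same grid.
model_sst = np.array([291.0, 290.0, 289.0, 288.0, 287.0])

subset_ok = nan_cells_are_land(era5_sst, land_sea_mask)
coupled_sst = insert_model_sst(era5_sst, model_sst)

# Cells the model would read SST from, under the mask < 1 rule.
ocean_cells = land_sea_mask < 1
all_ocean_covered = not np.isnan(coupled_sst[ocean_cells]).any()
```

The remaining NaN cells would still need to be filled the same way as in the regridding guide before the field is passed to the model.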