NVIDIA / earth2mip

Earth-2 Model Intercomparison Project (MIP) is a python framework that enables climate researchers and scientists to inter-compare AI models for weather and climate.

Home Page: https://nvidia.github.io/earth2mip/

πŸ›[BUG][Feature Request]: Perturbing channels that are not included in `earth2mip/_channel_stds.py`

ankurmahesh opened this issue · comments

Version

source - main

On which installation method(s) does this occur?

Source

Describe the issue

When channels are perturbed, the perturbations are multiplied by `scale`:

https://github.com/NVIDIA/earth2mip/blob/main/earth2mip/inference_ensemble.py#L246-L253

However, `scale` is determined from the values in this file, not from the scales stored in the model's `Inference` object:

https://github.com/NVIDIA/earth2mip/blob/main/earth2mip/_channel_stds.py

I use a dataset that uses q, not r. Therefore, by the logic in the code segment above, all of the q perturbations are set to 0 because q is not in `_channel_stds.py`. This wasn't my intended behavior: I intended to perturb q.
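To illustrate, here is a minimal sketch of the behavior described above; the table contents and variable names are stand-ins, not the actual values or code in `inference_ensemble.py`:

```python
# Stand-in for the table in earth2mip/_channel_stds.py (values are illustrative).
channel_stds = {"t2m": 5.0, "r500": 25.0}

channels_to_perturb = ["t2m", "q500"]  # q500 is not in the table

# Channels missing from the table get a scale of 0, so their
# perturbations are silently zeroed out.
scale = [channel_stds.get(c, 0.0) for c in channels_to_perturb]
print(scale)  # [5.0, 0.0] -> the q500 perturbation is multiplied by 0
```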

Three possible solutions:

  1. We delete `_channel_stds.py` and instead use `model.scale` in the perturbation. The perturb method has access to the model, and `Inference` models have a `scale` attribute.
  2. If we keep the current logic, maybe we could add a logger warning that says "X channel is not perturbed" (see the sketch after this list). As more models get trained, I think it's likely that more variables will be added (e.g. more vertical levels). Since the scales are drawn from a separate file rather than from the `scales.npy` file in the model package, I think an alert would be useful.
  3. I could just add the q scales to `_channel_stds.py`, but I would still recommend (2) above as well.
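
A sketch of what option (2) could look like; the helper name `get_perturbation_scales` is hypothetical, and the table argument stands in for the contents of `earth2mip/_channel_stds.py`:

```python
import logging

logger = logging.getLogger(__name__)


def get_perturbation_scales(channels, channel_stds):
    """Hypothetical helper: look up per-channel scales, warning on misses."""
    scales = []
    for channel in channels:
        if channel not in channel_stds:
            # Option (2): make the silent zeroing visible to the user.
            logger.warning(
                "Channel %s is not in _channel_stds.py and will not be perturbed.",
                channel,
            )
        scales.append(channel_stds.get(channel, 0.0))
    return scales
```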

Environment details

I am running from source, currently at commit 86b11fe.

> 1. We delete the _channel_stds.py and instead use the model.scale in the perturbation.

Not all models have `.scale`. It is not part of the `TimeLoop` interface defined here:

```python
class TimeLoop(Protocol):
    """Abstract protocol that a custom time loop must follow"""
```

This is why we took steps to decouple the initialization from the model. There are also many potential ways to scale perturbations, e.g. scaling by climatological variance, and these should be applicable across models (even ones which don't have `.scale`).

Overall, these perturbation methods could use an overhaul and a refactor to a more modular design, e.g. one class per method. My solution would be to ask ChatGPT: "please refactor this if-statement with many clauses into one class per clause; each class should take a dictionary of channel scales as an argument to its constructor".
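For example, a minimal sketch of that one-class-per-method design (the class name, call signature, and default amplitude are illustrative, not existing earth2mip API):

```python
import torch


class GaussianChannelPerturbation:
    """Hypothetical perturbation class that takes its channel scales
    as a constructor argument rather than reading _channel_stds.py."""

    def __init__(self, channel_scales: dict, amplitude: float = 0.05):
        self.channel_scales = channel_scales
        self.amplitude = amplitude

    def __call__(self, x: torch.Tensor, channels: list) -> torch.Tensor:
        # x is assumed to be shaped (..., channel, lat, lon).
        scale = torch.tensor(
            [self.channel_scales.get(c, 0.0) for c in channels],
            dtype=x.dtype,
            device=x.device,
        ).view(-1, 1, 1)
        return x + self.amplitude * scale * torch.randn_like(x)
```

Other scaling strategies, e.g. scaling by climatological variance, would then just be additional classes with the same call signature, independent of whether a model exposes `.scale`.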

For now, (2) and (3) above would be nice.