omni-us / jsonargparse

Implement minimal boilerplate CLIs derived from type hints and parse from command line, config files and environment variables

Home Page: https://jsonargparse.readthedocs.io

What's the best way to do backwards compatibility for existing configs?

carmocca opened this issue · comments

If I have a CLI implementation (before.py) with a Foo.a argument:

class Foo:
    def __init__(self, a=2):
        ...

def fn(foo: Foo = Foo()):
    ...

from jsonargparse import ArgumentParser, ActionConfigFile

parser = ArgumentParser()
parser.add_argument("-c", "--config", action=ActionConfigFile)
parser.add_function_arguments(fn)
args = parser.parse_args()

$ python before.py --print_config > a.yaml

Where a.yaml is:

foo:
  class_path: __main__.Foo
  init_args:
    a: 2

And then create an after.py file whose only difference is renaming a to b:

$ diff before.py after.py 
2c2
<     def __init__(self, a=2):
---
>     def __init__(self, b=2):

What's the best or recommended way to support loading a.yaml with after.py, where a is remapped to b? It currently fails as expected with:

$ python after.py --config a.yaml
usage: after.py [-h] [-c CONFIG] [--print_config[=flags]] [--foo.help CLASS_PATH_OR_NAME] [--foo FOO]
error: Parser key "foo":
  Problem with given class_path '__main__.Foo':
    Validation failed: No action for key "a" to check its value.

Context

This is currently blocking a rename in Lightning-AI/litgpt#1156

My ideal solution would keep the CLI in sync with the source code rather than invent something specific to CLIs. That is, it would be nice to adopt some deprecation standard, so that deprecation decorators are added to the code and jsonargparse runs additional logic based on them. For example, if a parameter a is renamed to b, jsonargparse would still accept a, and if an input a is given, it would be converted to b when serializing. Once the deprecation decorators are removed, the deprecation logic in the CLI would be removed automatically as well. Unfortunately, as far as I know, there is no such deprecation standard to adopt. There is PEP 702, but it does not cover deprecation of function parameters.

In projects that I have worked on, there are three approaches we have taken, though they haven't used specific features of jsonargparse.

  1. Have migration scripts which get run when the software is upgraded to a new version. This means manually writing a script that converts config files from the old structure to the new one and writes them to disk.
  2. Have on-the-fly migrations, which get run every time, so old configs are supported but the converted result is not persisted.
  3. The third approach, which is specific to parameters, is to add the new parameter name and **kwargs to the signature. Then inside, do old_name_value = kwargs.pop("old_name"), and probably implement some code so that a deprecation warning is printed if old_name is used. The point of **kwargs is just to prevent old parameter names from showing up in the docs. This is probably not a good solution, because when serializing a config, both the old and new names would be included.
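As a rough illustration of the third approach, here is a minimal sketch in plain Python that reuses the Foo class and the a to b rename from the question. The warning text and the check for leftover keyword arguments are assumptions added for the example, and how jsonargparse itself treats the **kwargs in the signature is not covered here.

```python
import warnings


class Foo:
    def __init__(self, b=2, **kwargs):
        # "a" is the old name of "b". It is accepted only through **kwargs,
        # so it does not appear in the documented signature.
        if "a" in kwargs:
            warnings.warn(
                "'a' is deprecated, use 'b' instead",
                DeprecationWarning,
                stacklevel=2,
            )
            b = kwargs.pop("a")
        # Anything still left in kwargs is a genuine mistake by the caller.
        if kwargs:
            raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
        self.b = b
```

With this, Foo(a=5) still works and stores the value under b while emitting a DeprecationWarning, and Foo(b=5) is silent.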

What to recommend, I am not sure. But we can brainstorm further.

2) is what I would prefer. But how does one integrate it into the parser? Is there an extension point in ActionConfigFile so that an arbitrary transformation can be applied to the config file before it is validated against the parser types?

Currently there isn't such a feature. But note that it isn't just ActionConfigFile. People could provide from the command line --old_name=value. Also old_name could be nested in a group which can be provided by the user in a subconfig, which is not handled by ActionConfigFile. The migration function would need to handle all cases: getting a full config, getting a subconfig, getting a single value. If there are subcommands, then it is one more case to handle.

Actually, there is a feature that could be used, though it wasn't intended for this: custom loaders.

You can implement a custom loader. The default yaml_load has some logic which avoids weird behaviors of pyyaml. Your custom loader can simply call the default one, and then add logic on top. A caveat is that yaml_load is not part of the public API.

All parsed values go through the loader. So depending on what is loaded, the function would need to decide if a transformation is required.
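A minimal sketch of that idea, under stated assumptions: the loader wrapper below takes any base loader, applies a rename to what it returns, and passes non-dict values through unchanged, since single values also reach the loader. The RENAMES table, migrate and make_migrating_loader are hypothetical names invented for this example, and json.loads stands in for the default YAML loader only so that the sketch has no dependencies.

```python
import json
from typing import Any, Callable

# Hypothetical rename table: class path -> {old init arg: new init arg}.
RENAMES = {"__main__.Foo": {"a": "b"}}


def migrate(value: Any) -> Any:
    """Recursively rename deprecated init_args; leave everything else as is."""
    if isinstance(value, list):
        return [migrate(v) for v in value]
    if not isinstance(value, dict):
        return value  # single values such as 2 or "text" pass through
    value = {k: migrate(v) for k, v in value.items()}
    class_path = value.get("class_path")
    init_args = value.get("init_args")
    renames = RENAMES.get(class_path) if isinstance(class_path, str) else None
    if renames and isinstance(init_args, dict):
        for old, new in renames.items():
            # Only rename when the new name is not already given.
            if old in init_args and new not in init_args:
                init_args[new] = init_args.pop(old)
    return value


def make_migrating_loader(base_loader: Callable[[str], Any]) -> Callable[[str], Any]:
    """Wrap a loader so that its result goes through migrate()."""
    def loader(stream: str) -> Any:
        return migrate(base_loader(stream))
    return loader


# Stand-in base loader; a real setup would wrap the default YAML loader.
load = make_migrating_loader(json.loads)
old = '{"foo": {"class_path": "__main__.Foo", "init_args": {"a": 2}}}'
print(load(old))
```

To hook this into jsonargparse, the wrapped loader would presumably be registered through the custom loaders feature, wrapping the library's default YAML loader instead of json.loads; that wiring is not shown or tested here. Note that the sketch only recognizes a rename when class_path is present next to init_args, so a subconfig containing just the init_args contents would need extra handling.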

We'll try this and report back. Only handling configs is acceptable at the moment since it's how we recommend that people interact with the program.

With #543, it is now possible to get the default YAML loader.

Thank you Mauricio. I won't be able to check this with my original script. Feel free to close the issue if the current implementation is good enough.

I didn't expose the loader particularly for this issue. But if you aren't going to look into this further, then yes I guess we can close the issue.