ConfigValueError when training smaller T5 model according to README

Question

FFFiend opened this issue a year ago · comments

The error:

ConfigValueError: Unions of containers are not supported:
generation_config: Union[str, Path, GenerationConfig]
    full_key: 
    object_type=None

How to reproduce: run the train command in README

Jesse Mu · Answer 1 · Fri Jun 02 2023 11:44:21 GMT+0800 (China Standard Time)

Can you give me your omegaconf and hydra versions, and if they are older than the ones below, update them to at least the versions I have?

Python 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import omegaconf
>>> omegaconf.__version__
'2.3.0'
>>> import hydra
>>> hydra.__version__
'1.2.0'
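
The same check can be scripted so that everyone reports versions in the same way. This is a minimal sketch using only the standard library; the helper name `installed_versions` is made up for illustration, and the distribution names are the ones pip uses for these packages (for example `hydra-core` rather than `hydra`).

```python
from importlib import metadata


def installed_versions(names):
    """Return {distribution name: installed version, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    packages = ["omegaconf", "hydra-core", "accelerate", "transformers"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the same interpreter that launches training, so the result reflects the environment that actually raises the error.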

Owais Zahid · Answer 2 · Sat Jun 03 2023 10:59:08 GMT+0800 (China Standard Time)

Yep I have those exact versions.

Jesse Mu · Answer 3 · Wed Jun 07 2023 05:18:59 GMT+0800 (China Standard Time)

Hm, I'm unable to reproduce this error. I created a conda environment and cloned the repository from scratch and things worked fine. Specifically:

$ conda create --name gist-test python=3.10
...
$ conda activate gist-test
$ git clone https://github.com/jayelm/gisting
...
$ cd gisting
$ mkdir exp .cache
$ pip install -r requirements.txt
...
Successfully installed GitPython-3.1.31 MarkupSafe-2.1.3 absl-py-1.4.0 accelerate-0.18.0 aiohttp-3.8.4 aiosignal-1.3.1 antlr4-python3-runtime-4.9.3 async-timeout-4.0.2 attrs-23.1.0 certifi-2023.5.7 charset-normalizer-3.1.0 click-8.1.3 cmake-3.26.3 datasets-2.10.0 deepspeed-0.8.3 dill-0.3.6 docker-pycreds-0.4.0 evaluate-0.3.0 filelock-3.12.0 fire-0.5.0 frozenlist-1.3.3 fsspec-2023.5.0 gitdb-4.0.10 hjson-3.1.0 huggingface-hub-0.15.1 hydra-core-1.2.0 idna-3.4 jinja2-3.1.2 joblib-1.2.0 lit-16.0.5.post0 mpmath-1.3.0 multidict-6.0.4 multiprocess-0.70.14 networkx-3.1 ninja-1.11.1 nltk-3.6.2 numpy-1.21.2 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 omegaconf-2.3.0 openai-0.27.2 packaging-23.1 pandas-2.0.2 pathtools-0.1.2 promise-2.3 protobuf-4.23.2 psutil-5.9.5 py-cpuinfo-9.0.0 pyarrow-12.0.0 pydantic-1.10.8 python-dateutil-2.8.2 pytz-2023.3 pyyaml-6.0 regex-2023.6.3 requests-2.31.0 responses-0.18.0 rouge_score-0.1.2 sentencepiece-0.1.98 sentry-sdk-1.25.0 setproctitle-1.3.2 shortuuid-1.0.11 six-1.16.0 smmap-5.0.0 sympy-1.12 termcolor-2.3.0 tokenizers-0.13.3 torch-2.0.0 tqdm-4.65.0 transformers-4.28.0.dev0 triton-2.0.0 typing-extensions-4.6.3 tzdata-2023.3 urllib3-2.0.2 wandb-0.13.4 xxhash-3.2.0 yarl-1.9.2

and then

python -m src.train training.gist.num_gist_tokens=2 training.gist.condition=gist wandb.tag=yourtaghere

works fine and starts training.

You might double check the versions listed above ^ and whether there are any mismatches. The error seems to be an omegaconf error so I'm still a bit suspicious there's a version mismatch somewhere.

wu · Answer 4 · Fri Jul 07 2023 01:11:53 GMT+0800 (China Standard Time)

I'm hitting the same error as you. Have you solved it yet?

wu · Answer 5 · Fri Jul 07 2023 10:29:29 GMT+0800 (China Standard Time)

I have the same problem. The packages you mentioned above are exactly versions 2.3.0 and 1.2.0 in my environment, but I still can't run the code.

Jesse Mu · Answer 6 · Fri Jul 07 2023 10:34:22 GMT+0800 (China Standard Time)

By this do you mean you created a new conda environment using python 3.10 and the steps outlined in the quoted post (conda create --name gist-test python=3.10), installed the requirements from requirements.txt, and still ran into this error?

Jesse Mu · Answer 7 · Fri Jul 07 2023 10:37:40 GMT+0800 (China Standard Time)

As a workaround you may be able to set

generation_config: Optional[str] = None

here: https://github.com/jayelm/gisting/blob/main/src/arguments.py#L137-L138

to change the type of generation_config and see if that satisfies the omegaconf typechecker.

wu · Answer 8 · Fri Jul 07 2023 10:39:51 GMT+0800 (China Standard Time)

Yes, exactly!

wu · Answer 9 · Fri Jul 07 2023 10:43:14 GMT+0800 (China Standard Time)

Sorry, I'm not sure how to change it. Could you please give me more specific instructions? Thanks a lot!

Jesse Mu · Answer 10 · Fri Jul 07 2023 10:44:29 GMT+0800 (China Standard Time)

Replace the lines linked above with:

class GistSeq2SeqTrainingArguments(GistTrainingArguments, Seq2SeqTrainingArguments):
    generation_config: Optional[str] = None

and let me know if that works.
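
To illustrate what this replacement does, here is a minimal, standard-library-only sketch. `ParentArguments` and `PatchedArguments` are hypothetical stand-ins for the real classes, `dict` stands in for `GenerationConfig`, and the sketch assumes the subclass is decorated with `@dataclass` (without the decorator the redeclared field would not be registered). It shows only that redeclaring a field in a dataclass subclass replaces the inherited union type; it does not run omegaconf.

```python
from dataclasses import dataclass, fields
from pathlib import Path
from typing import Optional, Union


@dataclass
class ParentArguments:
    # Hypothetical stand-in for the parent class: a union that includes containers.
    generation_config: Optional[Union[str, Path, dict]] = None


@dataclass
class PatchedArguments(ParentArguments):
    # Redeclaring the field narrows its declared type to a plain optional string.
    generation_config: Optional[str] = None


def declared_type(cls, field_name):
    """Return the declared type of a dataclass field."""
    for field in fields(cls):
        if field.name == field_name:
            return field.type
    raise KeyError(field_name)


if __name__ == "__main__":
    print(declared_type(ParentArguments, "generation_config"))
    print(declared_type(PatchedArguments, "generation_config"))
```

The default stays `None`, so code that never sets `generation_config` should behave as before; only the declared type seen by a type checker changes.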

wu · Answer 11 · Fri Jul 07 2023 10:45:47 GMT+0800 (China Standard Time)

Do you mean I should add "generation_config: Optional[str] = None" to the class you mentioned (either above the pass line or in place of it)?

wu · Answer 12 · Fri Jul 07 2023 10:48:46 GMT+0800 (China Standard Time)

OK, I replaced the line, but it doesn't work; a different error occurred.

Jesse Mu · Answer 13 · Fri Jul 07 2023 10:49:38 GMT+0800 (China Standard Time)

Can you be more specific?

wu · Answer 14 · Fri Jul 07 2023 10:50:48 GMT+0800 (China Standard Time)

The error occurred at line 96 of the file /site-packages/hydra/core/override_parser/overrides_parser.py.
Here's the info:
Exception has occurred: OverrideParseException (note: full exception trace is shown but execution is paused at: _run_module_as_main)
mismatched input '=' expecting
See https://hydra.cc/docs/next/advanced/override_grammar/basic for details
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/grammar/gen/OverrideParser.py", line 276, in override
self.match(OverrideParser.EOF)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/antlr4/Parser.py", line 126, in match
t = self._errHandler.recoverInline(self)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/antlr4/error/ErrorStrategy.py", line 407, in recoverInline
raise InputMismatchException(recognizer)
antlr4.error.Errors.InputMismatchException: None

During handling of the above exception, another exception occurred:

File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/core/override_parser/overrides_parser.py", line 82, in parse_overrides
parsed = self.parse_rule(override, "override")
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/core/override_parser/overrides_parser.py", line 66, in parse_rule
tree = rule()
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/grammar/gen/OverrideParser.py", line 279, in override
self._errHandler.reportError(self, re)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/antlr4/error/ErrorStrategy.py", line 128, in reportError
self.reportInputMismatch(recognizer, e)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/antlr4/error/ErrorStrategy.py", line 275, in reportInputMismatch
recognizer.notifyErrorListeners(msg, e.offendingToken, e)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/antlr4/Parser.py", line 322, in notifyErrorListeners
listener.syntaxError(self, offendingToken, line, column, msg, e)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/antlr4/error/ErrorListener.py", line 60, in syntaxError
delegate.syntaxError(recognizer, offendingSymbol, line, column, msg, e)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/core/override_parser/overrides_visitor.py", line 372, in syntaxError
raise HydraException(msg) from e
hydra.errors.HydraException: mismatched input '=' expecting

During handling of the above exception, another exception occurred:

File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/core/override_parser/overrides_parser.py", line 96, in parse_overrides
raise OverrideParseException(
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/config_loader_impl.py", line 233, in _load_configuration_impl
parsed_overrides = parser.parse_overrides(overrides=overrides)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/config_loader_impl.py", line 141, in load_configuration
return self._load_configuration_impl(
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 594, in compose_config
cfg = self.config_loader.load_configuration(
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 105, in run
cfg = self.compose_config(
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/utils.py", line 453, in
lambda: hydra.run(
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/utils.py", line 216, in run_and_report
raise ex
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/utils.py", line 216, in run_and_report
raise ex
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/utils.py", line 452, in _run_app
run_and_report(
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/utils.py", line 389, in _run_hydra
_run_app(
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/main.py", line 90, in decorated_main
_run_hydra(
File "/data/wupf/gisting/src/train.py", line 360, in
main()
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/runpy.py", line 197, in _run_module_as_main (Current frame)
return _run_code(code, main_globals, None,
hydra.errors.OverrideParseException: mismatched input '=' expecting
See https://hydra.cc/docs/next/advanced/override_grammar/basic for details

Jesse Mu · Answer 15 · Fri Jul 07 2023 10:51:45 GMT+0800 (China Standard Time)

Can you give the full command you ran to produce this error? This seems to be a syntax issue with how arguments were specified in the CLI.

wu · Answer 16 · Fri Jul 07 2023 11:01:21 GMT+0800 (China Standard Time)

Thanks for your reply. The syntax error I mentioned above happens when I use debug mode in VS Code. The configuration in launch.json is written as:

wu · Answer 17 · Fri Jul 07 2023 11:01:29 GMT+0800 (China Standard Time)

    {
        "name": "gist_official",
        "type": "python",
        "python": "/data/wupf/anaconda3/envs/gist/bin/python",
        "request": "launch",
        "module": "src.train",
        "console": "integratedTerminal",
        "justMyCode": false,
        "args": ["training.gist.num_gist_tokens=2 training.gist.condition=gist"],
        "env": {"CUDA_VISIBLE_DEVICES": "3"},
        "cwd": "/data/wupf/gisting"
    },

Jesse Mu · Answer 18 · Fri Jul 07 2023 11:03:19 GMT+0800 (China Standard Time)

Does splitting up the two args into separate strings:

        "args": ["training.gist.num_gist_tokens=2", "training.gist.condition=gist"],

help? Or even just removing the args entirely for now to see if it runs.
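
A small sketch of why the two forms differ. A shell splits the command line on whitespace before Python sees it, whereas each string in a launch.json "args" list is passed through as a single argument. The snippet uses the standard-library `shlex` module to mimic the shell's splitting; it does not run Hydra's override parser, so treat it as an illustration that is consistent with the "mismatched input '='" message rather than a confirmed diagnosis.

```python
import shlex

command_line = "training.gist.num_gist_tokens=2 training.gist.condition=gist"

# What a POSIX shell hands to the program: one argument per override.
shell_args = shlex.split(command_line)

# What a one-element "args" list hands to the program: a single argument
# that contains a space and two "=" signs.
single_string_args = [command_line]

if __name__ == "__main__":
    print(len(shell_args), shell_args)
    print(len(single_string_args), single_string_args)
    print("'=' signs in the single argument:", single_string_args[0].count("="))
```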

wu · Answer 19 · Fri Jul 07 2023 11:03:39 GMT+0800 (China Standard Time)

Then I ran the command directly in the shell:

python -m src.train training.gist.num_gist_tokens=2 training.gist.condition=gist

and another error occurred that confused me:

Error executing job with overrides: ['training.gist.num_gist_tokens=2', 'training.gist.condition=gist']
Traceback (most recent call last):
File "/data/wupf/gisting/src/train.py", line 61, in main
args: Arguments = global_setup(args)
File "/data/wupf/gisting/src/arguments.py", line 335, in global_setup
args = OmegaConf.to_object(args)
ImportError: Using the Trainer with PyTorch requires accelerate>=0.20.1: Please run pip install transformers[torch] or pip install accelerate -U

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

wu · Answer 20 · Fri Jul 07 2023 11:05:09 GMT+0800 (China Standard Time)

I'm very confused, because the newest version of accelerate seems to be 0.19.0. When I run pip install accelerate -U, the package installed is exactly 0.19.0.

wu · Answer 21 · Fri Jul 07 2023 11:06:39 GMT+0800 (China Standard Time)

I did this and it reported the error: "Using the Trainer with PyTorch requires accelerate>=0.20.1: Please run pip install transformers[torch] or pip install accelerate -U"

Jesse Mu · Answer 22 · Fri Jul 07 2023 11:07:47 GMT+0800 (China Standard Time)

I still suspect there is some sort of version mismatch between the env you're using and the one specified in requirements.txt, as requirements.txt specifies accelerate version 0.18 and I'm able to run that env with no issues (and this is also likely the cause of the config error)—maybe double check the transformers version you're installing is exactly the commit pinned in requirements.txt?
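
One way to look for such a mismatch is to compare the `name==version` pins in requirements.txt against what is installed. The sketch below is a rough illustration, not a replacement for pip's own resolver: the helper names `parse_pins` and `find_mismatches` are made up, only exact `==` pins are handled, and lines pinned to a git commit (such as the transformers entry) are skipped and need to be checked by hand.

```python
def parse_pins(requirements_text):
    """Collect `name==version` pins; comments and URL-based lines are skipped."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if "==" in line and "://" not in line and "@" not in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {name: (pinned, installed)} where they differ or the package is missing."""
    return {
        name: (pinned, installed.get(name))
        for name, pinned in pins.items()
        if installed.get(name) != pinned
    }


if __name__ == "__main__":
    # Sample values taken from this thread: 0.18.0 is pinned, 0.19.0 was installed.
    sample_requirements = "accelerate==0.18.0\nomegaconf==2.3.0\n"
    sample_installed = {"accelerate": "0.19.0", "omegaconf": "2.3.0"}
    print(find_mismatches(parse_pins(sample_requirements), sample_installed))
```

In practice the `installed` mapping could be built with `importlib.metadata.version` for each pinned name, run from the same environment that launches training.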

wu · Answer 23 · Fri Jul 07 2023 11:10:37 GMT+0800 (China Standard Time)

Luckily, I reinstalled everything from scratch, and it seems to have fixed everything! Thanks a lot for your support. The line "generation_config: Optional[str] = None" is definitely necessary!

wu · Answer 24 · Fri Jul 07 2023 11:13:52 GMT+0800 (China Standard Time)

Wonderful work, the intuition of "change the attention to allow the model to learn how to compress the prompt sentence" is simple and useless!!!

wu · Answer 25 · Fri Jul 07 2023 11:19:10 GMT+0800 (China Standard Time)

Sorry, I mean useful!! Not useless...

Jesse Mu · Answer 26 · Fri Jul 07 2023 12:56:45 GMT+0800 (China Standard Time)

Glad to hear it's working, and thanks!!

I think this workaround (generation_config: Optional[str] = None) should fix OP's issue as well, so I'll close this issue.