Error in BART Monolingual Pre-training.

Question

Ab26 opened this issue 2 years ago

I am getting the following error while training on the monolingual (Hindi) corpus. I successfully trained the tokenizer on the same corpus using create_autotokenizer.sh.

Error Logs:
Shuffling corpus!
Keyword arguments {'sample': False, 'nbest': 64, 'alpha_or_dropout': 0.1} not recognized.
[The line above appears 62 times in total in the original log.]
Saving the model
Loading from checkpoint
Traceback (most recent call last):
  File "pretrain_nmt.py", line 989, in <module>
    run_demo()
  File "pretrain_nmt.py", line 986, in run_demo
    mp.spawn(model_create_load_run_save, nprocs=args.gpus, args=(args,files,train_files,)) #
  File "/opt/conda/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/opt/conda/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/opt/conda/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/workspace/data/yanmtt/pretrain_nmt.py", line 530, in model_create_load_run_save
    mod_compute = model(input_ids=input_ids, attention_mask=input_masks, decoder_input_ids=decoder_input_ids, output_hidden_states=args.distillation, output_attentions=args.distillation, label_mask=label_mask if args.num_domains_for_domain_classifier > 1 else None) ## Run the model and get logits.
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'label_mask'

Raj Dabre · Answer 1 · Wed Nov 23 2022 09:18:16 GMT+0800 (China Standard Time)

Hi,
What's your training command?

Also, it looks like you are using Python 3.8, whereas I have specified 3.6.8. Please make sure you have followed the installation instructions properly, especially the part about using the version of transformers that I've provided with this repo. If you use the official version, things will not work.
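
For readers hitting the same TypeError: the message says that the forward() being called has no label_mask parameter, which is what you would expect if a transformers build other than the one shipped with the repo is being imported. Below is a minimal sketch of how one might check whether a callable accepts a given keyword. The two forward functions are hypothetical stand-ins written for this example, not the real model code; in practice you would pass the forward method of whatever model class your script instantiates.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # Otherwise the keyword is only accepted if the function takes **kwargs.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in for a forward() that lacks label_mask.
def stock_forward(input_ids=None, attention_mask=None, decoder_input_ids=None):
    return None


# Hypothetical stand-in for a forward() that has been extended with label_mask.
def patched_forward(input_ids=None, attention_mask=None,
                    decoder_input_ids=None, label_mask=None):
    return None


print(accepts_keyword(stock_forward, "label_mask"))    # False
print(accepts_keyword(patched_forward, "label_mask"))  # True
```

If the check returns False for the model you actually load, the script and the installed library are out of sync, which matches the error above.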

Abhishek Bhandari · Answer 2 · Wed Nov 23 2022 18:53:03 GMT+0800 (China Standard Time)

Hello sir,

My training command is:
python3 pretrain_nmt.py -n 1 -nr 0 -g 1 --model_path models_2/mbart_model --tokenizer_name_or_path mbart-hi \
--langs hi --mono_src train.hi --encoder_layers 1 --decoder_layers 1 --encoder_attention_heads=1 \
--decoder_attention_heads=1 --encoder_ffn_dim=128 --decoder_ffn_dim=128 --d_model=64 --shard_files

I am running this code on a DGX-2 server, with all packages at the versions specified in the requirements.txt file (the only exception being Python 3.8).

Raj Dabre · Answer 3 · Wed Nov 23 2022 19:01:01 GMT+0800 (China Standard Time)

The command looks ok but it's likely that you haven't uninstalled a previous installation of transformers.

I recommend running pip uninstall on every transformers version you see via "pip freeze", and then installing the transformers provided with yanmtt.
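
To confirm which copy of transformers Python will pick up after reinstalling, something like the following sketch may help. It uses only the standard library (importlib.metadata is available from Python 3.8 onward) and does not import the package itself. The helper name describe_package is an assumption made for this example, and what counts as the "right" version or path depends on your own setup.

```python
import importlib.metadata
import importlib.util


def describe_package(name):
    """Return the version and import location of `name`, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # Python cannot locate the package at all
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        # Importable, but without pip metadata (e.g. a standard-library module).
        version = None
    return {"name": name, "version": version, "location": spec.origin}


if __name__ == "__main__":
    # Prints None if transformers is not installed in this environment.
    print(describe_package("transformers"))
```

If the reported location is not where you installed the copy provided with the repo, another installation is probably still shadowing it.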

Abhishek Bhandari · Answer 4 · Thu Nov 24 2022 16:48:51 GMT+0800 (China Standard Time)

Thank you, sir. The requirements.txt didn't have the transformers version specification (I didn't see that), so it ran on the latest available version. After changing it to the required transformers version, it is working.

Raj Dabre · Answer 5 · Thu Nov 24 2022 17:22:27 GMT+0800 (China Standard Time)

Excellent.