princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

An error when max_seq_length is set too long

Madilynalisa opened this issue · comments

When I run evaluation.py, I get the error below.

It only happens when max_seq_length is set too high (more than 128); evaluation.py runs successfully when max_seq_length is set to 32.

Traceback (most recent call last):
File "evaluation.py", line 136, in <module>
main()
File "evaluation.py", line 49, in main
tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path)
File "/root/miniconda3/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 385, in from_pretrained
return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1768, in from_pretrained
return cls._from_pretrained(
File "/root/miniconda3/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1782, in _from_pretrained
slow_tokenizer = (cls.slow_tokenizer_class)._from_pretrained(
File "/root/miniconda3/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1841, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/transformers/models/roberta/tokenization_roberta.py", line 159, in __init__
super().__init__(
File "/root/miniconda3/lib/python3.8/site-packages/transformers/models/gpt2/tokenization_gpt2.py", line 176, in __init__
with open(vocab_file, encoding="utf-8") as vocab_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType
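For context, the traceback shows the crash happening while the tokenizer is being loaded, before any sequence is tokenized: AutoTokenizer resolves the checkpoint to the slow RobertaTokenizer, whose GPT-2-style BPE backend expects a vocab.json/merges.txt pair, but a BERT-style checkpoint such as chinese-roberta-wwm-ext ships only a vocab.txt, leaving vocab_file as None. A minimal sketch of that distinction (the helper name and string return values are hypothetical, not part of SimCSE or transformers):

```python
import os

def pick_tokenizer_class(model_dir):
    """Hypothetical helper: decide which tokenizer class fits a local checkpoint.

    RobertaTokenizer uses GPT-2-style BPE and needs vocab.json + merges.txt;
    a BERT-style checkpoint (e.g. chinese-roberta-wwm-ext) ships vocab.txt
    instead. Loading the latter through the RoBERTa path leaves
    vocab_file=None, which is exactly the TypeError in the traceback above.
    """
    if os.path.exists(os.path.join(model_dir, "vocab.json")):
        return "RobertaTokenizer"  # BPE vocabulary files present
    if os.path.exists(os.path.join(model_dir, "vocab.txt")):
        return "BertTokenizer"     # WordPiece vocabulary only
    raise FileNotFoundError("no tokenizer vocabulary found in " + model_dir)
```

If this is the cause, loading the tokenizer explicitly with BertTokenizer.from_pretrained(args.model_name_or_path) instead of AutoTokenizer would likely avoid the TypeError regardless of the max_seq_length value.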

These are the parameters used for the run in which evaluation.py fails:

python train.py \
--model_name_or_path model/chinese-roberta-wwm-ext \
--train_file data/abstract_less128.txt \
--output_dir result/abstract_less128 \
--num_train_epochs 1 \
--per_device_train_batch_size 64 \
--learning_rate 1e-5 \
--max_seq_length 128 \
--evaluation_strategy steps \
--metric_for_best_model stsb_spearman \
--load_best_model_at_end \
--eval_steps 125 \
--pooler_type cls \
--mlp_only_train \
--overwrite_output_dir \
--temp 0.05 \
--do_train \
--do_eval \
--fp16 \
--dropout 0.1 \
--neg_size 160 \
--dup_type bpe \
--dup_rate 0.3 \
--momentum 0.995
