timoschick / pet

This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"

Home Page: https://arxiv.org/abs/2001.07676

OSError: Model name 'clue/albert_chinese_tiny' was not found in tokenizers model name list

hjing100 opened this issue · comments

Running:
python3 cli.py \
--method pet \
--pattern_ids 0 \
--data_dir /home/123456/projects/prompt/pet-master/generate_data/ \
--model_type albert \
--model_name_or_path clue/albert_chinese_tiny \
--task_name porn-task \
--output_dir /home/123456/projects/prompt/pet-master/porn-output/ \
--do_train \
--do_eval \
--pet_per_gpu_train_batch_size 2 \
--pet_gradient_accumulation_steps 8 \
--pet_max_steps 250 \
--sc_per_gpu_unlabeled_batch_size 2 \
--sc_gradient_accumulation_steps 8 \
--sc_max_steps 100
This fails with:
OSError: Model name 'clue/albert_chinese_tiny' was not found in tokenizers model name list (albert-base-v1, albert-large-v1, albert-xlarge-v1, albert-xxlarge-v1, albert-base-v2, albert-large-v2, albert-xlarge-v2, albert-xxlarge-v2). We assumed 'clue/albert_chinese_tiny' was a path, a model identifier, or url to a directory containing vocabulary files named ['spiece.model'] but couldn't find such vocabulary files at this path or url.
Where can I see a list of all the models that can be used?
Can this be used to train and run models on Chinese corpora?
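For context, the OSError above is raised because AlbertTokenizer resolves vocabularies via a SentencePiece file (`spiece.model`), whereas the clue Chinese ALBERT checkpoints typically ship a BERT-style `vocab.txt` instead. A minimal sketch (the helper name is hypothetical, not part of pet or transformers) for checking which vocabulary files a locally downloaded checkpoint is missing before handing it to a tokenizer class:

```python
import os

# Hypothetical helper: AlbertTokenizer expects 'spiece.model'; Chinese ALBERT
# checkpoints such as clue/albert_chinese_tiny usually ship 'vocab.txt' instead,
# which is why loading them through the ALBERT tokenizer path fails.
def missing_tokenizer_files(model_dir, expected=("spiece.model",)):
    """Return the expected vocabulary files that are absent from model_dir."""
    return [f for f in expected
            if not os.path.exists(os.path.join(model_dir, f))]
```

If `missing_tokenizer_files(path)` returns `["spiece.model"]` but the directory contains `vocab.txt`, the checkpoint is WordPiece-based and needs a BERT-style tokenizer rather than the ALBERT one.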

I then switched to the model xlm-roberta-base and got a different error:
Traceback (most recent call last):
  File "cli.py", line 282, in <module>
    main()
  File "cli.py", line 263, in main
    no_distillation=args.no_distillation, seed=args.seed)
  File "/home/123456/projects/prompt/pet-master/pet/modeling.py", line 249, in train_pet
    save_unlabeled_logits=not no_distillation, seed=seed)
  File "/home/123456/projects/prompt/pet-master/pet/modeling.py", line 341, in train_pet_ensemble
    wrapper = init_model(model_config)
  File "/home/123456/projects/prompt/pet-master/pet/modeling.py", line 146, in init_model
    model = TransformerModelWrapper(config)
  File "/home/123456/projects/prompt/pet-master/pet/wrapper.py", line 151, in __init__
    cache_dir=config.cache_dir if config.cache_dir else None)  # type: PreTrainedTokenizer
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_utils_base.py", line 1140, in from_pretrained
    return cls._from_pretrained(*inputs, **kwargs)
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_utils_base.py", line 1287, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_roberta.py", line 171, in __init__
    **kwargs,
  File "/home/123456/.conda/envs/python36/lib/python3.6/site-packages/transformers/tokenization_gpt2.py", line 167, in __init__
    with open(vocab_file, encoding="utf-8") as vocab_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType
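The TypeError at the bottom of the trace means `vocab_file` is still `None` by the time the tokenizer tries to open it: the transformers version in use did not resolve the model name to any vocabulary file, and the unresolved value fell straight through to `open()`. A small defensive sketch (the function is hypothetical, not from transformers) that turns the opaque TypeError into an explicit, actionable error:

```python
def read_vocab(vocab_file):
    """Open a tokenizer vocabulary file, failing early with a clear message
    instead of the opaque 'expected str, bytes or os.PathLike' TypeError."""
    if vocab_file is None:
        raise ValueError(
            "vocab_file is None: the tokenizer could not resolve vocabulary "
            "files for this model name; check the model identifier or "
            "upgrade the transformers package"
        )
    with open(vocab_file, encoding="utf-8") as handle:
        return handle.read()
```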

Upgrading the transformers Python package to transformers==4.3.0 solved the problem.