pyvandenbussche / transformers-ner

Experiment on NER task using Huggingface state-of-the-art Transformers Natural Language Models library

Home Page: http://pyvandenbussche.info/2019/named-entity-recognition-with-pytorch-transformers/

KeyError when replacing the dataset

blayznik opened this issue

Hi, I tried to replace the dataset with my own. Because my labels are different, I now get a KeyError, even though I have already replaced the labels in the labels.txt file. Any idea how I can solve this error?

Thanks!


Traceback (most recent call last):
  File "./run_ner.py", line 518, in <module>
    main()
  File "./run_ner.py", line 445, in main
    train_dataset = load_and_cache_examples(args, tokenizer, labels, pad_token_label_id, mode="train")
  File "./run_ner.py", line 280, in load_and_cache_examples
    pad_token_label_id=pad_token_label_id
  File "/content/drive/My Drive/transformers-ner-master/utils_ner.py", line 121, in convert_examples_to_features
    label_ids.extend([label_map[label]] + [pad_token_label_id] * (len(word_tokens) - 1))
KeyError: 'B-TAXY'
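
For context, the failing line looks up each word's tag in a dictionary and pads the remaining sub-tokens. Below is a minimal sketch of that step, assuming label_map is built by enumerating the label list and that the padding id is -100; the function name align_label_ids and both assumptions are illustrative, not taken from the repository. It shows that any tag present in the data but absent from the label list raises exactly this kind of KeyError.

```python
def align_label_ids(words_tokens, word_labels, label_list, pad_token_label_id=-100):
    """Map one tag per word onto its sub-tokens (hypothetical helper).

    words_tokens: list of sub-token lists, one per word
    word_labels:  list of tag strings, one per word
    label_list:   tags known to the model (assumed to come from labels.txt)
    """
    # Assumption: the script builds its mapping the same way, tag -> index.
    label_map = {label: i for i, label in enumerate(label_list)}
    label_ids = []
    for word_tokens, label in zip(words_tokens, word_labels):
        # The first sub-token carries the real tag id; the rest get the padding id.
        # label_map[label] raises KeyError if the tag is not in label_list.
        label_ids.extend([label_map[label]] + [pad_token_label_id] * (len(word_tokens) - 1))
    return label_ids
```

With label_list set to ["O", "B-TAXY"] the call succeeds; with a list that lacks "B-TAXY" the lookup fails on the first word carrying that tag.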

Hi, did you specify the path to the labels.txt file using the command argument --labels?

Yes, I did.

python ./run_ner.py --data_dir ./data --model_type bert --model_name_or_path bert-base-cased --output_dir ./output --labels ./data/labels.txt --do_train --do_predict --max_seq_length 256 --overwrite_output_dir --overwrite_cache

OK. In run_ner.py, right after line 418 (labels = get_labels(args.labels)), can you print the labels and check that all of your labels are there?
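
A standalone way to run the same check is sketched below. It assumes the data files are CoNLL-style (one token per line, tag in the last whitespace-separated column, blank lines between sentences) and that labels.txt holds one tag per line; the helper names and the train.txt file name are hypothetical. Label lines are deliberately not stripped, so a tag with trailing whitespace in labels.txt shows up as missing.

```python
from pathlib import Path


def read_label_file(path):
    """Return the non-empty lines of the labels file exactly as written."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # No stripping here: "B-TAXY " (trailing space) must not match "B-TAXY".
    return [line for line in lines if line]


def labels_in_data(path):
    """Collect the tag (last column) of every token line in a data file."""
    found = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip sentence separators and document markers.
        if not line or line.startswith("-DOCSTART-"):
            continue
        found.add(line.split()[-1])
    return found


def missing_labels(data_path, labels_path):
    """Tags used in the data that do not appear in the labels file."""
    return sorted(labels_in_data(data_path) - set(read_label_file(labels_path)))
```

Calling missing_labels("./data/train.txt", "./data/labels.txt") and printing the result should give an empty list when every tag is covered; any tag it returns would trigger the KeyError during feature conversion.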

Hey, I've managed to solve the error already. Thanks for your help anyway!