wenhaofang / BertForChID

A repository for Idiom NER and Idiom Cloze, using BERT model and ChID dataset.

Introduction

This is a repository for Idiom NER and Idiom Cloze, with reference to ChID: A Large-scale Chinese IDiom Dataset for Cloze Test.

Data Process

  1. Download the ChID dataset into the data/chid folder from here,

    including the train_data.txt, dev_data.txt, and test_data.txt files.

  2. Download the bert-base-chinese model into the data/bert folder from here,

    including the config.json, vocab.txt, and pytorch_model.bin files.
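
Before training, it can help to confirm that both downloads landed in the expected folders. The following is a minimal sketch, not part of the repository: the helper name find_missing_files is hypothetical, and it assumes the data/chid and data/bert folders sit directly under the repository root. The file names are the ones listed in the two steps above.

```python
from pathlib import Path

# File names come from the download steps; the layout relative to the
# repository root is an assumption of this sketch.
EXPECTED_FILES = {
    "data/chid": ["train_data.txt", "dev_data.txt", "test_data.txt"],
    "data/bert": ["config.json", "vocab.txt", "pytorch_model.bin"],
}


def find_missing_files(root="."):
    """Return the expected file paths that do not exist under root."""
    root = Path(root)
    missing = []
    for folder, names in EXPECTED_FILES.items():
        for name in names:
            path = root / folder / name
            if not path.is_file():
                missing.append(str(path))
    return missing


if __name__ == "__main__":
    missing = find_missing_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected files are in place.")
```

Run it from the repository root; an empty result means both folders contain every file named above.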

Main Process

  • Task One: Idiom NER

    python main1.py --name NER

  • Task Two: Idiom Cloze

    python main2.py --name Cloze

You can modify the configuration through command-line parameters or by editing parser.py.
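
To illustrate how command-line parameters typically override defaults in a setup like this, here is a minimal argparse sketch. It is not the contents of the repository's parser.py: only the --name flag appears in the commands above, while --batch_size and --learning_rate, along with all default values, are hypothetical placeholders chosen for illustration.

```python
import argparse


def build_parser():
    """Build an illustrative parser; not the repository's actual parser.py."""
    parser = argparse.ArgumentParser(description="Configuration sketch")
    # --name is the only flag shown in the commands above.
    parser.add_argument("--name", type=str, default="NER", help="run name")
    # Hypothetical options, included only to show how defaults are overridden.
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--learning_rate", type=float, default=2e-5)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Any flag passed on the command line replaces its default, and flags left out keep the value defined in the parser, which is why editing the defaults in the parser file and passing flags are interchangeable ways to configure a run.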
