Tanh-wink / Chid_Bert_baseline

A BERT-based baseline for the Chinese idiom cloze test, implemented in PyTorch.

Chid_Bert_baseline

A BERT-based baseline for the Chinese idiom cloze test, implemented in PyTorch.

Chinese Idiom Reading Comprehension Competition

The official competition website

paper

The ChID dataset was introduced in the paper "ChID: A Large-scale Chinese IDiom Dataset for Cloze Test".

official baseline and paper code

ChID-Dataset

this baseline code

This baseline uses transformers and PyTorch to implement a BERT-based model for the Chinese idiom cloze test.
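
As a rough illustration of how a cloze item can be turned into model input, the sketch below builds one (candidate, masked passage) pair per candidate idiom. It is not taken from this repo's code: the `#idiom#` blank marker, the four-character mask width, and the helper name `build_choices` are all assumptions made for the example.

```python
MASK_TOKEN = "[MASK]"  # BERT's mask token
BLANK = "#idiom#"      # assumed blank marker; check the data files for the real one


def build_choices(passage, candidates, blank=BLANK, n_masks=4):
    """Return one (candidate, masked_passage) pair per candidate idiom.

    Only the first blank in the passage is masked. Passages with several
    blanks would need one call per blank.
    """
    if blank not in passage:
        raise ValueError("passage has no blank marker")
    # Most Chinese idioms are four characters long, so four mask tokens are used by default.
    masked = passage.replace(blank, MASK_TOKEN * n_masks, 1)
    return [(candidate, masked) for candidate in candidates]


if __name__ == "__main__":
    pairs = build_choices("他做事总是#idiom#，从不马虎。", ["一丝不苟", "画蛇添足"])
    for candidate, text in pairs:
        print(candidate, text)
```

Each pair can then be tokenized as a two-segment input and scored by a classifier head on top of BERT, with the highest-scoring candidate taken as the answer.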

requirements

python3.6
torch==1.1.0
transformers==2.8.0
scikit-learn==0.22.2.post1
pandas==1.0.3
tqdm==4.45.0
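
To check an environment against these pins, a small helper like the one below can compare them with the output of `pip freeze`. This is only a sketch: the helper names `parse_freeze` and `find_mismatches` are made up for the example, and the comparison is an exact string match, so a build such as a CUDA-suffixed torch version would be reported as a mismatch.

```python
# Pins copied from the requirements list above.
PINS = {
    "torch": "1.1.0",
    "transformers": "2.8.0",
    "scikit-learn": "0.22.2.post1",
    "pandas": "1.0.3",
    "tqdm": "4.45.0",
}


def parse_freeze(text):
    """Parse `pip freeze` output into {package_name: version}."""
    installed = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip comments and entries that are not plain name==version pins.
        if line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        installed[name.strip().lower()] = version.strip()
    return installed


def find_mismatches(installed, pins=PINS):
    """Return {package: (expected, found)} for every pin that is not met.

    A package missing from `installed` is reported with found=None.
    """
    mismatches = {}
    for name, expected in pins.items():
        found = installed.get(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    import subprocess
    import sys

    frozen = subprocess.check_output([sys.executable, "-m", "pip", "freeze"])
    for name, (expected, found) in find_mismatches(parse_freeze(frozen.decode())).items():
        print("{}: expected {}, found {}".format(name, expected, found))
```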

dataset Download

ChID dataset download
Save the ChID data into ./data. You may need a VPN to download it.

pretrained model

download
For this baseline, we use chinese_wwm_pytorch as the pretrained model. Save the model files into ./pretrained_models
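
Once the files are in place, a sketch like the following can confirm the directory is loadable before training starts. The file names in `REQUIRED_FILES` are what transformers looks for when loading BERT from a local directory; the subdirectory name `chinese_wwm_pytorch` and the helpers `missing_model_files` and `load_bert` are assumptions for illustration, not part of this repo. If the downloaded archive names its config file differently (for example `bert_config.json`), it may need to be renamed to `config.json`.

```python
import os

# File names transformers expects in a local BERT model directory.
REQUIRED_FILES = ("config.json", "pytorch_model.bin", "vocab.txt")


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in model_dir."""
    return [
        name for name in required
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


def load_bert(model_dir):
    """Load tokenizer and encoder from a local directory, failing early if files are missing."""
    missing = missing_model_files(model_dir)
    if missing:
        raise FileNotFoundError(
            "missing in {}: {}".format(model_dir, ", ".join(missing))
        )
    # Imported here so the file check above works without transformers installed.
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_dir)
    model = BertModel.from_pretrained(model_dir)
    return tokenizer, model


if __name__ == "__main__":
    # Assumed location: the unpacked model under ./pretrained_models.
    tokenizer, model = load_bert("./pretrained_models/chinese_wwm_pytorch")
    print(type(tokenizer).__name__, type(model).__name__)
```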
