LightR0 / MultiHeadJointEntityRelationExtraction_simple

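Entity and relation extraction on the dataset from a Baidu competition. MultiHeadJointEntityRelationExtraction is implemented in PyTorch with BERT, ALBERT, and GRU encoders, plus adversarial training. The model is deployed with Flask and the Neo4j graph database.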

MultiHeadJointEntityRelationExtraction

PyTorch code for MultiHead: "Joint entity recognition and relation extraction as a multi-head selection problem". I use a Chinese dataset to perform Chinese entity and relation extraction, and ALBERT as the feature extractor to improve results. For a description of the model and experiments, see the paper: https://arxiv.org/abs/1804.07847.
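
In the multi-head selection formulation, each token may select several (head token, relation) pairs, and a pair is kept when its sigmoid score exceeds a threshold. The decoding step can be sketched without any deep-learning library as below. The function name `decode_triples`, the score layout `scores[i][j][r]`, the relation names, and the 0.5 threshold are assumptions made for illustration; they are not taken from this repository's code.

```python
import math


def sigmoid(x):
    """Logistic function mapping a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_triples(tokens, relations, scores, threshold=0.5):
    """Return (token_i, relation, head_j) for every score above threshold.

    scores[i][j][r] is the raw score that token i selects token j as its
    head with relation r (layout assumed for illustration).
    """
    triples = []
    for i, tok in enumerate(tokens):
        for j, head in enumerate(tokens):
            for r, rel in enumerate(relations):
                if sigmoid(scores[i][j][r]) > threshold:
                    triples.append((tok, rel, head))
    return triples


tokens = ["Smith", "lives", "in", "California"]
relations = ["works_for", "lives_in"]
# Start from strongly negative scores, then switch on one selection.
scores = [[[-10.0 for _ in relations] for _ in tokens] for _ in tokens]
scores[0][3][1] = 4.0  # "Smith" selects "California" with lives_in
print(decode_triples(tokens, relations, scores))
```

Because every (token, head, relation) cell is scored independently, one token can take part in several relations at once, which is the point of the multi-head formulation.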

Requirements

  • torch==1.4.0+cu100
  • cuda=10.0
  • cudnn=7603
  • pytorch-crf==0.7.2
  • transformers==4.3.3
  • tqdm==4.59.0
  • seqeval==0.0.10
  • tensorboard
  • flask==1.1.2

Framework

I use Flask as the back-end framework and Neo4j as the graph database.

Dataset

Baidu CCKS2019 Competition

The train_data.json, dev_data.json, and predict.json files mentioned in the project have been uploaded in the compressed package data.zip in the data/ directory.

(The dataset at the download link differs slightly from the one I used in the project, so please use data/data.zip.) Download link: https://ai.baidu.com/broad/download?dataset=dureader
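
A minimal sketch of reading such a file, assuming one JSON object per line with a `text` field and an `spo_list` of subject/predicate/object entries. That layout is an assumption about the data and should be checked against the files in data/data.zip; the helper name `read_examples` and the sample record are made up for illustration.

```python
import json


def read_examples(lines):
    """Yield (text, [(subject, predicate, object), ...]) from JSON lines.

    Assumes each line holds one JSON object with "text" and "spo_list".
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        triples = [
            (spo["subject"], spo["predicate"], spo["object"])
            for spo in record.get("spo_list", [])
        ]
        yield record["text"], triples


# Illustrative record only, not copied from the dataset.
sample = [
    '{"text": "鲁迅出生于绍兴", "spo_list": '
    '[{"subject": "鲁迅", "predicate": "出生地", "object": "绍兴"}]}'
]
for text, triples in read_examples(sample):
    print(text, triples)
```

To read a real file, the same function could be given an open file handle, for example `read_examples(open("train_data.json", encoding="utf-8"))`.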

Code Description

  • checkpoint  periodically saved model checkpoints
  • data  datasets
  • data_loader  data-processing code
  • deploy  demo code
  • deploy_flask  model deployment with the Flask web framework
  • doc  description files
  • mains  training code
  • models  saved model files
  • modules  model code
  • pretrained  BERT and ALBERT pretrained files
  • record  TensorBoard files
  • test  test files used during development
  • utils  utility code

Train

cd mains
python3 trainer_std.py -encode=albert
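
The `-encode=albert` flag selects the encoder. The following is a sketch of how a single-dash option like this could be parsed with `argparse`. The choice list and the default are assumptions based on the encoders named in this README, and `build_parser` is a hypothetical helper; none of this is copied from trainer_std.py.

```python
import argparse


def build_parser():
    """Build a parser with one encoder-selection option (illustrative)."""
    parser = argparse.ArgumentParser(description="Encoder selection (sketch)")
    parser.add_argument(
        "-encode",
        choices=["bert", "albert", "gru"],  # assumed set of encoders
        default="albert",                   # assumed default
        help="feature extractor to use",
    )
    return parser


args = build_parser().parse_args(["-encode=albert"])
print(args.encode)
```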

Demo

cd deploy
python3 demo.py -encode=albert
(Please use albert33m-p0.77f0.77n2.98r2.17.pth or 43m-p0.89f0.91n0.94r0.43.pth.)

If you want to use the GRU model, please train it yourself, because I can no longer find the model file. In any case, ALBERT performs better than GRU.

Deployment

cd deploy_flask
python3 manage.py

Results:

  • Input text; the output is the relation extraction results.
  • Input the name of an entity; the output is all the knowledge related to that entity.
  • Input the name of a relationship; the output is all the knowledge with that relationship.
