yul091 / DENRL

Distantly-Supervised Joint Entity and Relation Extraction with Noise-Robust Learning

DENRL

Codebase for the ACL 2024 Findings paper: "Distantly-Supervised Joint Extraction with Noise-Robust Learning" (PDF)

Quick Start

  • Requires Python 3.8+

  • Install requirements: pip install -r requirements.txt

  • Train and evaluate a joint extraction model with noise reduction training (with instance selection)

MODEL_PATH=gpt2-medium
TRAIN_FILE="path/to/your/data"
VAL_FILE="path/to/your/data"
OUTPUT_DIR="results"

python run_jointmodel.py \
    --model_name_or_path "$MODEL_PATH" --classifier_type "crf" \
    --train_file "$TRAIN_FILE" --validation_file "$VAL_FILE" \
    --output_dir "$OUTPUT_DIR" --do_eval --do_train \
    --evaluation_strategy epoch --load_best_model_at_end \
    --metric_for_best_model eval_f1 --greater_is_better True \
    --per_device_train_batch_size 5 --per_device_eval_batch_size 20 \
    --gradient_accumulation_steps 16 --overwrite_cache \
    --use_negative_sampling --sample_rate 0.1 --num_train_epochs 100 \
    --beta 1.0 --alpha 0.5 --boot_start_epoch 5 --threshold 0.5

  • Train and evaluate a joint extraction model with standard training (without instance selection)

MODEL_PATH=gpt2-medium
TRAIN_FILE="path/to/your/data"
VAL_FILE="path/to/your/data"
OUTPUT_DIR="results"

python run_jointmodel.py \
    --model_name_or_path "$MODEL_PATH" --classifier_type "crf" \
    --train_file "$TRAIN_FILE" --validation_file "$VAL_FILE" \
    --output_dir "$OUTPUT_DIR" --do_eval --do_train \
    --evaluation_strategy epoch --load_best_model_at_end \
    --metric_for_best_model eval_f1 --greater_is_better True \
    --per_device_train_batch_size 5 --per_device_eval_batch_size 20 \
    --gradient_accumulation_steps 16 --overwrite_cache \
    --use_negative_sampling --sample_rate 0.1 --num_train_epochs 100 \
    --beta 1.0 --alpha 0.5 --boot_start_epoch 5 --threshold 0.5 --baseline
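
The two commands above differ only in the trailing --baseline flag. To run both from one script, the shared arguments can be kept in a single place. The following is a minimal sketch, not part of the repository: build_command and COMMON_ARGS are hypothetical names introduced here, and every flag is copied from the commands above on the assumption that run_jointmodel.py accepts them in any order.

```python
import shlex

# Flags shared by both Quick Start commands, copied verbatim from the README.
COMMON_ARGS = [
    "--classifier_type", "crf",
    "--do_eval", "--do_train",
    "--evaluation_strategy", "epoch", "--load_best_model_at_end",
    "--metric_for_best_model", "eval_f1", "--greater_is_better", "True",
    "--per_device_train_batch_size", "5", "--per_device_eval_batch_size", "20",
    "--gradient_accumulation_steps", "16", "--overwrite_cache",
    "--use_negative_sampling", "--sample_rate", "0.1", "--num_train_epochs", "100",
    "--beta", "1.0", "--alpha", "0.5", "--boot_start_epoch", "5", "--threshold", "0.5",
]


def build_command(model_path, train_file, val_file, output_dir, baseline=False):
    """Hypothetical helper: assemble the argument list for run_jointmodel.py."""
    cmd = [
        "python", "run_jointmodel.py",
        "--model_name_or_path", model_path,
        "--train_file", train_file, "--validation_file", val_file,
        "--output_dir", output_dir,
    ] + COMMON_ARGS
    if baseline:
        # Standard training (without instance selection), as in the second command.
        cmd.append("--baseline")
    return cmd


if __name__ == "__main__":
    # Print both variants as copy-pasteable shell commands.
    for use_baseline in (False, True):
        cmd = build_command(
            "gpt2-medium", "path/to/your/data", "path/to/your/data", "results",
            baseline=use_baseline,
        )
        print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run(cmd, check=True), which avoids shell quoting problems when a data path contains spaces.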

Citation

If you find this repo useful, please cite our paper:

@inproceedings{li2023distantly,
  title={Distantly-Supervised Joint Extraction with Noise-Robust Learning},
  author={Li, Yufei and Yu, Xiao and Guo, Yanghong and Liu, Yanchi and Chen, Haifeng and Liu, Cong},
  booktitle={Findings of the Association for Computational Linguistics: ACL 2024},
  year={2024},
}
