beaupranisaa / truncation-vs-extractive-summarization


Evaluating Truncation and Extractive Approaches in Text Summarization

Create dataset

python3 shuffle_dataset.py --dataset 'xsum' --orig_source_length 512 --max_target_length 36 --seed 0
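The README does not say which unit shuffle_dataset.py shuffles or how it uses the seed. As a minimal sketch, assuming sentence-level shuffling within one document, a fixed seed makes the order reproducible across runs. The function name shuffle_sentences is hypothetical and is not taken from the repository.

```python
import random


def shuffle_sentences(sentences, seed=0):
    """Return a reproducibly shuffled copy of a list of sentences.

    Assumption: the unit being shuffled is the sentence; the repository
    may shuffle something else (for example whole examples).
    """
    rng = random.Random(seed)   # local RNG, leaves the global random state alone
    shuffled = list(sentences)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    doc = ["First sentence.", "Second sentence.", "Third sentence."]
    print(shuffle_sentences(doc, seed=0))
```

Calling the function twice with the same seed gives the same order, which is what makes a shuffled and an unshuffled run comparable.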

Extract document

Unshuffled dataset

python3 src/extraction.py --approach 'head+tail0.5'

Shuffled dataset

python3 src/extraction.py --approach 'head+tail0.5' --shuffle True --seed 0
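The approach name 'head+tail0.5' is not explained in this README. One plausible reading, sketched below as an assumption only, is that half of the length budget is taken from the start of the document and the rest from the end. The function head_tail_truncate and its parameters are hypothetical and do not come from src/extraction.py.

```python
def head_tail_truncate(tokens, max_length, head_ratio=0.5):
    """Keep the first and last parts of a token list within max_length.

    Assumption: head_ratio is the share of the budget given to the head;
    the repository may define 'head+tail0.5' differently.
    """
    if max_length <= 0:
        return []
    if len(tokens) <= max_length:
        return list(tokens)  # already short enough, nothing to cut

    n_head = int(max_length * head_ratio)  # tokens kept from the start
    n_tail = max_length - n_head           # remaining budget goes to the end

    head = list(tokens[:n_head])
    # Index from the front so that n_tail == 0 yields an empty tail.
    tail = list(tokens[len(tokens) - n_tail:])
    return head + tail


if __name__ == "__main__":
    words = "a b c d e f g h i j".split()
    print(head_tail_truncate(words, max_length=4))
```

With a real tokenizer, the same slicing would apply to token ids rather than whitespace-separated words.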

Combine the extracted documents into train_set.csv, val_set.csv and test.csv

Unshuffled dataset

python3 helper/helper.py --approach 'head+tail0.5'

Shuffled dataset

python3 helper/helper.py --approach 'head+tail0.5' --shuffle True --seed 0
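How helper/helper.py builds the three CSV files is not described here. The sketch below shows one way such a combining step could look, assuming each split is a list of records with "document" and "summary" fields. Those column names and the functions write_split and combine_splits are hypothetical; only the output file names come from this README.

```python
import csv
from pathlib import Path

# Output file names as listed in this README.
SPLIT_FILES = {"train": "train_set.csv", "val": "val_set.csv", "test": "test.csv"}


def write_split(rows, out_path, fieldnames=("document", "summary")):
    """Write a list of dict records to one CSV file (hypothetical columns)."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(fieldnames))
        writer.writeheader()
        writer.writerows(rows)
    return out_path


def combine_splits(splits, out_dir):
    """Write each split ('train', 'val', 'test') to its CSV file in out_dir."""
    return {
        name: write_split(rows, Path(out_dir) / SPLIT_FILES[name])
        for name, rows in splits.items()
    }


if __name__ == "__main__":
    demo = {"train": [{"document": "Extracted text.", "summary": "Short."}]}
    print(combine_splits(demo, "combined_demo"))
```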
