shunk031 / human-attention-map-for-text-classification

Reimplementation of the paper `Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words? (ACL2020)`


Human Attention for Text Classification

ACL2020 2020.acl-main.419 Code style: black Powered by AllenNLP

Re-implementation of the paper Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words? (ACL2020).

Install requirements

$ poetry install

Download and Split Yelp dataset

Download from Yelp.com

Split the dataset

  • The Yelp dataset is large, so it is split into subsets in advance.
    • The split writes tng.jsonl, val.jsonl, and tst.jsonl to the data directory.
$ allennlp split-dataset \
    --input-file data/yelp_academic_dataset_review.json \
    --output-dir data/ \
    --tng-ratio 0.8 \
    --val-ratio 0.1 \
    --tst_ratio 0.1
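
The split step above can be pictured with a minimal sketch. This is not the repository's `split-dataset` implementation; the function name `split_jsonl`, the fixed seed, and the decision to shuffle before splitting are assumptions made for illustration. It relies only on the Yelp review file storing one JSON object per line, and it reads the whole file into memory, so a streaming approach may be preferable for the full dataset.

```python
import random
from pathlib import Path


def split_jsonl(input_file, output_dir, tng_ratio=0.8, val_ratio=0.1, tst_ratio=0.1, seed=0):
    """Shuffle the lines of a JSON-lines file and write tng/val/tst subsets.

    Hypothetical helper: the names and the shuffling are assumptions,
    not the repository's actual code.
    """
    if abs(tng_ratio + val_ratio + tst_ratio - 1.0) > 1e-8:
        raise ValueError("ratios must sum to 1")

    # Each non-empty line is treated as one record.
    with open(input_file, encoding="utf-8") as f:
        lines = [line for line in f if line.strip()]

    # A seeded shuffle keeps the split reproducible.
    random.Random(seed).shuffle(lines)

    n_tng = int(len(lines) * tng_ratio)
    n_val = int(len(lines) * val_ratio)
    subsets = {
        "tng": lines[:n_tng],
        "val": lines[n_tng:n_tng + n_val],
        "tst": lines[n_tng + n_val:],  # remainder, so no record is dropped
    }

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for name, subset in subsets.items():
        with open(output_dir / f"{name}.jsonl", "w", encoding="utf-8") as f:
            f.writelines(line if line.endswith("\n") else line + "\n" for line in subset)

    # Return the subset sizes so the caller can check the split.
    return {name: len(subset) for name, subset in subsets.items()}
```

Calling `split_jsonl("data/yelp_academic_dataset_review.json", "data/")` would then mirror the 0.8 / 0.1 / 0.1 ratios passed on the command line above.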

Preprocess HAM dataset

$ allennlp preprocess-ham-dataset \
    --ham-dataset-dir data/ham-dataset/raw_data/ \
    --output-dir data/

Train RNN model

$ CUDA_VISIBLE_DEVICES=0 allennlp train config/base.jsonnet -s outputs -o '{"trainer": {"cuda_device": 0}}'
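
The `-o` argument in the command above is a JSON string of config overrides. When launching several runs from a script, it can be easier to build that string with `json.dumps` than to quote it by hand. The following is a small sketch under that assumption; `build_train_command` is a hypothetical helper, not part of this repository or of AllenNLP.

```python
import json
import shlex


def build_train_command(config_path, serialization_dir, cuda_device=0):
    """Build the argument list for `allennlp train` with a cuda_device override.

    Hypothetical helper for illustration only.
    """
    # Serialize the override as JSON, matching the -o argument shown above.
    overrides = json.dumps({"trainer": {"cuda_device": cuda_device}})
    return ["allennlp", "train", config_path, "-s", serialization_dir, "-o", overrides]


cmd = build_train_command("config/base.jsonnet", "outputs", cuda_device=0)
# shlex.join (Python 3.8+) renders the list as a shell-quoted command line.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run`, with `CUDA_VISIBLE_DEVICES` set through its `env` argument.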

Reference

  • Sen, Cansu, et al. "Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words?" Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.


Languages

Python 89.6%, Jsonnet 10.4%