nijianmo/recsys_justification

recsys_justification

This is the code for our EMNLP '19 paper:

Justifying recommendations using distantly-labeled reviews and fine-grained aspects, Jianmo Ni, Jiacheng Li, Julian McAuley, Empirical Methods in Natural Language Processing (EMNLP) 2019.

This repo has the following hierarchy:

recsys_justification
|---justification_classifier
|---reference2seq
|---acmlm

Newly released Amazon product review dataset.

We have released a new version of the Amazon review dataset, which includes more and newer reviews (i.e. reviews from 2014 to 2018). Feel free to explore the dataset and use it in your research!

justification classifier

This is the BERT model fine-tuned on the labeled justification data. You can train the model with run.sh and run inference on any unlabeled data with predict.sh, after changing the data loader in the Python file accordingly. We also provide a pre-trained model here:
- bert_config.json
- pytorch_model.bin
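
To use the provided checkpoint outside of predict.sh, something like the following may work. It is a minimal sketch, assuming the two files sit together in one local directory, that the classifier is a two-class `BertForSequenceClassification` from pytorch-pretrained-bert, and that it uses the `bert-base-uncased` vocabulary. None of these assumptions is confirmed by this README. The names `build_tokens`, `predict_label`, `model_dir`, and the maximum length of 128 are illustrative only.

```python
from typing import List

MAX_LEN = 128  # assumed maximum sequence length, not taken from the repo


def build_tokens(word_pieces: List[str], max_len: int = MAX_LEN) -> List[str]:
    """Wrap word pieces in [CLS] ... [SEP], truncating to max_len tokens."""
    body = word_pieces[: max_len - 2]  # leave room for the two special tokens
    return ["[CLS]"] + body + ["[SEP]"]


def predict_label(text: str, model_dir: str) -> int:
    """Return the argmax class index for one EDU (hypothetical helper)."""
    # Imported lazily so build_tokens can be used without these libraries.
    import torch
    from pytorch_pretrained_bert import BertForSequenceClassification, BertTokenizer

    # Assumption: the checkpoint uses the bert-base-uncased vocabulary.
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased", do_lower_case=True)
    # model_dir must contain bert_config.json and pytorch_model.bin.
    model = BertForSequenceClassification.from_pretrained(model_dir, num_labels=2)
    model.eval()

    tokens = build_tokens(tokenizer.tokenize(text))
    input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
    with torch.no_grad():
        logits = model(input_ids)  # logits of shape (1, 2) when no labels are passed
    return int(logits.argmax(dim=1).item())
```

Which class index corresponds to "justification" depends on how the labels were encoded during training, so check the data loader in the Python file before interpreting the result.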

reference2seq

This is the proposed reference2seq model. It contains files for data processing and model training/evaluation.

acmlm

This is the proposed aspect-conditional masked language model (acmlm).

Data

2000 labeled examples, with a binary label for each elementary discourse unit (EDU) in reviews. You can find them under justification_classifier.
Distantly labeled datasets derived from the Yelp and Amazon Clothing datasets. Each line of the json file includes an EDU from a review and the fine-grained aspects covered in it.
- Download Yelp
- Download Amazon Clothing
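
Since each line of the distantly labeled files is a separate JSON object, the files can be streamed one record at a time. The sketch below shows one way to do that in Python. The key names inside each record are not documented here, so the code counts the keys it finds rather than assuming a schema; `iter_json_lines` and `count_keys` are hypothetical helper names.

```python
import json
from collections import Counter
from typing import Dict, Iterable, Iterator


def iter_json_lines(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_keys(records: Iterable[Dict]) -> Counter:
    """Count how often each top-level key appears, to discover the schema."""
    counts = Counter()
    for record in records:
        counts.update(record.keys())
    return counts
```

To inspect a downloaded file, pass an open file handle to `iter_json_lines` and feed the result to `count_keys`; the keys it reports should reveal which fields hold the EDU text and the aspects.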

Requirements

PyTorch=0.4
pytorch-pretrained-bert

Please cite our paper if you find the data and code helpful, thanks!

@inproceedings{Ni2019RecsysJust,
  title={Justifying recommendations using distantly-labeled reviews and fine-grained aspects},
  author={Jianmo Ni and Jiacheng Li and Julian McAuley},
  booktitle={EMNLP},
  year={2019}
}
