- TensorFlow reimplementation of the ACL 2017 paper "An Unsupervised Neural Attention Model for Aspect Extraction" (pdf), written for practice.
- python=3.6
- numpy==1.16.2
- nltk==3.3
- tensorflow_gpu==1.8.0
- tqdm
- matplotlib
|
├── main.py # Main script for training & evaluation. Contains the hyperparameters.
├── model.py # Model
├── dataset.py # Batching
├── preprocess.py # Preprocesses the raw dataset and serializes it into a binary file.
├── utils.py # Utility functions
├── data/
├── model/
- Download the raw (unpreprocessed) review dataset from the authors' (github).
- Download the pretrained GloVe word embeddings (GloVe).
mkdir data
# Decompress the raw review dataset into this directory.
# Place the 'glove.6B.200d.txt' file here.
python preprocess.py --dataset=[restaurant, beer]
python main.py --mode=train
python main.py --mode=test
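
The GloVe file used above is a plain-text file with one word per line followed by its vector components, separated by spaces. As a minimal sketch of how such a file can be parsed (the helper name `load_glove` and its `dim` parameter are hypothetical and are not taken from `preprocess.py`):

```python
import io

import numpy as np


def load_glove(lines, dim=200):
    """Parse GloVe-format text lines into a word list and an embedding matrix.

    `lines` is any iterable of strings, e.g. an open file handle.
    Hypothetical helper for illustration; not the repository's own loader.
    """
    words, vectors = [], []
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip lines that do not have exactly `dim` components
        words.append(parts[0])
        vectors.append(np.array([float(x) for x in parts[1:]], dtype=np.float32))
    return words, np.vstack(vectors)


# Tiny in-memory stand-in for the real file (3 dimensions instead of 200).
sample = io.StringIO("good 0.1 0.2 0.3\nfood 0.4 0.5 0.6\n")
words, emb = load_glove(sample, dim=3)
```

For the real file, the same function could be called on `open("data/glove.6B.200d.txt", encoding="utf-8")` with the default `dim=200`, assuming the file was placed in `data/` as described above.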
- Training and evaluation are based on the restaurant review corpus (Citysearch corpus) only.
- Coherence Score (along with K)
K | Coherence Score |
---|---|
5 | -7.3815 |
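
This README does not spell out how the score is computed. The sketch below assumes the document co-occurrence coherence used for evaluation in the paper, which sums log((D2(w_n, w_l) + 1) / D1(w_l)) over all ordered pairs of an aspect's top words, and it assumes K is the number of top words per aspect. The function name `coherence_score` is hypothetical.

```python
import math


def coherence_score(top_words, documents):
    """Co-occurrence coherence of one aspect's top words (hypothetical helper).

    top_words: words sorted from most to least representative.
    documents: list of tokenized documents (lists of words).
    Assumes every top word occurs in at least one document, so D1 > 0.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*ws):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in ws))

    score = 0.0
    for n in range(1, len(top_words)):
        for l in range(n):
            d1 = doc_freq(top_words[l])                # D1(w_l)
            d2 = doc_freq(top_words[n], top_words[l])  # D2(w_n, w_l)
            score += math.log((d2 + 1) / d1)
    return score


# Toy corpus for illustration only.
docs = [["good", "pizza"], ["good", "pasta"], ["good"], ["pizza"]]
score = coherence_score(["good", "pizza", "pasta"], docs)
```

Scores closer to zero indicate that the top words co-occur more often, which is why the values in the table are negative.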
- Representative Words (sorted)
Aspect ID | Words |
---|---|
1 | lombardis dissapointing coffe flautas geido |
2 | recomment bannana loungy arugala bottomless |
3 | cheescake veniero saganaki trully ideya |
4 | wondee disapointment bernaise housemade curtious |
5 | 30pm 00pm deliscious omlettes goal |
6 | pleasent carnivorous brushetta bouterin servce |
7 | 30pm atmostphere shortribs suace cannolis |
8 | margharitas prixe amzing gnudi chikalicious |
9 | imho overated poetry genre |
10 | parmesean fusia accomadating molyvos tabouleh |
11 | kababs octupus shortribs foccacia higly |
12 | kittichai markjoseph aweful oversalted soccer |
13 | barmarche ofcourse sauted waitperson negimaki |
14 | waittress peices phenominal ramblas sandwhich |
15 | moqueca pampano perbacco absolutly dissappointing |
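
A common way to obtain such a list is to rank the vocabulary by cosine similarity between each aspect embedding and the word embeddings. The sketch below illustrates that idea under the assumption that both matrices live in the same embedding space; the function name `representative_words` and its arguments are hypothetical and may differ from what `main.py` does.

```python
import numpy as np


def representative_words(aspect_emb, word_emb, vocab, top_n=5):
    """Return the top_n nearest words (by cosine similarity) for each aspect.

    aspect_emb: (n_aspects, dim) array, word_emb: (vocab_size, dim) array.
    Hypothetical helper; assumes no all-zero rows in either matrix.
    """
    a = aspect_emb / np.linalg.norm(aspect_emb, axis=1, keepdims=True)
    w = word_emb / np.linalg.norm(word_emb, axis=1, keepdims=True)
    sims = a @ w.T                              # (n_aspects, vocab_size)
    top = np.argsort(-sims, axis=1)[:, :top_n]  # most similar first
    return [[vocab[i] for i in row] for row in top]


# Toy 2-dimensional example for illustration only.
vocab = ["pizza", "pasta", "waiter", "staff"]
word_emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
aspect_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
top_words = representative_words(aspect_emb, word_emb, vocab, top_n=2)
```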
- Aspect identification evaluation metric based on the labeled dataset
- Hyperparameter tuning
- Aspect embedding matrix initialization with the k-means algorithm, as in the original paper
- NaN issue at step 35k
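
The k-means initialization item above is not implemented in this repository. As a rough sketch of the idea, assuming the aspect embedding matrix is initialized with the centroids obtained by clustering the word embeddings, a plain NumPy version of Lloyd's algorithm could look like the following. The function name `kmeans_centroids` and all of its parameters are hypothetical.

```python
import numpy as np


def kmeans_centroids(word_emb, n_aspects, n_iters=20, seed=0):
    """Cluster word embeddings with k-means and return the centroids.

    Hypothetical helper: the returned (n_aspects, dim) array could serve as
    the initial value of the aspect embedding matrix.
    """
    word_emb = np.asarray(word_emb, dtype=np.float64)
    rng = np.random.RandomState(seed)
    # Start from randomly chosen, distinct word vectors.
    init_idx = rng.choice(len(word_emb), n_aspects, replace=False)
    centroids = word_emb[init_idx].copy()
    for _ in range(n_iters):
        # Squared Euclidean distances via ||x||^2 - 2 x.c + ||c||^2.
        dists = (
            (word_emb ** 2).sum(axis=1)[:, None]
            - 2.0 * word_emb @ centroids.T
            + (centroids ** 2).sum(axis=1)[None, :]
        )
        labels = dists.argmin(axis=1)
        for k in range(n_aspects):
            members = word_emb[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[k] = members.mean(axis=0)
    return centroids


# Toy example with two well-separated clusters, for illustration only.
toy_emb = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
init = kmeans_centroids(toy_emb, n_aspects=2)
```

A fixed number of iterations keeps the sketch short; a convergence check or a library implementation would be a reasonable substitute.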