TensorFlow implementation of the ICLR 2018 paper "A New Method of Region Embedding for Text Classification".
Python (verified on 2.7.13)
TensorFlow (verified on 1.0)
We use the publicly available datasets from Zhang et al. (2015) to evaluate our models. The datasets can be obtained from here.
First, download the datasets and place them in the data directory.
Second, pre-process the datasets:
sh run.sh preprocess $data_dir
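The two steps above can be sketched as a small wrapper that checks the dataset directory exists before preprocessing starts. This is only an illustration: the function name `preprocess_dataset` and the `DRY_RUN` variable are hypothetical and not part of this repository, and it assumes `$data_dir` is the path to the directory holding a downloaded dataset. Only the `sh run.sh preprocess $data_dir` command comes from this README.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository.
# Assumes the argument is the directory containing a downloaded dataset.
# Set DRY_RUN=1 to print the command instead of running it.
preprocess_dataset() {
    data_dir="$1"
    # Fail early if the dataset has not been placed where expected.
    if [ ! -d "$data_dir" ]; then
        echo "missing dataset directory: $data_dir" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sh run.sh preprocess $data_dir"
    else
        sh run.sh preprocess "$data_dir"
    fi
}
```

Calling `preprocess_dataset` with a directory that does not exist prints an error and returns a non-zero status, so a missing download is caught before `run.sh` is invoked.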
To make the experiments reproducible, we provide a detailed config for each dataset. Pick the config for the target dataset and run:
Dataset | Command |
---|---|
Yelp Polarity | sh run.sh train conf/yelp.p.model.config |
Yelp Full | sh run.sh train conf/yelp.full.model.config |
Amazon Polarity | sh run.sh train conf/amazon.p.model.config |
Amazon Full | sh run.sh train conf/amazon.full.model.config |
AG News | sh run.sh train conf/ag_news.model.config |
Sogou | sh run.sh train conf/sogou.model.conf |
Yahoo Answer | sh run.sh train conf/yahoo.answer.model.conf |
DBPedia | sh run.sh train conf/dbpedia.model.config |
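To train on several datasets in sequence, the commands in the table can be wrapped in a loop. The sketch below is an assumption about how one might do this, not a script shipped with the repository: `train_all` and `DRY_RUN` are hypothetical names, and the config paths are passed in as arguments rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository.
# Runs "sh run.sh train <config>" for each config path given as an argument.
# Set DRY_RUN=1 to print the commands instead of running them.
train_all() {
    for config in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sh run.sh train $config"
        else
            # Stop at the first failed run so errors are not buried.
            sh run.sh train "$config" || return 1
        fi
    done
}
```

For example, `train_all conf/yelp.full.model.config conf/ag_news.model.config` would train on Yelp Full and then on AG News, stopping if the first run fails.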
We also provide the exploratory methods from the paper for readers interested in reproducing them. Set the mode option in the config to run the different experiments:
Mode | Experiments |
---|---|
WC | Word-Context |
CW | Context-Word |
win_pool | FastText(Win-pool) |
scalar | Scalar version of W.C.region.emb |
multi_region | Multi-region version of W.C.region.emb |
We provide example configs for the exploratory experiments on Yelp Full. Run the following commands to try them:
Experiments | Command |
---|---|
Multi-region version of W.C.region.emb | sh run.sh train conf/yelp.full.multi-region.model.config |
Scalar version of W.C.region.emb | sh run.sh train conf/yelp.full.scalar.model.config |
FastText(Win-pool) | sh run.sh train conf/yelp.full.winpool.model.config |