chenyangh / CMPUT-651-UofA

CMPUT-651-UofA

Data preparation

Download the IMDB data from Kaggle and unzip it; it will have the following structure. The aclImdb directory is assumed to be under the root directory of this project.

aclImdb
    |-- train
        |-- pos
        |-- neg
        |-- unsup
    |-- test
        |-- pos
        |-- neg
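The layout above can be read into memory with a few lines of Python. This is only a minimal sketch, not the code in build_imdb_dataset.py: the function names `load_split` and `make_toy_corpus` are hypothetical, and it assumes each review is a UTF-8 `.txt` file inside `pos` or `neg`. The `unsup` folder is ignored because it carries no labels. A toy corpus is created in a temporary directory so the example runs without the real download.

```python
import tempfile
from pathlib import Path


def load_split(root, split):
    """Return (texts, labels) for one split; label 1 = pos, 0 = neg."""
    texts, labels = [], []
    for label, name in ((1, "pos"), (0, "neg")):
        # sorted() keeps the file order deterministic across runs
        for path in sorted((Path(root) / split / name).glob("*.txt")):
            texts.append(path.read_text(encoding="utf-8"))
            labels.append(label)
    return texts, labels


def make_toy_corpus(root):
    """Create a tiny tree with the same layout (demonstration only)."""
    for split in ("train", "test"):
        for name, text in (("pos", "a great film"), ("neg", "a dull film")):
            folder = Path(root) / split / name
            folder.mkdir(parents=True)
            (folder / "0_1.txt").write_text(text, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    make_toy_corpus(tmp)
    toy_texts, toy_labels = load_split(tmp, "train")
```

With the real data in place, the call would presumably be `load_split("aclImdb", "train")` and `load_split("aclImdb", "test")`.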

  1. Build the training dataset: python build_imdb_dataset.py
  2. To train logistic regression, run python trainer.py; the best validation epoch and the test accuracy will be printed.
  3. To train the one-hidden-layer NN, run python trainer_nn.py; the best validation epoch and the test accuracy will be printed.
  4. Draw the learning curves for both models by running python draw.py. It requires the two binary files generated by steps 2 and 3.
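Steps 2 and 3 report the test accuracy at the epoch that did best on the validation set. A minimal sketch of that selection rule follows; it is not taken from trainer.py, the name `select_best_epoch` is hypothetical, the histories are made-up numbers, and counting epochs from 1 is an assumption.

```python
def select_best_epoch(val_acc, test_acc):
    """Pick the epoch with the highest validation accuracy.

    Returns (epoch, test accuracy at that epoch). Epochs are counted
    from 1 (an assumption); ties go to the earliest epoch.
    """
    if not val_acc or len(val_acc) != len(test_acc):
        raise ValueError("need two non-empty histories of equal length")
    best = max(range(len(val_acc)), key=lambda i: val_acc[i])
    return best + 1, test_acc[best]


# Made-up per-epoch accuracies, for illustration only.
val_history = [0.70, 0.82, 0.80]
test_history = [0.69, 0.81, 0.83]

best_epoch, reported_test_acc = select_best_epoch(val_history, test_history)
print("Best model at: %d th epoch, given val set" % best_epoch)
print("Test accuracy is: %f" % reported_test_acc)
```

Note that the test accuracy is read off at the epoch chosen by the validation set, even if a later epoch happens to score higher on the test set; this keeps the test set out of model selection.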

Logistic Regression

Learning curve (loss)

[Figure: loss curve]

Learning curve (accuracy)

[Figure: accuracy curve]

Result

Best model at: 12 th epoch, given val set

Test accuracy is: 0.851000
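For readers unfamiliar with the model family, logistic regression on text can be sketched in plain Python. This is an illustration under stated assumptions, not the implementation in trainer.py: the binary bag-of-words features, the per-example SGD update, the hyperparameters, and all function names here are hypothetical choices, and the four training sentences are toy data.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def featurize(text, vocab):
    """Binary bag-of-words vector: 1.0 if the vocab word occurs in the text."""
    words = set(text.lower().split())
    return [1.0 if w in words else 0.0 for w in vocab]


def train_logreg(xs, ys, epochs=50, lr=0.5, seed=0):
    """Fit weights and bias with per-example SGD on the cross-entropy loss."""
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    b = 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xs[i])) + b)
            g = p - ys[i]  # gradient of the loss with respect to the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xs[i])]
            b -= lr * g
    return w, b


def predict(w, b, x):
    """Return 1 (positive) if the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Toy data, for illustration only.
texts = ["great fun film", "great acting", "dull boring film", "boring plot"]
labels = [1, 1, 0, 0]
vocab = ["great", "fun", "dull", "boring", "film"]

xs = [featurize(t, vocab) for t in texts]
w, b = train_logreg(xs, labels)
train_preds = [predict(w, b, x) for x in xs]
```

On the real dataset the vocabulary would be built from the training reviews, and accuracy would be tracked per epoch on the validation and test sets rather than on the training sentences.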

One Hidden Layer NN

Learning curve (loss)

[Figure: loss curve]

Learning curve (accuracy)

[Figure: accuracy curve]

Result

Best model at: 13 th epoch, given val set

Test accuracy is: 0.843000
