Please disable GPU usage in main.py if needed.
- Queryable Variational Autoencoder (IFVAE)
- Variational Autoencoder for Collaborative Filtering (VAE-CF)
- Collaborative Metric Learning (CML)
- Auto-encoder Recommender (AutoRec)
- Collaborative Denoising Auto-Encoders (CDAE)
- Weighted Regularized Matrix Factorization (WRMF)
- Pure SVD Recommender (PureSVD)
- Bayesian Personalized Ranking (BPR)
- Popularity
- Movielens 1M,
- Movielens 20M,
- Yahoo R1,
- Netflix,
- Amazon Prize
The datasets are too large to commit to GitHub, so please prepare them yourself. Each dataset should be a NumPy `.npy` file dumped directly from a SciPy CSR sparse matrix. It should be easy.
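As a reference, here is a minimal sketch of dumping a SciPy CSR matrix straight into an `.npy` file and reading it back (the filename `Rtrain.npy` and the toy matrix are only illustrative; use your own dataset and the filenames the scripts expect):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical toy interaction matrix: rows = users, cols = items.
rows = np.array([0, 0, 1, 2])
cols = np.array([1, 3, 0, 2])
vals = np.ones(4, dtype=np.float32)
R = csr_matrix((vals, (rows, cols)), shape=(3, 4))

# Dump the CSR matrix directly; NumPy stores it as a 0-d object array.
np.save("Rtrain.npy", R)

# Loading it back requires allow_pickle=True and unwrapping with .item().
R_loaded = np.load("Rtrain.npy", allow_pickle=True).item()
print((R - R_loaded).nnz)  # 0 -> round-trip preserved every entry
```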
The above algorithms can be split into two major categories based on the distance
measurement: Euclidean or cosine. CML is a Euclidean-distance recommender, while ALS
(i.e., WRMF) is a typical cosine-distance recommender. When running evaluation, please
select the similarity measurement first, e.g. with --similarity Euclidean
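To see why the choice matters, here is a small sketch (with made-up embeddings) showing that Euclidean scoring and cosine scoring can rank the same items differently:

```python
import numpy as np

# Hypothetical latent vectors: one user and two candidate items.
user = np.array([1.0, 2.0, 0.0])
items = np.array([[2.0, 4.0, 0.0],    # same direction as the user, but far away
                  [1.2, 1.8, 0.0]])   # close to the user, slightly rotated

# Euclidean scoring (CML-style): smaller distance is better, so negate it.
euclidean_scores = -np.linalg.norm(items - user, axis=1)

# Cosine scoring (WRMF/ALS-style): inner product of normalized vectors.
norm_u = user / np.linalg.norm(user)
norm_i = items / np.linalg.norm(items, axis=1, keepdims=True)
cosine_scores = norm_i @ norm_u

# Euclidean prefers the nearby item (index 1); cosine prefers the
# perfectly aligned item (index 0).
print(np.argmax(euclidean_scores), np.argmax(cosine_scores))  # 1 0
```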
python main.py -d datax/ -m VAE-CF -i 200 -l 0.0000001 -r 100
Split the data according to the experiment setting, then tune hyperparameters based on the YAML files in the config
folder
python getmovielens.py --implicit -r 0.5,0.2,0.3 -d datax/ -n ml-1m/ratings.csv
python tune_parameters.py -d datax/ -n movielens1m/autorec.csv -y config/autorec.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/bpr.csv -y config/bpr.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/cdae.csv -y config/cdae.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/cml.csv -y config/cml.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/ifvae.csv -y config/ifvae.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/vae.csv -y config/vae.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/wrmf.csv -y config/wrmf.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/puresvd.csv -y config/puresvd.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/nceplrec.csv -y config/nceplrec.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/plrec.csv -y config/plrec.yml -gpu
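Conceptually, each YAML file declares a hyperparameter grid and the tuning script sweeps it. A minimal sketch of such a sweep, assuming a grid-style config (the actual keys in this repo's `config/*.yml` files may differ):

```python
import itertools

# Hypothetical hyperparameter grid, mimicking what a config/*.yml file
# might declare for one model (illustrative keys only).
grid = {
    "rank": [50, 100, 200],
    "lambda": [1e-4, 1e-2],
}

# Enumerate every combination, as a grid search over the YAML values would,
# before training and scoring each setting.
keys = list(grid)
combos = [dict(zip(keys, vals)) for vals in itertools.product(*grid.values())]
print(len(combos))  # 3 ranks x 2 lambdas = 6 settings
```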
Re-split the data into two sets: one for training and one for testing. Note that this training set includes the validation set from the previous split.
python getmovielens.py --implicit -r 0.7,0.3,0.0 -d datax/ -n ml-1m/ratings.csv
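The `-r 0.7,0.3,0.0` ratios control how each user's interactions are divided. A hedged sketch of what such a ratio-based split looks like for a single user (the real per-user logic lives in getmovielens.py and may differ in detail):

```python
import numpy as np

# Hypothetical item ids one user has interacted with.
rng = np.random.default_rng(0)
interactions = np.arange(10)
rng.shuffle(interactions)

# Train/test = 0.7/0.3, with no separate validation set (the 0.0 ratio).
cut = int(0.7 * len(interactions))
train, test = interactions[:cut], interactions[cut:]
print(len(train), len(test))  # 7 3
```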
python reproduce_paper_results.py -p tables/movielens1m -d datax/ -v Rvalid.npz -n movielens1m_test_result.csv -gpu
python reproduce_paper_results.py