Please disable GPU usage in main.py if needed.
- Queryable Variational Autoencoder (IFVAE)
- Variational Autoencoder for Collaborative Filtering (VAE-CF)
- Collaborative Metric Learning (CML)
- Auto-encoder Recommender (AutoRec)
- Collaborative Denoising Auto-Encoders (CDAE)
- Weighted Regularized Matrix Factorization (WRMF)
- Pure SVD Recommender (PureSVD)
- Bayesian Personalized Ranking (BPR)
- Popularity
- Movielens 1M,
- Movielens 20M,
- Yahoo R1,
- Netflix,
- Amazon Prize
The datasets are too large to commit to GitHub, so please prepare them yourself. Each dataset should be a NumPy `.npy` file dumped directly from a SciPy CSR sparse matrix. It should be easy.
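As a reference, here is a minimal sketch of dumping a SciPy CSR matrix straight into an `.npy` file and reading it back (the filename `Rtrain.npy` and the toy matrix are only illustrative; use your own dataset and the filenames the scripts expect):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical toy interaction matrix: rows = users, cols = items.
rows = np.array([0, 0, 1, 2])
cols = np.array([1, 3, 0, 2])
vals = np.ones(4, dtype=np.float32)
R = csr_matrix((vals, (rows, cols)), shape=(3, 4))

# Dump the CSR matrix directly; NumPy stores it as a 0-d object array.
np.save("Rtrain.npy", R)

# Loading it back requires allow_pickle=True and unwrapping with .item().
R_loaded = np.load("Rtrain.npy", allow_pickle=True).item()
print((R - R_loaded).nnz)  # 0 -> round-trip preserved every entry
```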
The above algorithms can be split into two major categories based on the distance
measurement: Euclidean or cosine. CML is a Euclidean-distance recommender, while ALS
(i.e., WRMF) is a typical cosine-distance recommender. When running evaluation, please
select the similarity measurement first, e.g. with --similarity Euclidean
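To see why the choice matters, here is a small sketch (with made-up embeddings) showing that Euclidean scoring and cosine scoring can rank the same items differently:

```python
import numpy as np

# Hypothetical latent vectors: one user and two candidate items.
user = np.array([1.0, 2.0, 0.0])
items = np.array([[2.0, 4.0, 0.0],    # same direction as the user, but far away
                  [1.2, 1.8, 0.0]])   # close to the user, slightly rotated

# Euclidean scoring (CML-style): smaller distance is better, so negate it.
euclidean_scores = -np.linalg.norm(items - user, axis=1)

# Cosine scoring (WRMF/ALS-style): inner product of normalized vectors.
norm_u = user / np.linalg.norm(user)
norm_i = items / np.linalg.norm(items, axis=1, keepdims=True)
cosine_scores = norm_i @ norm_u

# Euclidean prefers the nearby item (index 1); cosine prefers the
# perfectly aligned item (index 0).
print(np.argmax(euclidean_scores), np.argmax(cosine_scores))  # 1 0
```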
python main.py -d datax/ -m VAE-CF -i 200 -l 0.0000001 -r 100
Split the data according to the experiment setting, then tune hyperparameters based on the YAML files in the config
folder
python getmovielens.py --implicit -r 0.5,0.2,0.3 -d datax/ -n ml-1m/ratings.csv
python tune_parameters.py -d datax/ -n movielens1m/autorec.csv -y config/autorec.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/bpr.csv -y config/bpr.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/cdae.csv -y config/cdae.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/cml.csv -y config/cml.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/ifvae.csv -y config/ifvae.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/vae.csv -y config/vae.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/wrmf.csv -y config/wrmf.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/puresvd.csv -y config/puresvd.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/nceplrec.csv -y config/nceplrec.yml -gpu
python tune_parameters.py -d datax/ -n movielens1m/plrec.csv -y config/plrec.yml -gpu
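Conceptually, each YAML file declares a hyperparameter grid and the tuning script sweeps it. A minimal sketch of such a sweep, assuming a grid-style config (the actual keys in this repo's `config/*.yml` files may differ):

```python
import itertools

# Hypothetical hyperparameter grid, mimicking what a config/*.yml file
# might declare for one model (illustrative keys only).
grid = {
    "rank": [50, 100, 200],
    "lambda": [1e-4, 1e-2],
}

# Enumerate every combination, as a grid search over the YAML values would,
# before training and scoring each setting.
keys = list(grid)
combos = [dict(zip(keys, vals)) for vals in itertools.product(*grid.values())]
print(len(combos))  # 3 ranks x 2 lambdas = 6 settings
```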
Re-split the data into two sets: one for training and one for testing. Note that this training set includes the validation set from the previous split.
python getmovielens.py --implicit -r 0.7,0.3,0.0 -d datax/ -n ml-1m/ratings.csv
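The `-r 0.7,0.3,0.0` ratios control how each user's interactions are divided. A hedged sketch of what such a ratio-based split looks like for a single user (the real per-user logic lives in getmovielens.py and may differ in detail):

```python
import numpy as np

# Hypothetical item ids one user has interacted with.
rng = np.random.default_rng(0)
interactions = np.arange(10)
rng.shuffle(interactions)

# Train/test = 0.7/0.3, with no separate validation set (the 0.0 ratio).
cut = int(0.7 * len(interactions))
train, test = interactions[:cut], interactions[cut:]
print(len(train), len(test))  # 7 3
```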
python reproduce_paper_results.py -p tables/movielens1m -d datax/ -v Rvalid.npz -n movielens1m_test_result.csv -gpu
python reproduce_paper_results.py