menglin0320 / P5-UIE


P5-UIE

This project builds mainly on jeykigung's work; their original GitHub repo is linked through the P5 repo.

Paper link: https://arxiv.org/pdf/2203.13366.pdf

Introduction

This work leverages P5, a multi-task, large-language-model-based recommender system, as its foundation. In this study, user and item embeddings were integrated into the framework to enhance the representation of users and items. Additionally, a set of "attribute learning" tasks was introduced, enabling the training of attribute-aware embeddings. Incorporating attributes in this way improved the overall performance of the recommender system.
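To make the embedding idea concrete, here is a minimal sketch of one common way such user/item "soft" embeddings are fed to a language model: learned vectors are prepended to the token embeddings of the prompt. All names, shapes, and the prepend strategy are illustrative assumptions (using NumPy in place of the project's PyTorch code), not the actual P5-UIE implementation.

```python
import numpy as np

# Hypothetical sketch: prepend learned user/item embeddings ("soft prompts")
# to the token embeddings that the language model consumes.
d_model = 8                 # embedding width of the language model (illustrative)
n_users, n_items = 100, 50

rng = np.random.default_rng(0)
user_table = rng.normal(size=(n_users, d_model))   # learned user embeddings
item_table = rng.normal(size=(n_items, d_model))   # learned item embeddings

def build_inputs(token_embs, user_id, item_id):
    """Concatenate the user/item soft embeddings in front of the prompt tokens."""
    soft = np.stack([user_table[user_id], item_table[item_id]])
    return np.concatenate([soft, token_embs], axis=0)

tokens = rng.normal(size=(5, d_model))             # 5 prompt-token embeddings
inputs = build_inputs(tokens, user_id=3, item_id=7)
print(inputs.shape)   # (7, 8): 2 soft embeddings + 5 token embeddings
```

During training, the embedding tables would be updated by the same gradient descent that fine-tunes the model, which is what allows the "attribute learning" tasks to shape them.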

Requirements:

  • Python 3.9.7
  • PyTorch 1.10.1
  • transformers 4.2.1
  • tqdm
  • numpy
  • sentencepiece
  • pyyaml

Usage

Run the experiment

  1. Clone this repo

    git clone https://github.com/menglin0320/P5-UIE.git
    
  2. Download the preprocessed data from this Google Drive link, then put it into the data folder. If you would like to preprocess your own data, please follow the Jupyter notebooks in the preprocess folder. Raw data can be downloaded from this Google Drive link; put it into the raw_data folder.

  3. Download the pretrained checkpoints into the snap folder. If you would like to train your own P5 models, the snap folder will also be used to store P5 checkpoints.

  4. Pretrain with the scripts in the scripts folder, for example

    bash scripts/pretrain_P5_small_beauty.sh 4
    

    or run train.sh to do end-to-end training

    bash scripts/train.sh 4
    

    Here, 4 means using 4 GPUs to conduct parallel pretraining.

  5. Evaluate with the example Jupyter notebooks in the notebooks folder. Before testing, create a soft link to the data folder inside the notebooks folder:

    cd notebooks
    ln -s ../data .
    

Documentation

See the 'report' document for a detailed write-up of this project.

Pretrained Checkpoints

See CHECKPOINTS.md. A Google Drive link to a checkpoint for P5 with soft embeddings has also been added to that Markdown file.

Citation

Please cite the following paper for the original work:

@inproceedings{geng2022recommendation,
  title={Recommendation as Language Processing (RLP): A Unified Pretrain, Personalized Prompt \& Predict Paradigm (P5)},
  author={Geng, Shijie and Liu, Shuchang and Fu, Zuohui and Ge, Yingqiang and Zhang, Yongfeng},
  booktitle={Proceedings of the Sixteenth ACM Conference on Recommender Systems},
  year={2022}
}

Acknowledgements

This project builds on the VL-T5, PETER, S3-Rec, and P5 repositories.

About

License:MIT License
