ZHZisZZ / modpo

[Findings of ACL'2024] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization.

Home Page: https://arxiv.org/abs/2310.03708

MODPO: Multi-Objective Direct Preference Optimization

Code release for Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization.

TL;DR: Compared to the DPO loss, the MODPO loss adds a margin term that steers the language model toward multiple objectives.
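To make the role of the margin concrete, here is a minimal per-example sketch in plain Python. It assumes the loss form described in the paper for one preference objective and one additional margin objective; the function and parameter names (`dpo_loss`, `modpo_loss`, `w_pref`, `w_margin`) are hypothetical and are not taken from the trainers in this repository.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(logratio_chosen: float, logratio_rejected: float, beta: float = 0.1) -> float:
    """DPO loss for one preference pair.

    logratio_* is log pi(y|x) - log pi_ref(y|x) for the chosen / rejected response.
    """
    return -log_sigmoid(beta * (logratio_chosen - logratio_rejected))


def modpo_loss(
    logratio_chosen: float,
    logratio_rejected: float,
    margin_chosen: float,
    margin_rejected: float,
    w_pref: float,
    w_margin: float,
    beta: float = 0.1,
) -> float:
    """Sketch of a MODPO-style loss (assumed form, two objectives).

    margin_* are scores of the chosen / rejected response under the other
    objective's (frozen) reward model; w_pref and w_margin are the objective
    weights, assumed to be non-negative with w_pref > 0.
    """
    logits = beta * (logratio_chosen - logratio_rejected)
    margin = w_margin * (margin_chosen - margin_rejected)
    return -log_sigmoid((logits - margin) / w_pref)


if __name__ == "__main__":
    # With all weight on the preference objective, the sketch reduces to DPO.
    print(dpo_loss(1.0, -1.0))
    print(modpo_loss(1.0, -1.0, 0.3, 0.9, w_pref=1.0, w_margin=0.0))
```

In this sketch, setting `w_margin` to 0 and `w_pref` to 1 recovers the DPO loss, while other weightings shift the decision boundary by the weighted reward difference under the second objective.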

Installation

conda create -n modpo python=3.10
conda activate modpo
pip install torch==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
# (optional) pip install flash-attn==2.3.2 --no-build-isolation
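After installing, a quick check that the pinned versions actually landed in the active environment can save debugging time later. The following is a small standard-library sketch; the helper names (`installed_version`, `check_environment`) are hypothetical and not part of this repository, and the default expected version simply mirrors the torch pin in the commands above.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=None):
    """Compare installed versions against expected pins.

    Returns {package: (installed_version_or_None, matches_expected)}.
    """
    expected = expected or {"torch": "2.1.0"}
    report = {}
    for name, want in expected.items():
        have = installed_version(name)
        # Wheels from the cu118 index carry a local suffix such as "+cu118";
        # strip it before comparing against the pinned version.
        ok = have is not None and have.split("+")[0] == want
        report[name] = (have, ok)
    return report


if __name__ == "__main__":
    for name, (have, ok) in check_environment().items():
        print(f"{name}: installed={have} ok={ok}")
```

Passing a different dictionary, for example `{"torch": "2.1.0", "flash-attn": "2.3.2"}`, extends the check to the optional dependency.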

Running MODPO

This repository includes two MODPO examples:

Other examples

This repository also contains other off-the-shelf tuning recipes:

To implement a new alignment algorithm, add a new trainer under src/trainer.

Customized datasets

For supported datasets, refer to REAL_DATASET_CONFIGS in src/data/configs.py. To train on your own datasets, add them under src/data/raw_data and update REAL_DATASET_CONFIGS accordingly. See src/data/raw_data/shp for an example.

Reference

@misc{zhou2023onepreferencefitsall,
      title={Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization}, 
      author={Zhanhui Zhou and Jie Liu and Chao Yang and Jing Shao and Yu Liu and Xiangyu Yue and Wanli Ouyang and Yu Qiao},
      year={2023},
      eprint={2310.03708},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}


Languages

Python 93.4%, Shell 6.6%