SparkJiao / dpo-trajectory-reasoning

Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".

SparkJiao/dpo-trajectory-reasoning Issues

No issues in this repository yet.