ChangyuChen347's repositories
Language: Python
MaskedThought
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models
semi-offline-RL
Semi-Offline Reinforcement Learning for Optimized Text Generation
RL4LM
A modular RL library to fine-tune language models to human preferences
Language: Python, License: Apache-2.0