yihedeng9

Yihe Deng's repositories

A brief and partial summary of RLHF algorithms.

OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement

Language: Python | 10800

Enhancing Large Vision Language Models with Self-Training on Image Comprehension.

Language: Python | Apache-2.0 | 70 2 3

DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails

Language: Python | Apache-2.0 | 18 1 2

Official repo of Rephrase-and-Respond: data, code, and evaluation

Language: Python | MIT | 100

This repository contains hand-curated resources for prompt engineering, with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.

Language: Python | Apache-2.0 | 000

A curated list of trustworthy deep learning papers, updated daily.

MIT | 000

010

The official implementation of Self-Play Fine-Tuning (SPIN)

Language: Python | Apache-2.0 | 000