Sen ZHANG's starred repositories

LLaVA-UHD

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

Language: Python | Stargazers: 218 | Issues: 0

xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Language: Python | License: Apache-2.0 | Stargazers: 2937 | Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 17170 | Issues: 0

X2-VLM

All-In-One VLM: Image + Video + Transfer to Other Languages / Domains (TPAMI 2023)

Language: Python | License: BSD-3-Clause | Stargazers: 114 | Issues: 0

Bunny

A family of lightweight multimodal models.

Language: Python | License: Apache-2.0 | Stargazers: 711 | Issues: 0

self-correction-llm-papers

This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.

License: Apache-2.0 | Stargazers: 330 | Issues: 0

SurgicalPart-SAM

Official implementation of SurgicalPart-SAM (SP-SAM)

Stargazers: 11 | Issues: 0

stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Language: Python | License: MIT | Stargazers: 8146 | Issues: 0

cleanrl

High-quality single-file implementations of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Language: Python | License: NOASSERTION | Stargazers: 4640 | Issues: 0

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 643 | Issues: 0

LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios

Language: Python | License: Apache-2.0 | Stargazers: 903 | Issues: 0

SurgicalGym

High-performance GPU-based simulation platform for reinforcement learning with surgical robot learning

Language: Python | License: MIT | Stargazers: 39 | Issues: 0

DEX

[ICRA'23] Demonstration-Guided Reinforcement Learning with Efficient Exploration for Task Automation of Surgical Robot

Language: Python | License: MIT | Stargazers: 29 | Issues: 0

alpha-zero-general

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Language: Jupyter Notebook | License: MIT | Stargazers: 3699 | Issues: 0

controlgym

Controlgym: Large-Scale Control Environments for Benchmarking Reinforcement Learning Algorithms

Language: Python | License: MIT | Stargazers: 15 | Issues: 0

IROS2023PaperList

IROS2023 Paper List

Stargazers: 101 | Issues: 0

Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions

Solutions to Reinforcement Learning: An Introduction

Language: Jupyter Notebook | License: MIT | Stargazers: 1906 | Issues: 0

HyQ

Official code repo for paper: Hybrid RL: Using both offline and online data can make RL efficient.

Language: Python | Stargazers: 21 | Issues: 0

SurgicalSAM

Official implementation of SurgicalSAM

Language: Python | License: MIT | Stargazers: 49 | Issues: 0

mup

maximal update parametrization (µP)

Language: Jupyter Notebook | License: MIT | Stargazers: 1206 | Issues: 0

prm800k

800,000 step-level correctness labels on LLM solutions to MATH problems

Language: Python | License: MIT | Stargazers: 1307 | Issues: 0

Reinforcement-Learning-Papers

Related papers for reinforcement learning, including classic papers and latest papers in top conferences

License: MIT | Stargazers: 240 | Issues: 0

safe-rlhf

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Language: Python | License: Apache-2.0 | Stargazers: 1190 | Issues: 0

test

Measuring Massive Multitask Language Understanding | ICLR 2021

Language: Python | License: MIT | Stargazers: 1000 | Issues: 0

Prompt4ReasoningPapers

[ACL 2023] Reasoning with Language Model Prompting: A Survey

License: MIT | Stargazers: 815 | Issues: 0

stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

Language: Python | License: MIT | Stargazers: 4068 | Issues: 0

PPO

PPO implementation for OpenAI gym environment based on Unity ML Agents

Language: Python | Stargazers: 143 | Issues: 0

LLMsPracticalGuide

A curated list of practical guide resources for LLMs (LLMs Tree, Examples, Papers)

Stargazers: 8820 | Issues: 0