Yuhua Jiang's repositories
CUMCM2020B
2020 China Undergraduate Mathematical Contest in Modeling (CUMCM), Problem B: Crossing the Desert
Statistics-Project
Final project for Applied Statistics and the R Language
NJU_Course_Project
A record of course projects completed at NJU
baby-llama2-chinese
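A repository for pretraining from scratch and then SFT of a small-parameter Chinese LLaMa2; a single 24 GB GPU is enough to run it and obtain a chat-llama2 capable of simple Chinese question answering.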
ChatPaper
Use ChatGPT to summarize the arXiv papers.
ChatReviewer
ChatReviewer: use ChatGPT to review papers; ChatResponse: use ChatGPT to respond to reviewers.
DayDayCode
Online Judge practice problems
deeprl_network
Multi-agent deep reinforcement learning for networked system control.
gpt_academic
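Provides a graphical interactive interface for GPT/GLM, specially optimized for reading and polishing papers. The modular design supports custom shortcut buttons and function plugins, code block and table display, and dual display of TeX formulas. Adds Python and C++ project analysis and self-explanation features, plus PDF/LaTeX paper translation and summarization. Supports querying multiple LLMs in parallel and local models such as Tsinghua's chatglm.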
gym-jsbsim
A reinforcement learning environment for aircraft control using the JSBSim flight dynamics model
ilkit
A clean code base for imitation learning and reinforcement learning, written in PyTorch
Jackory.github.io
A beautiful, simple, clean, and responsive Jekyll theme for academics
LightZero
LightZero: A lightweight and efficient MCTS/AlphaZero/MuZero algorithm toolkit.
omnisafe
OmniSafe is an infrastructural framework for accelerating SafeRL research.
on-policy
This is the official implementation of Multi-Agent PPO (MAPPO).
Plants.VSZombies
A CUI (character user interface) version of Plants vs. Zombies
ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for accelerating ML workloads.
spinningup
An educational resource to help anyone learn deep reinforcement learning.
tdmpc2
Code for "TD-MPC2: Scalable, Robust World Models for Continuous Control"
trl
Train transformer language models with reinforcement learning.
VEM
Code accompanying the paper "Offline Reinforcement Learning with Value-Based Episodic Memory" (ICLR 2022, https://arxiv.org/abs/2110.09796)