Jackory

Yuhua Jiang's repositories

RPBT

Implementation of RPPO (Risk-sensitive PPO) and RPBT (Population-based self-play with RPPO)

Language: Python | License: MIT | 10 10

CUMCM2020B

Problem B, "Crossing the Desert", from the 2020 China Undergraduate Mathematical Contest in Modeling

Language: Python | 4 2 1

Statistics-Project

Course project for Applied Statistics and the R Language

Language: R | 4 20

TCGAIIC

Track 3 of the Tianchi AI Technology Innovation Competition

Language: Python | 3 20

NJU_Course_Project

A record of projects completed at NJU

Language: Jupyter Notebook | License: GPL-3.0 | 2 20

cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Language: Python | License: NOASSERTION | 100

Image_Classification

Image classification

Language: Python | 1 20

baby-llama2-chinese

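A repository for pretraining from scratch and then applying SFT to a small-parameter Chinese LLaMa2; it runs on a single 24 GB GPU and yields a chat-llama2 with basic Chinese question-answering ability.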

Language: Python | License: MIT | 000

CDS

Language: Python | License: Apache-2.0 | 010

ChatPaper

Use ChatGPT to summarize the arXiv papers.

Language: Python | License: NOASSERTION | 000

ChatReviewer

ChatReviewer: use ChatGPT to review papers; ChatResponse: use ChatGPT to respond to reviewers.

Language: Python | License: NOASSERTION | 000

Competition_Olympics-Integrated

Language: Python | License: MIT | 000

DayDayCode

Online Judge practice problems

Language: C++ | 020

deeprl_network

Multi-agent deep reinforcement learning for networked system control.

Language: Python | 010

ElegantRL

Scalable and Elastic Deep Reinforcement Learning Using PyTorch. Please star. 🔥

Language: Python | License: NOASSERTION | 010

gpt_academic

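Provides a graphical interactive interface for GPT/GLM, specially optimized for reading and polishing papers. Its modular design supports custom shortcut buttons and function plugins, display of code blocks and tables, and dual display of TeX formulas. New features include analysis and self-translation of Python and C++ projects and translation and summarization of PDF/LaTeX papers. It supports querying multiple LLM models in parallel and local models such as Tsinghua's chatglm.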

Language: Python | License: GPL-3.0 | 000

gym-jsbsim

A reinforcement learning environment for aircraft control using the JSBSim flight dynamics model

Language: Python | License: MIT | 010

ilkit

A clean code base for imitation learning and reinforcement learning, written in PyTorch

Language: Python | License: MIT | 000

images

010

Jackory.github.io

A beautiful, simple, clean, and responsive Jekyll theme for academics

Language: JavaScript | License: MIT | 000

LightZero

LightZero: A lightweight and efficient MCTS/AlphaZero/MuZero algorithm toolkit.

Language: Python | License: Apache-2.0 | 000

omnisafe

OmniSafe is an infrastructural framework for accelerating SafeRL research.

Language: Python | License: Apache-2.0 | 000

on-policy

This is the official implementation of Multi-Agent PPO (MAPPO).

Language: Python | License: MIT | 000

Plants.VSZombies

A CUI (character user interface) version of Plants vs. Zombies

Language: C++ | 020

ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for accelerating ML workloads.

Language: Python | License: Apache-2.0 | 000

spinningup

An educational resource to help anyone learn deep reinforcement learning.

Language: Python | License: MIT | 010

tdmpc2

Code for "TD-MPC2: Scalable, Robust World Models for Continuous Control"

Language: Python | License: MIT | 000

TimeChamber

Language: Python | License: MIT | 000

trl

Train transformer language models with reinforcement learning.

License: Apache-2.0 | 000

VEM

Code accompanying the paper "Offline Reinforcement Learning with Value-Based Episodic Memory" (ICLR 2022, https://arxiv.org/abs/2110.09796)

Language: Python | 000