sheng-han-zhang

sheng-han-zhang's repositories

AIWolfPy

Language: Python

alpaca-lora

Instruct-tune LLaMA on consumer hardware

Language: Jupyter Notebook, License: Apache-2.0

awesome-game-ai

Awesome Game AI materials of Multi-Agent Reinforcement Learning

License: MIT

Deep-Reinforcement-Learning-Algorithms-with-PyTorch

PyTorch implementations of deep reinforcement learning algorithms and environments

Language: Python

Deep-Reinforcement-Learning-Hands-On

Hands-on Deep Reinforcement Learning, published by Packt

Language: Python, License: MIT

DeepRole

The code used to power DeepRole

Language: C++

DI-star

An artificial intelligence platform for StarCraft II with large-scale distributed training and grand-master agents.

Language: Python, License: Apache-2.0

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.

Language: Python, License: Apache-2.0

jynew

3D remake of Heroes of Jin Yong (金庸群侠传)

Language: C#, License: NOASSERTION

lykos

Werewolf, the popular detective/social party game (a theme of Mafia)

Language: Python, License: NOASSERTION

MARL-Papers

Paper list of multi-agent reinforcement learning (MARL)


mcts

An implementation of Monte Carlo Tree Search in Python

Language: Python, License: BSD-2-Clause

mathematics_dataset

This dataset code generates mathematical question and answer pairs, from a range of question types at roughly school-level difficulty.

Language: Python, License: Apache-2.0

Megatron-LM

Ongoing research training transformer models at scale

Language: Python, License: NOASSERTION

melee-ai

Super Smash Bros. Melee (SSBM) AI

Language: Python, License: GPL-3.0

minerl_imitation_learning

License: MIT

overcooked_ai

A benchmark environment for fully cooperative human-AI performance.

License: MIT

ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

License: MIT

PyIMDB

An in-memory database for Python, similar to Redis(?). It's my learning sandbox for gRPC.

License: MIT

rl-baselines3-zoo

A collection of pre-trained RL agents using Stable Baselines3, training and hyperparameter optimization included.

Language: Python, License: MIT

sac-discrete-pytorch

License: MIT

sac-discrete.pytorch

A PyTorch implementation of SAC-Discrete.

License: MIT

shakespeare

The Complete Works of William Shakespeare hosted at http://shakespeare.mit.edu/


stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

License: MIT

tianshou

An elegant PyTorch deep reinforcement learning platform.

License: MIT

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

License: MIT