Ziniu Li (liziniu)

Company: The Chinese University of Hong Kong, Shenzhen

Location: Shenzhen

Home Page: www.liziniu.org

Twitter: @ziniuli

Ziniu Li's repositories

ReMax

Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)

policy_optimization

Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)

Language: Python, Stargazers: 23, Issues: 1, Issues: 0

RL-PPO-Keras

Proximal Policy Optimization (PPO) implementation in Keras

HyperDQN

Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning)

Language: Python, Stargazers: 9, Issues: 1, Issues: 0

ISWBC

Code for NeurIPS 2023 Paper (Imitation Learning from Imperfection: Theoretical Justifications and Algorithms)

Language: Python, Stargazers: 4, Issues: 1, Issues: 0

RLX

RLX is an RL codebase based on TensorFlow. It implements algorithms like SAC, ACER, GAIL and TRPO. It is easy to use.

Language: Python, Stargazers: 3, Issues: 2, Issues: 0

alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

baby-llama2-chinese

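A repository for pretraining from scratch and then SFT of a small-parameter Chinese LLaMa2; a single 24 GB GPU is enough to run it and obtain a chat-llama2 with basic Chinese question-answering ability.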

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

Chinese-LLaMA-Alpaca-2

Chinese LLaMA-2 & Alpaca-2 LLMs, phase two of the project, with local CPU/GPU training and deployment

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

clash-for-linux

Use Clash as a proxy tool on Linux

Language: Shell, Stargazers: 0, Issues: 0, Issues: 0

deep-learning-notes

Experiments with Deep Learning

Language: Jupyter Notebook, Stargazers: 0, Issues: 1, Issues: 0

go-explore

Code for Go-Explore: a New Approach for Hard-Exploration Problems

Language: Python, License: NOASSERTION, Stargazers: 0, Issues: 2, Issues: 0

gym-minigrid

Minimalistic gridworld package for OpenAI Gym

Language: Python, License: BSD-3-Clause, Stargazers: 0, Issues: 2, Issues: 0

Model-Uncertainty-in-Neural-Networks

TensorFlow implementation of Model-Uncertainty-in-Neural-Networks

Language: Jupyter Notebook, Stargazers: 0, Issues: 2, Issues: 0

random-network-distillation

Code for the paper "Exploration by Random Network Distillation"

Language: Python, Stargazers: 0, Issues: 3, Issues: 0

sample-efficient-bayesian-rl

Source for the sample efficient tabular RL submission to the 2019 NIPS workshop on Biological and Artificial RL

Language: Jupyter Notebook, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

webpage-template

Adapted from the widely used project webpage template made by the colorful folks.

Language: HTML, Stargazers: 0, Issues: 0, Issues: 0