wangxuguang's repositories

pong_actor-critic

Trains an agent with (stochastic) policy gradients (actor-critic) on Pong. Uses OpenAI Gym.

FullLLM

Full-stack LLM (pre-training/fine-tuning, PPO (RLHF), inference, quantization, etc.)

Language: Python | License: MIT | Stargazers: 1 | Issues: 0

AlphaZero

Simplest AlphaZero Implementation

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

DiffusionModel

Implements a diffusion model using only PyTorch and an MLP

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

llama2.c

Inference Llama 2 in one file of pure C

Language: C | License: MIT | Stargazers: 0 | Issues: 0

models

Models and examples built with TensorFlow

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

mosesdecoder

Moses, the machine translation system

Language: Groff | License: LGPL-2.1 | Stargazers: 0 | Issues: 0

Paddle

PArallel Distributed Deep LEarning

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0

PPO-simplest

PPO in one file

Language: Python | Stargazers: 0 | Issues: 0

pytorch-pretrained-BERT

The Big-&-Extending-Repository-of-Transformers: PyTorch pretrained models for Google's BERT, OpenAI GPT & GPT-2, Google/CMU Transformer-XL.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0