schinger

wangxuguang's repositories

Trains an agent with (stochastic) Policy Gradients (actor-critic) on Pong. Uses OpenAI Gym.

Language: Python | 12 4 2

Full-stack LLM (pre-training/fine-tuning, PPO (RLHF), inference, quantization, etc.)

Language: Python | License: MIT | 1 0 0

Simplest AlphaZero Implementation

Language: Python | License: MIT | 0 0 0

Implements a diffusion model using only PyTorch and an MLP

Language: Python | License: MIT | 0 0 0

0 0 0

Inference Llama 2 in one file of pure C

Language: C | License: MIT | 0 0 0

Models and examples built with TensorFlow

Language: Python | License: Apache-2.0 | 0 0 0

Moses, the machine translation system

Language: Groff | License: LGPL-2.1 | 0 0 0

PArallel Distributed Deep LEarning

Language: C++ | License: Apache-2.0 | 0 0 0

PPO in one file

Language: Python | 0 0 0

The Big-&-Extending-Repository-of-Transformers: PyTorch pretrained models for Google's BERT, OpenAI GPT & GPT-2, Google/CMU Transformer-XL.

Language: Jupyter Notebook | License: Apache-2.0 | 0 0 0