Abhinav Gupta's repositories
academicpages.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
dm_robotics
Libraries, tools and tasks created and used at DeepMind Robotics.
ede
Code for the paper "Uncertainty-Driven Exploration for Generalization in Reinforcement Learning".
evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
human-eval
Code for the paper "Evaluating Large Language Models Trained on Code"
LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
llama
Inference code for LLaMA models
llama-recipes
Examples and recipes for Llama models
LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
math
The MATH Dataset (NeurIPS 2021)
METER
METER: A Multimodal End-to-end TransformER Framework
mlx
MLX: An array framework for Apple silicon
mlx-examples
Examples in the MLX framework
mujoco_menagerie
A collection of high-quality models for the MuJoCo physics engine, curated by DeepMind.
openai-cookbook
Examples and guides for using the OpenAI API
openai-quickstart-python
Python example app from the OpenAI API quickstart tutorial
optax
Optax is a gradient processing and optimization library for JAX.
PromptPG
Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning".
Stable-Alignment
Multi-agent Social Simulation + Efficient, Effective, and Stable alternative to RLHF. Code for the paper "Training Socially Aligned Language Models in Simulated Human Society".
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
summarize-from-feedback
Code for "Learning to summarize from human feedback"
trl
Train transformer language models with reinforcement learning.