Abhinav Gupta's repositories
academicpages.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
dm_robotics
Libraries, tools and tasks created and used at DeepMind Robotics.
ede
Code for the paper "Uncertainty-Driven Exploration for Generalization in Reinforcement Learning".
evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
human-eval
Code for the paper "Evaluating Large Language Models Trained on Code"
inference
Reference implementations of MLPerf™ inference benchmarks
LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
llama
Inference code for LLaMA models
llama-recipes
Examples and recipes for Llama models
LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
math
The MATH Dataset (NeurIPS 2021)
mlx
MLX: An array framework for Apple silicon
mlx-examples
Examples in the MLX framework
openai-cookbook
Examples and guides for using the OpenAI API
openai-quickstart-python
Python example app from the OpenAI API quickstart tutorial
OpenDevin
🐚 OpenDevin: Code Less, Make More
optax
Optax is a gradient processing and optimization library for JAX.
PromptPG
Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning".
reward-bench
RewardBench: the first evaluation tool for reward models.
Stable-Alignment
Multi-agent social simulation + an efficient, effective, and stable alternative to RLHF. Code for the paper "Training Socially Aligned Language Models in Simulated Human Society".
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
summarize-from-feedback
Code for "Learning to summarize from human feedback"
SWE-agent
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It solves 12.29% of bugs in the SWE-bench evaluation set and takes just 1.5 minutes to run.
training
Reference implementations of MLPerf™ training benchmarks
trl
Train transformer language models with reinforcement learning.