Bailin's starred repositories
mistral-src
Reference implementation of the Mistral AI 7B v0.1 model.
llama-recipes
Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default & custom datasets for applications such as summarization & question answering, along with a number of inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Llama2 for WhatsApp & Messenger.
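For context, a minimal sketch of the kind of PEFT setup such recipes wrap, using the Hugging Face peft library. The model id and LoRA hyperparameters below are illustrative assumptions, not values taken from the repo:

```python
# Minimal LoRA fine-tuning setup sketch (illustrative; not llama-recipes' actual config).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-2-7b-hf"  # assumed model id; gated, requires access approval
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# LoRA trains small low-rank adapter matrices instead of all model weights.
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update (assumed)
    lora_alpha=16,                        # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```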
open_llama
OpenLLaMA, a permissively licensed, open-source reproduction of Meta AI's LLaMA 7B trained on the RedPajama dataset.
GPU-Puzzles
Solve puzzles. Learn CUDA.
attention_with_linear_biases
Code for the ALiBi method for transformer language models (ICLR 2022)
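In brief, ALiBi drops positional embeddings and instead adds a distance-proportional penalty to attention scores. A minimal PyTorch sketch of the causal bias, with head slopes simplified to the 8-head geometric sequence from the paper:

```python
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """Causal ALiBi bias: score[i, j] is penalized by slope_h * (i - j) for j <= i."""
    # Per-head slopes form a geometric sequence: 1/2, 1/4, ..., 1/256 for 8 heads.
    slopes = torch.tensor([2.0 ** -(h + 1) for h in range(n_heads)])
    pos = torch.arange(seq_len)
    # (i - j) for past positions; the upper triangle is zeroed here because a
    # separate causal mask still sets future positions to -inf before softmax.
    distance = (pos[:, None] - pos[None, :]).clamp(min=0)
    return -slopes[:, None, None] * distance  # shape: (heads, seq, seq)

# Added to q @ k^T / sqrt(d) before softmax; nearer tokens are penalized less.
bias = alibi_bias(n_heads=8, seq_len=4)
```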
SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
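The core idea behind such kernels is tiling: loading blocks of A and B into fast memory and reusing them across many output elements. A minimal NumPy sketch of the blocked multiplication (the repo implements this in CUDA with shared-memory tiles; the tile size here is an arbitrary assumption):

```python
import numpy as np

def blocked_matmul(A: np.ndarray, B: np.ndarray, tile: int = 32) -> np.ndarray:
    """Tiled SGEMM: C = A @ B, computed one (tile x tile) block at a time.
    On a GPU, each tile of A and B would be staged in shared memory."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(0, M, tile):
        for j in range(0, N, tile):
            for k in range(0, K, tile):
                # Accumulate one K-slab's contribution into the C tile;
                # slicing handles ragged edges when shapes aren't multiples of tile.
                C[i:i+tile, j:j+tile] += A[i:i+tile, k:k+tile] @ B[k:k+tile, j:j+tile]
    return C

A = np.random.rand(128, 64).astype(np.float32)
B = np.random.rand(64, 96).astype(np.float32)
assert np.allclose(blocked_matmul(A, B), A @ B, atol=1e-4)
```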
ModuleFormer
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.
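As a generic illustration of the MoE pattern (not ModuleFormer's actual stick-breaking routing), a minimal top-2 feedforward-expert layer in PyTorch; expert count and dimensions are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Generic top-2 mixture-of-experts feedforward layer (illustrative only)."""
    def __init__(self, d_model: int = 64, n_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.router(x)                # (tokens, n_experts)
        weights, idx = logits.topk(2, dim=-1)  # route each token to its top-2 experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(2):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

y = TinyMoE()(torch.randn(10, 64))  # 10 tokens, d_model=64
```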
llm_large_context
Large Sequence Modeling with Transformers
explain-then-translate
Official repo for EMNLP 2023 paper "Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations"
pytorch_linear_rnn
Implementations of various linear RNN layers using PyTorch and Triton
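The common core of these layers is a linear (gate-free) recurrence, which is what makes parallel-scan Triton kernels possible. A minimal sequential PyTorch sketch of the diagonal case; shapes and the decay parameterization are assumptions:

```python
import torch

def linear_rnn(x: torch.Tensor, decay: torch.Tensor) -> torch.Tensor:
    """Diagonal linear RNN: h_t = decay * h_{t-1} + x_t, with no nonlinearity
    inside the recurrence. Because the update is linear and associative, it can
    also be computed with a parallel scan, which Triton kernels exploit."""
    T, D = x.shape
    h = torch.zeros(D)
    out = []
    for t in range(T):
        h = decay * h + x[t]
        out.append(h)
    return torch.stack(out)  # (T, D)

x = torch.randn(16, 8)
decay = torch.sigmoid(torch.randn(8))  # per-channel decay in (0, 1)
y = linear_rnn(x, decay)
```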
Logical-and-abstract-reasoning
Evaluation on Logical Reasoning and Abstract Reasoning Challenges