James's repositories
nanoTransformer
A PyTorch-based repository featuring an efficiently implemented Transformer model. The core of its attention mechanism is powered by torch.einsum, keeping the tensor operations clean, readable, and highly optimized.
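As an illustration of the einsum-driven attention the blurb describes, here is a minimal sketch of scaled dot-product attention written with torch.einsum; the function name and tensor shapes are illustrative assumptions, not taken from the repository.

```python
import torch
import torch.nn.functional as F

def einsum_attention(q, k, v):
    """Scaled dot-product attention expressed with torch.einsum.

    q, k, v: tensors of shape (batch, heads, seq_len, head_dim).
    """
    scale = q.shape[-1] ** -0.5
    # Attention scores: contract query and key over head_dim.
    scores = torch.einsum("bhqd,bhkd->bhqk", q, k) * scale
    weights = F.softmax(scores, dim=-1)
    # Weighted sum of values: contract over the key positions.
    return torch.einsum("bhqk,bhkd->bhqd", weights, v)

q = k = v = torch.randn(2, 4, 16, 32)
print(einsum_attention(q, k, v).shape)  # torch.Size([2, 4, 16, 32])
```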
amago
A simple and scalable agent for training adaptive policies with sequence-based RL
anthropic-cookbook
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
autogen
Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ
devika
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
litgpt
Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.
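Since the blurb mentions LoRA fine-tuning, a minimal sketch of a LoRA-augmented linear layer may help; the class name, rank, and scaling below are illustrative assumptions, not litgpt's implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank (LoRA) update."""

    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():
            p.requires_grad_(False)  # pretrained weights stay frozen
        # Only the low-rank factors A and B are trained.
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        # y = Wx + (alpha / rank) * B @ A @ x
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(128, 128)
print(layer(torch.randn(4, 128)).shape)  # torch.Size([4, 128])
```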
LLaMA-Factory
Unify Efficient Fine-tuning of 100+ LLMs
llm-foundry
LLM training code for MosaicML foundation models
llm.c
LLM training in simple, raw C/CUDA
LLMLingua
Compresses prompts and the KV-cache to speed up LLM inference and sharpen the model's perception of key information, achieving up to 20x compression with minimal performance loss.
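A usage sketch, assuming the PromptCompressor interface from LLMLingua's README; exact argument names and return keys may differ across versions.

```python
# Assumed interface: PromptCompressor and compress_prompt follow the
# LLMLingua README; argument names and return keys may vary by version.
from llmlingua import PromptCompressor

# Illustrative inputs; in practice, long retrieved documents.
context = ["Document one ...", "Document two ..."]

compressor = PromptCompressor()  # loads a small LM to score token importance
result = compressor.compress_prompt(
    context,
    instruction="Answer the question using the context.",
    question="What does document one say?",
    target_token=200,  # rough budget for the compressed prompt
)
print(result["compressed_prompt"])
```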
long-range-arena
Long Range Arena for Benchmarking Efficient Transformers
OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
routerbench
Code for the paper "ROUTERBENCH: A Benchmark for Multi-LLM Routing System"
ScaleLLM
A high-performance inference system for large language models, designed for production environments.
SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
Time-LLM
[ICLR 2024] Official implementation of "Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Transformers_Are_What_You_Dont_Need
A repository arguing that Transformers underperform on time-series forecasting and showcasing state-of-the-art non-Transformer models.