Oz's repositories
ozzie00.github.io
Ozzie Zhang's website
act-plus-plus
Imitation Learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
Awesome-LLM-Reasoning
Collection of papers and resources on Reasoning in Large Language Models (LLMs), including Chain-of-Thought, Instruction-Tuning, and Multimodality.
Awesome-LLM-Robotics
A comprehensive list of papers using large language/multi-modal models for Robotics/RL, including papers, code, and related websites
Awesome-Multimodal-Large-Language-Models
✨✨ Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
Awesome-Open-Vocabulary
A Survey on Open Vocabulary Learning
Awesome-Reasoning-Foundation-Models
✨✨ Latest Papers and Benchmarks in Reasoning with Foundation Models
ColossalAI
Making large AI models cheaper, faster and more accessible
daam
Diffusion attentive attribution maps for interpreting Stable Diffusion.
dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
E2B
Cloud Runtime for AI Agents
Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
HALOs
A library with extensible implementations of DPO, KTO, PPO, and other human-centered loss functions (HALOs).
LayerDiffusion
Transparent Image Layer Diffusion using Latent Transparency
lit-llam
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
marl-book_codebase
MARL-Book's official codebase repo. www.marl-book.com
Megatron-LM
Ongoing research training transformer models at scale
MIC
MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU
ml-aim
This repository provides the code and model checkpoints of the research paper: Scalable Pre-training of Large Autoregressive Image Models
mm-cot
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)
MOSS-RLHF
MOSS-RLHF
octo
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
Theseus
Theseus is a modern OS written from scratch in Rust that explores intralingual design: closing the semantic gap between compiler and hardware by maximally leveraging the power of language safety and affine types. Theseus aims to shift OS responsibilities like resource management into the compiler.
threestudio
A unified framework for 3D content generation.
visualnav-transformer
Official code and checkpoint release for "ViNT: A Foundation Model for Visual Navigation".