Haoran Duan's repositories
Adala
Adala: Autonomous DAta (Labeling) Agent framework
ArXivChatGuru
Use ArXiv ChatGuru to talk to research papers. This app uses LangChain, OpenAI, Streamlit, and Redis as a vector database/semantic cache.
ChatDev
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
CogVLM
A state-of-the-art open visual language model
Dataset-Diffusion
Dataset Diffusion: Diffusion-based Synthetic Data Generation for Pixel-Level Semantic Segmentation (NeurIPS 2023)
deep-chat
Fully customizable AI chat component for your website
DeepSpeedExamples
Example models using DeepSpeed
DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
efficientvit
EfficientViT is a new family of vision models for efficient high-resolution vision.
Eureka
Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models"
fast-DiT
Fast Diffusion Models with Transformers
FoodSAM
FoodSAM: Any Food Segmentation
GenSim
GenSim: Generating Robotic Simulation Tasks via Large Language Models
groundingLMM
Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
idify
Make ID photos right in the browser.
MasQCLIP
(ICCV 2023) MasQCLIP for Open-Vocabulary Universal Image Segmentation
MosaicFusion
MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation
openpilot
openpilot is an open source driver assistance system. openpilot performs the functions of Automated Lane Centering and Adaptive Cruise Control for 250+ supported car makes and models.
OutfitAnyone
Outfit Anyone: Ultra-high quality virtual try-on for Any Clothing and Any Person
panacea
[CVPR 2024] Official Repository of Paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving"
rich-text-to-image
Rich-Text-to-Image Generation
sd-webui-EasyPhoto
📷 EasyPhoto | Your Smart AI Photo Generator.
TimeLlama
The official repo of TimeLlama, an instruction-finetuned Llama2 series that improves complex temporal reasoning ability.
Tracking-Anything-with-DEVA
[ICCV 2023] Tracking Anything with Decoupled Video Segmentation
vditor
♏ An in-browser Markdown editor that supports WYSIWYG (rich text), instant rendering (Typora-like), and split-view preview modes.
waymax
A JAX-based simulator for autonomous driving research.
WebODM
User-friendly, commercial-grade software for processing aerial imagery. 🛩