James Chang's repositories
anomalib
An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.
anything-llm
Open-source ChatGPT experience for both open and closed source LLMs, embedders, and vector databases. Unlimited documents, messages, and concurrent users with permission management in one app. 👉 Desktop app beta: https://mintplexlabs.typeform.com/to/sFgD2TIb
chroma
the AI-native open-source embedding database
ColossalAI
Making large AI models cheaper, faster and more accessible
contrastors
Train Models Contrastively in PyTorch
flash-linear-attention
Fast implementations of causal linear attention for autoregressive language modeling (PyTorch)
gpu-docker-api
An easier alternative to K8s for raising or lowering the number of GPUs in a Docker container and scaling volume capacity.
LanguageBind
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
lightning-attention
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Linly-Talker
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬
LLaMA-Pro
Progressive LLaMA with Block Expansion.
lobe-chat
🤖 Lobe Chat - an open-source, high-performance chatbot framework that supports speech synthesis, multimodal, and extensible Function Call plugin system. Supports one-click free deployment of your private ChatGPT/LLM web application.
LongAlign
LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation
ml-aim
This repository provides the code and model checkpoints of the research paper: Scalable Pre-training of Large Autoregressive Image Models
mllm-perceptual-limitation
Code and data for paper 'Exploring Perceptual Limitation of Multimodal Large Language Models'
MobileAgent
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
OLMo
Modeling, training, eval, and inference code for OLMo
PFLlib
Personalized federated learning simulation platform with non-IID and unbalanced datasets
QAnything
Question and Answer based on Anything.
SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
search_with_lepton
Building a quick conversation-based search demo with Lepton AI.
Vary
Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
Video-LLaVA
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
whisper_dictation
Fast! Offline, privacy-focused, hands-free voice typing, 2-way AI voice chat, with images, voice control, in under 4 GiB of VRAM.