husonchen

Huson CHEN's starred repositories

photoshot

An open-source AI avatar generator web app - https://photoshot.app

Language: TypeScript · License: MIT · 346000

anomalydiffusion

[AAAI 2024] AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

Language: Jupyter Notebook · License: MIT · 13700

VITON-HD

Official PyTorch implementation of "VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization" (CVPR 2021)

Language: Python · License: NOASSERTION · 80200

flux

Official inference repo for FLUX.1 models

Language: Python · License: Apache-2.0 · 1204000

MimicBrush

Official implementations for paper: Zero-shot Image Editing with Reference Imitation

Language: Python · License: Apache-2.0 · 104300

DDColor

[ICCV 2023] DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders

Language: Python · License: Apache-2.0 · 100300

OMG

[ECCV 2024] OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models

Language: Python · 60700

AnyControl

[ECCV 2024] AnyControl, a multi-control image synthesis model that supports any combination of user-provided control signals and generates natural, harmonious results from multiple controls.

Language: Python · License: MIT · 10000

AnyDoor

Official implementations for paper: AnyDoor: Zero-shot Object-level Image Customization

Language: Python · License: MIT · 389400

MS-Diffusion

Official implementation of MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance

Language: Python · License: MIT · 15300

CatVTON

CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simplified Inference (< 8G VRAM for 1024×768 resolution).

Language: Python · License: NOASSERTION · 61800

IMAGDressing

👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing

Language: Python · License: Apache-2.0 · 90200

Cradle

The Cradle framework is a first attempt at General Computer Control (GCC). Cradle helps agents ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, in a standardized general environment with minimal requirements.

Language: Python · License: MIT · 157500

DINOv

[CVPR 2024] Official implementation of the paper "Visual In-context Learning"

Language: Python · 35400

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python · License: Apache-2.0 · 1893200

Oscar

Oscar and VinVL

Language: Python · License: MIT · 103500

ToolBench

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Language: Python · License: Apache-2.0 · 471200

ms-swift

Use PEFT or Full-parameter to finetune 300+ LLMs or 60+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Language: Python · License: Apache-2.0 · 308000

MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Language: Python · License: Apache-2.0 · 1115000

typesense

Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

Language: C++ · License: GPL-3.0 · 2001100

Langchain-Chatchat

Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and Llama) RAG and Agent app with langchain

Language: TypeScript · License: Apache-2.0 · 3089200

awesome-industrial-anomaly-detection

Paper list and datasets for industrial image anomaly/defect detection (continuously updated).

134400

mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family

Language: Python · License: MIT · 219200

InternLM

Official release of InternLM2.5 base and chat models. 1M context support

Language: Python · License: Apache-2.0 · 614400

nxtp

Object Recognition as Next Token Prediction (CVPR 2024)

Language: Python · License: NOASSERTION · 14700

Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Language: Python · License: Apache-2.0 · 274800

BlossomLM

A Chinese-English bilingual conversational large language model

Language: Python · License: Apache-2.0 · 12900

ml-4m

4M: Massively Multimodal Masked Modeling

Language: Python · License: Apache-2.0 · 152500