Huson CHEN (husonchen)

Company: The Hong Kong University of Science and Technology

Huson CHEN's starred repositories

photoshot

An open-source AI avatar generator web app - https://photoshot.app

Language: TypeScript | License: MIT | Stargazers: 3460 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 2887 | Issues: 0

anomalydiffusion

[AAAI 2024] AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

Language: Jupyter Notebook | License: MIT | Stargazers: 137 | Issues: 0

VITON-HD

Official PyTorch implementation of "VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization" (CVPR 2021)

Language: Python | License: NOASSERTION | Stargazers: 802 | Issues: 0
Language: Python | Stargazers: 8 | Issues: 0

flux

Official inference repo for FLUX.1 models

Language: Python | License: Apache-2.0 | Stargazers: 12040 | Issues: 0

MimicBrush

Official implementations for paper: Zero-shot Image Editing with Reference Imitation

Language: Python | License: Apache-2.0 | Stargazers: 1043 | Issues: 0

DDColor

[ICCV 2023] DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders

Language: Python | License: Apache-2.0 | Stargazers: 1003 | Issues: 0

OMG

[ECCV 2024] OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models

Language: Python | Stargazers: 607 | Issues: 0

AnyControl

[ECCV 2024] AnyControl, a multi-control image synthesis model that supports any combination of user-provided control signals and generates natural, harmonious results from multiple controls.

Language: Python | License: MIT | Stargazers: 100 | Issues: 0

AnyDoor

Official implementations for paper: Anydoor: zero-shot object-level image customization

Language: Python | License: MIT | Stargazers: 3894 | Issues: 0

MS-Diffusion

Official implementation of MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance

Language: Python | License: MIT | Stargazers: 153 | Issues: 0

CatVTON

CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simplified Inference (< 8G VRAM for 1024X768 resolution).

Language: Python | License: NOASSERTION | Stargazers: 618 | Issues: 0

IMAGDressing

👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing

Language: Python | License: Apache-2.0 | Stargazers: 902 | Issues: 0

Cradle

The Cradle framework is a first attempt at General Computer Control (GCC). Cradle helps agents ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, in a standardized general environment with minimal requirements.

Language: Python | License: MIT | Stargazers: 1575 | Issues: 0

DINOv

[CVPR 2024] Official implementation of the paper "Visual In-context Learning"

Language: Python | Stargazers: 354 | Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 18932 | Issues: 0

Oscar

Oscar and VinVL

Language: Python | License: MIT | Stargazers: 1035 | Issues: 0

ToolBench

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Language: Python | License: Apache-2.0 | Stargazers: 4712 | Issues: 0

ms-swift

Use PEFT or Full-parameter to finetune 300+ LLMs or 60+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Language: Python | License: Apache-2.0 | Stargazers: 3080 | Issues: 0

MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Language: Python | License: Apache-2.0 | Stargazers: 11150 | Issues: 0

typesense

Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

Language: C++ | License: GPL-3.0 | Stargazers: 20011 | Issues: 0

Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a local knowledge-based RAG and Agent application built with Langchain and language models such as ChatGLM, Qwen, and Llama

Language: TypeScript | License: Apache-2.0 | Stargazers: 30892 | Issues: 0

awesome-industrial-anomaly-detection

Paper list and datasets for industrial image anomaly/defect detection (updating).

Stargazers: 1344 | Issues: 0

mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family

Language: Python | License: MIT | Stargazers: 2192 | Issues: 0

InternLM

Official release of InternLM2.5 base and chat models. 1M context support

Language: Python | License: Apache-2.0 | Stargazers: 6144 | Issues: 0

nxtp

Object Recognition as Next Token Prediction (CVPR 2024)

Language: Python | License: NOASSERTION | Stargazers: 147 | Issues: 0

Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Language: Python | License: Apache-2.0 | Stargazers: 2748 | Issues: 0

BlossomLM

A Chinese-English bilingual conversational large language model

Language: Python | License: Apache-2.0 | Stargazers: 129 | Issues: 0

ml-4m

4M: Massively Multimodal Masked Modeling

Language: Python | License: Apache-2.0 | Stargazers: 1525 | Issues: 0