Beast code in Giters

Kai Jin's starred repositories

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal chat model with performance approaching GPT-4o.

Language: Python · License: MIT · 4332 · 0 · 0

EVA

EVA Series: Visual Representation Fantasies from BAAI

Language: Python · License: MIT · 2132 · 0 · 0

TPD

This is the official repository for the paper "Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On". CVPR 2024

Language: Python · 67 · 0 · 0

ShiArthur03

Language: MATLAB · License: GPL-3.0 · 10141 · 0 · 0

Kolors

Kolors Team

Language: Python · License: Apache-2.0 · 2632 · 0 · 0

modelscope-classroom

Language: Jupyter Notebook · License: Apache-2.0 · 219 · 0 · 0

MG-LLaVA

Official repository for the paper "MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning" (https://arxiv.org/abs/2406.17770).

Language: Python · License: Apache-2.0 · 118 · 0 · 0

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python · License: Apache-2.0 · 1582 · 0 · 0

VisionLLM

VisionLLM Series

Language: Python · License: Apache-2.0 · 754 · 0 · 0

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python · License: Apache-2.0 · 15146 · 0 · 0

DressCode

DressCode: Autoregressively Sewing and Generating Garments from Text Guidance.

93 · 0 · 0

HunyuanDiT

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Language: Python · License: NOASSERTION · 2841 · 0 · 0

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python · License: Apache-2.0 · 8044 · 0 · 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python · License: Apache-2.0 · 18304 · 0 · 0

InstantID

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Language: Python · License: Apache-2.0 · 10553 · 0 · 0

LWM

Language: Python · License: Apache-2.0 · 7029 · 0 · 0

GPT-SoVITS

1 min of voice data can also be used to train a good TTS model! (few-shot voice cloning)

Language: Python · License: MIT · 29601 · 0 · 0

StableVITON

[CVPR2024] StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On

Language: Python · 859 · 0 · 0

IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.

Language: Jupyter Notebook · License: Apache-2.0 · 4660 · 0 · 0

open_clip

An open source implementation of CLIP.

Language: Python · License: NOASSERTION · 9296 · 0 · 0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python · License: NOASSERTION · 4399 · 0 · 0

magic-animate

[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

Language: Python · License: BSD-3-Clause · 10205 · 0 · 0

FaceStudio

Put Your Face Everywhere in Seconds.

License: Apache-2.0 · 308 · 0 · 0

cross_modal_adaptation

Cross-modal few-shot adaptation with CLIP

Language: Python · License: MIT · 293 · 0 · 0

Yi

A series of large language models trained from scratch by developers @01-ai

Language: Python · License: Apache-2.0 · 7505 · 0 · 0

PSGAN

PyTorch code for "PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer" (CVPR 2020 Oral)

Language: Python · License: MIT · 713 · 0 · 0

Deep3DFaceRecon_pytorch

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019). A PyTorch implementation.

Language: Python · License: MIT · 1609 · 0 · 0

generative-models

Generative Models by Stability AI

Language: Python · License: MIT · 23417 · 0 · 0

sd-webui-EasyPhoto

📷 EasyPhoto | Your Smart AI Photo Generator.

Language: Python · License: Apache-2.0 · 4780 · 0 · 0

faceswap

Deepfakes Software For All

Language: Python · License: GPL-3.0 · 49947 · 0 · 0