YuxiangJohn

YuxiangJohn's starred repositories

metahuman-stream

Real-time interactive streaming digital human

Language: Python | Apache-2.0 | 245100

flux

Official inference repo for FLUX.1 models

Language: Python | Apache-2.0 | 293100

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | Apache-2.0 | 813600

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python | Apache-2.0 | 369200

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | NOASSERTION | 526700

X-Pose

[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"

Language: Python | NOASSERTION | 33200

LLM4GEN

3000

BiRefNet

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation

Language: Python | MIT | 35600

MInference

Speeds up long-context LLM inference by computing attention approximately with dynamic sparsity, which reduces pre-filling latency by up to 10x on an A100 while maintaining accuracy.

Language: Python | MIT | 60900

ShiArthur03

Language: MATLAB | GPL-3.0 | 1031200

DiffSynth-Studio

Enjoy the magic of Diffusion models!

Language: Python | Apache-2.0 | 604900

GlueGen

Language: Python | Apache-2.0 | 5700

LivePortrait

Bring portraits to life!

Language: Python | NOASSERTION | 930900

GlyphDraw2

GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models

Language: Python | MIT | 1900

LLM101n

LLM101n: Let's build a Storyteller

2648600

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python | Apache-2.0 | 162300

BELLE

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

Language: HTML | Apache-2.0 | 776200

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal chat model with performance approaching GPT-4o.

Language: Python | MIT | 459200

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | Apache-2.0 | 2838300

swift

ms-swift: Use PEFT or Full-parameter to finetune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Language: Python | Apache-2.0 | 270600