YouTaoBaBa's starred repositories

Euler-Smea-Dyn-Sampler

A sampler based on Euler, aimed at generating better pictures.

Language: Python | License: Apache-2.0 | Stargazers: 158 | Issues: 0

bypy

Python client for Baidu Yun (Personal Cloud Storage), also known as Baidu Netdisk.

Language: Python | License: MIT | Stargazers: 7836 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 386 | Issues: 0

HivisionIDPhotos

⚡️HivisionIDPhotos: a lightweight and efficient AI tool for making ID photos.

Language: Python | License: Apache-2.0 | Stargazers: 10487 | Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | Stargazers: 13615 | Issues: 0

ChineseNLPCorpus

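Chinese natural language processing datasets, collected as material for everyday experiments. Additions and pull requests are welcome.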

Language: Python | Stargazers: 4257 | Issues: 0

CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

Language: Python | License: Apache-2.0 | Stargazers: 2022 | Issues: 0

RB-Modulation

Official code for "RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control"

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 322 | Issues: 0

CatVTON

CatVTON is a simple and efficient virtual try-on diffusion model with 1) a lightweight network (899.06M parameters in total), 2) parameter-efficient training (49.57M trainable parameters), and 3) simplified inference (< 8G VRAM at 1024×768 resolution).

Language: Python | License: NOASSERTION | Stargazers: 789 | Issues: 0

Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.

Language: Python | License: Apache-2.0 | Stargazers: 2433 | Issues: 0

MuseTalk

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

Language: Python | License: NOASSERTION | Stargazers: 2509 | Issues: 0

CSGO

CSGO: Content-Style Composition in Text-to-Image Generation 🔥

Language: Jupyter Notebook | Stargazers: 227 | Issues: 0

GeoWizard

[ECCV'24] GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image

Language: Python | Stargazers: 721 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 131 | Issues: 0

Paints-UNDO

Understand Human Behavior to Align True Needs

Language: Python | License: Apache-2.0 | Stargazers: 3313 | Issues: 0

SimpleTuner

A general fine-tuning kit geared toward diffusion models.

Language: Python | License: AGPL-3.0 | Stargazers: 1606 | Issues: 0

MindChat

🐋MindChat (漫谈): a large language model for mental health. Talk freely about life's road and face its hardships with a smile.

Language: Python | License: GPL-3.0 | Stargazers: 584 | Issues: 0

CLUECorpus2020

Large-scale pre-training corpus for Chinese (100 GB).

License: MIT | Stargazers: 914 | Issues: 0

VADER

Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models, such as VideoCrafter, OpenSora, ModelScope, and StableVideoDiffusion, by fine-tuning them with various reward models, such as HPS, PickScore, VideoMAE, VJEPA, YOLO, and Aesthetics.

Language: Python | Stargazers: 197 | Issues: 0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | License: MIT | Stargazers: 11292 | Issues: 0

MoneyPrinterTurbo

Generate high-definition short videos with one click using large AI models.

Language: Python | License: MIT | Stargazers: 16331 | Issues: 0

UniPortrait

UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalizations

License: Apache-2.0 | Stargazers: 168 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 1433 | Issues: 0

InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Language: Python | License: Apache-2.0 | Stargazers: 1331 | Issues: 0

MimicMotion

High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance

Language: Python | License: NOASSERTION | Stargazers: 1677 | Issues: 0

VILA

VILA: a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops).

Language: Python | License: Apache-2.0 | Stargazers: 1847 | Issues: 0

InvokeAI

InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI, supports terminal use through a CLI, and serves as the foundation for multiple commercial products.

Language: TypeScript | License: Apache-2.0 | Stargazers: 23085 | Issues: 0

flux

Application Architecture for Building User Interfaces

Language: JavaScript | License: NOASSERTION | Stargazers: 17450 | Issues: 0

MusePose

MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation

Language: Python | License: NOASSERTION | Stargazers: 2146 | Issues: 0

ToonCrafter

[SIGGRAPH Asia 2024, Journal Track] ToonCrafter: Generative Cartoon Interpolation

Language: Python | License: Apache-2.0 | Stargazers: 5187 | Issues: 0