svjack's repositories
1Prompt1Story
(ICLR 2025) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
CartoonSegmentation
Instance segmentation for cartoon/anime characters and some visual techniques built around it.
DiffusionAsShader
[arXiv 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
FastVideo
FastVideo is a lightweight framework for accelerating large video diffusion models.
HunyuanVideo-I2V
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
HunyuanVideoGP
HunyuanVideo GP: Large Video Generation Model - GPU Poor version
joycaption
JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training Diffusion models.
leapfusion-hunyuan-image2video
A novel approach to Hunyuan image-to-video sampling
Light-A-Video
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion
LTX-Video
Official repository for LTX-Video
magi
Generate a transcript for your favourite Manga: Detect manga characters, text blocks and panels. Order panels. Cluster characters. Match texts to their speakers. Perform OCR.
midi-model
MIDI event transformer for symbolic music generation
MotionClone
[ICLR 2025] Official implementation of MotionClone: Training-Free Motion Cloning for Controllable Video Generation
musiclang_predict
AI prediction API of the MusicLang package
Show-o
[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
SkyReels-V1
SkyReels V1: The first and most advanced open-source human-centric video foundation model
star-vector
StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and textual inputs to produce high-quality SVG code with remarkable precision.
VideoModelStudio
Gradio webapp to train AI Video models using Finetrainers
Wan2GP
Wan 2.1 for the GPU Poor
YuE
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open