BeanDAO's repositories
anynode
A Node for ComfyUI that does what you ask it to do
basic-pitch
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
ChatTTS
ChatTTS is a generative speech model for daily dialogue.
CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
ComfyUI-DynamiCrafterWrapper
Wrapper to use DynamiCrafter models in ComfyUI
ComfyUI_MagicClothing
Unofficial ComfyUI implementation of Magic Clothing
CosyVoice-ComfyUI
A ComfyUI custom node for CosyVoice
cube
📊 Cube — The Semantic Layer for Building Data Applications
draw-things-community
The community repository for the Draw Things app.
FunClip
Open-source, accurate, and easy-to-use video clipping tool with integrated LLM-based AI clipping
GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
marker
Convert PDF to markdown quickly with high accuracy
MimicBrush
Official implementations for paper: Zero-shot Image Editing with Reference Imitation
MOFA-Video
Official PyTorch implementation for MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
MusePose
MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation
MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
Omost
Your image is almost there!
OpenUtau
Open singing synthesis platform / Open source UTAU successor
pykan
Kolmogorov Arnold Networks
pyvideotrans
Translate a video from one language to another and add dubbing.
SEED-Story
SEED-Story: Multimodal Long Story Generation with Large Language Model
ShareGPT4Video
An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
StyleFeatureEditor
Official Implementation for "The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing"
supersonic
SuperSonic is the next-generation BI platform that integrates Chat BI (powered by LLM) and Headless BI (powered by semantic layer) paradigms.
ToonCrafter
a research paper for generative cartoon interpolation
V-Express
V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.
vanna
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.