KMedia's repositories
CodeFormer
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
CoreML-Models
Converted CoreML Model Zoo.
ProPainter
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
AnimateDiff
Official implementation of AnimateDiff.
Applio
Ultimate voice cloning tool, meticulously optimized for unrivaled power, modularity, and user-friendly experience.
Awesome-GitHub-Repo
A curated collection of high-quality, interesting open-source projects on GitHub.
bark
🔊 Text-Prompted Generative Audio Model
ComfyUI
The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.
facechain
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
facefusion
Next generation face swapper and enhancer
ffmpeg-apple-arm64-build
Build script for ffmpeg targeting the latest open source video codecs running on macOS using Apple's M1 processor.
freeswitch
FreeSWITCH is a Software Defined Telecom Stack enabling the digital transformation from proprietary telecom switches to a versatile software implementation that runs on any commodity hardware. From a Raspberry Pi to a multi-core server, FreeSWITCH can unlock the telecommunications potential of any device.
GeneFace
GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code
GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
image-colorization-api
Image Colorization Service using Deep Learning is a repository that provides an API for colorizing black and white images using U-Net and conditional GAN models trained on the COCO dataset, with support for batch processing, dataset expansion, model experiments, and efficient inference using ONNX format.
InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
lite.ai.toolkit
🛠 A lite C++ toolkit of awesome AI models with ONNXRuntime, NCNN, MNN and TNN. YOLOv5, YOLOX, YOLOP, YOLOv6, YOLOR, MODNet, YOLOv7, YOLOv8.
magic-animate
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
media-server-1
librtsp/librtmp/libmpeg/libhls/librtp
netron
Visualizer for neural network, deep learning and machine learning models
OpenVoice
Instant voice cloning by MyShell.
Real-ESRGAN
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
RealtimeSTT
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription. Designed for real-time applications like voice assistants.
stable-diffusion
A latent text-to-image diffusion model
stable-diffusion.cpp
Stable Diffusion in pure C/C++
TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
XRG
System monitor for macOS.