splinter21's repositories

Framer

Official PyTorch implementation of "Framer: Interactive Frame Interpolation".

Stargazers: 4 | Issues: 0

arxiv-translate-fix

arXiv translation fixer!

Stargazers: 0 | Issues: 0

breath-removal

Detect and remove or lower the volume of breathing in speech recordings.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

CELSDS

A Chinese Expressive Long-dialogue Speech Dataset with Scripts

Stargazers: 0 | Issues: 0

Cosmos-Tokenizer

A suite of image and video neural tokenizers

License: Apache-2.0 | Stargazers: 0 | Issues: 0

echomimic_v2

EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

FastVideo

FastVideo is an open-source framework for accelerating large video diffusion models.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

fsspec_disk

Universal hard drive!

Language: Python | Stargazers: 0 | Issues: 0

g2pW-Cantonese

Cantonese Grapheme-to-Phoneme Converter based on GitYCC/g2pW

License: Apache-2.0 | Stargazers: 0 | Issues: 0

genmoai-models

The best OSS video generation models

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

GetQzonehistory

Retrieve the history of posts published on Qzone (QQ空间).

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 0

GIMM-VFI

[NeurIPS 2024] Generalizable Implicit Motion Modeling for Video Frame Interpolation

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

hertz-dev

First base model for full-duplex conversational audio

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

LTX-Video

Official repository for LTX-Video

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

Neural-Codec-and-Speech-Language-Models

Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models

License: MIT | Stargazers: 0 | Issues: 0

py2many

Transpiler of Python to many other languages

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

REAL-Video-Enhancer

Interpolate and Upscale easily on Linux/Windows.

Language: Python | License: AGPL-3.0 | Stargazers: 0 | Issues: 0

SimVQ

SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer

License: MIT | Stargazers: 0 | Issues: 0

TTSAudioNormalizer

TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loudness normalization operations.

Stargazers: 0 | Issues: 0

vec2wav2.0

Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 0

WavChat

A Survey of Spoken Dialogue Models (60 pages)

Stargazers: 0 | Issues: 0

WaveFM

WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching

Language: Python | Stargazers: 0 | Issues: 0