i-MaTh

User data from GitHub: https://github.com/i-MaTh

Company: East China Normal University

Location: Shanghai

GitHub: @i-MaTh

i-MaTh's repositories

Algorithm

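Implementations of some commonly used algorithms (covering common data structures, machine learning, and algorithms commonly used in speech recognition)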

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0

Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

city_json

** city JSON, plus Hong Kong, Macau, Taiwan, and world city JSON

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

cleanrl

High-quality single-file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

cs-self-learning

A self-learning guide to computer science

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

CVQ-VAE

[ICCV 2023] Online Clustered Codebook

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

dclm

DataComp for Language Models

Language: HTML | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

flux

Official inference repo for FLUX.1 models

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

friendly-stable-audio-tools

Refactored / updated version of `stable-audio-tools`, the open-source code for audio/music generative models originally by Stability AI.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

HiFTNet

HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

lingua

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

License: AGPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

mean-opinion-score

Python library for calculating the mean opinion score and 95% confidence interval of the standard deviation of text-to-speech ratings according to Ribeiro et al. (2011).

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

mini-omni

Open-source multimodal large language model that can hear and talk while thinking. Features real-time end-to-end speech input and streaming audio output conversational capabilities.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

NCE

Yingshi New Concept English

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

openai-python

The official Python library for the OpenAI API

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

OpenDiloco

OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Parrot-TTS

Official Code for ParrotTTS

Stargazers: 0 | Issues: 0 | Issues: 0

PerceptiveAgent

Code for "Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction" (ACL24)

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

RAVE

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 2 | Issues: 0

speech-resynthesis

An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

spiritlm

Inference code for the paper "Spirit-LM: Interleaved Spoken and Written Language Model".

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

voice-chat-pdf

Use OpenAI's Realtime API for chatting with your documents

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020.

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

WavChat

A Survey of Spoken Dialogue Models (60 pages)

Stargazers: 0 | Issues: 0 | Issues: 0