Nickolay V. Shmyrev (nshmyrev)

Company: Alpha Cephei Inc

Location: Astrakhan, Russia

Home Page: https://alphacephei.com

Nickolay V. Shmyrev's starred repositories

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.

pipecat

Open Source framework for voice and multimodal conversational AI

Language: Python | License: BSD-2-Clause | Stargazers: 1475 | Issues: 13 | Issues: 19

audiolazy

Expressive Digital Signal Processing (DSP) package for Python

Language: Python | License: GPL-3.0 | Stargazers: 683 | Issues: 58 | Issues: 10

AnyGPT

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

Language: Python | License: MIT | Stargazers: 333 | Issues: 18 | Issues: 7

penn

Pitch Estimating Neural Networks (PENN)

Language: Python | License: MIT | Stargazers: 198 | Issues: 9 | Issues: 10

Vach

Real time streaming talking head

Language: Python | Stargazers: 192 | Issues: 0 | Issues: 0

TalkingHead

Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.

Language: JavaScript | License: MIT | Stargazers: 168 | Issues: 1 | Issues: 6

audioset-processing

Toolkit for downloading and processing Google's AudioSet dataset.

Language: Jupyter Notebook | License: MIT | Stargazers: 149 | Issues: 2 | Issues: 6

TextyMcSpeechy

Easily create text-to-speech models in any voice for rhasspy/piper. Make a text-to-speech model with your own voice recordings, or use thousands of RVC voices. Works offline on a Raspberry Pi.

Language: Shell | License: MIT | Stargazers: 149 | Issues: 0 | Issues: 0

ssr_eval

Evaluation and Benchmarking of Speech Super-resolution Methods

timething

Timething is a library for aligning text transcripts with their audio recordings.

Language: Jupyter Notebook | License: MIT | Stargazers: 80 | Issues: 1 | Issues: 21

audio-preprocessing-scripts

Automated dataset creation scripts

Language: Python | License: MIT | Stargazers: 70 | Issues: 3 | Issues: 2

UnitySpeechToText

A native Unity plugin to convert speech to text on Android & iOS

Language: C# | License: MIT | Stargazers: 61 | Issues: 3 | Issues: 3

bandit

BandIt: Cinematic Audio Source Separation

Language: Python | License: Apache-2.0 | Stargazers: 52 | Issues: 0 | Issues: 0

StyleTalk

Official release of StyleTalk dataset.

License: MIT | Stargazers: 42 | Issues: 0 | Issues: 0

FlashSpeech

FlashSpeech: Efficient Zero-Shot Speech Synthesis

Stargazers: 29 | Issues: 0 | Issues: 0

audio_diarization_annotation

Audio Diarization Annotation tool

Language: JavaScript | License: Apache-2.0 | Stargazers: 21 | Issues: 0 | Issues: 0

gazelle-train

Joint speech-language model - respond directly to audio!

Language: Python | License: Apache-2.0 | Stargazers: 21 | Issues: 0 | Issues: 0

nlp-rus-zaliz

Processing the grammar dictionary of A. A. Zaliznyak for morphological inflection

Language: Ada | License: GPL-3.0 | Stargazers: 16 | Issues: 0 | Issues: 0

Lightvoc

LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform

Language: Jupyter Notebook | Stargazers: 15 | Issues: 0 | Issues: 0

MINETrans-IWSLT23

Official implementation of our IWSLT 2023 paper "The MineTrans Systems for IWSLT 2023 Offline Speech Translation and Speech-to-Speech Translation Tasks"

Language: Python | Stargazers: 14 | Issues: 3 | Issues: 0

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 12 | Issues: 3 | Issues: 0

supervoice-enhance

Supervoice diffusion enhance

Language: Jupyter Notebook | Stargazers: 9 | Issues: 0 | Issues: 0

minTorToiSe

A minimal PyTorch re-implementation of TorToiSe-tts inference

Language: Python | License: AGPL-3.0 | Stargazers: 6 | Issues: 6 | Issues: 1

Language: Python | Stargazers: 4 | Issues: 0 | Issues: 0

StreamVC

An unofficial PyTorch implementation of "StreamVC: Real-Time Low-Latency Voice Conversion".

Language: Python | License: MIT | Stargazers: 4 | Issues: 0 | Issues: 0

speech_evaluation

A toolkit dedicated to speech evaluation.

License: Apache-2.0 | Stargazers: 2 | Issues: 0 | Issues: 0

tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

License: Apache-2.0 | Stargazers: 2 | Issues: 0 | Issues: 0