Takaaki Saeki (Takaaki-Saeki)

Company: Google

Location: Tokyo, Japan

Home Page: https://takaaki-saeki.github.io/

Takaaki Saeki's starred repositories

pflowtts_pytorch

Unofficial implementation of NVIDIA P-Flow TTS paper

Language: Python | License: MIT | Stargazers: 200 | Issues: 0

TTS-arxiv-daily

Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)

Language: Python | License: Apache-2.0 | Stargazers: 162 | Issues: 0

PAM

PAM is a no-reference audio quality metric for audio generation tasks

Language: Python | License: MIT | Stargazers: 36 | Issues: 0

LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

Language: Python | Stargazers: 9731 | Issues: 0

pflow-encodec

Implementation of TTS model based on NVIDIA P-Flow TTS Paper

Language: Python | Stargazers: 64 | Issues: 0

DDDM-VC

Official Pytorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)

Language: Python | Stargazers: 157 | Issues: 0

Codec-SUPERB

Audio Codec Speech processing Universal PERformance Benchmark

Language: Python | Stargazers: 190 | Issues: 0

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | License: Apache-2.0 | Stargazers: 11154 | Issues: 0

SpeechGPT

SpeechGPT Series: Speech Large Language Models

Language: Python | License: Apache-2.0 | Stargazers: 1135 | Issues: 0

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 12016 | Issues: 0

DiscreteSpeechMetrics

Reference-aware automatic speech evaluation toolkit

Language: Python | License: MIT | Stargazers: 82 | Issues: 0

self-rewarding-lm-pytorch

Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI

Language: Python | License: MIT | Stargazers: 1276 | Issues: 0
Language: Python | License: MIT | Stargazers: 10 | Issues: 0

AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Language: Python | Stargazers: 545 | Issues: 0

ai-audio-startups

Community list of startups working with AI in audio and music technology

License: Apache-2.0 | Stargazers: 1504 | Issues: 0

PLMpapers

Must-read Papers on pre-trained language models.

License: MIT | Stargazers: 3312 | Issues: 0

Awesome-LLM

Awesome-LLM: a curated list of Large Language Models

License: CC0-1.0 | Stargazers: 16539 | Issues: 0

voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch

Language: Python | License: MIT | Stargazers: 569 | Issues: 0

contentvec

speech self-supervised representations

Language: Python | License: MIT | Stargazers: 439 | Issues: 0

CML-TTS-Dataset

CML-TTS: A Multilingual Dataset for Speech Synthesis

Language: HTML | Stargazers: 28 | Issues: 0

uroman-python

Python wrapper around uroman tokenizer

Language: Nix | Stargazers: 12 | Issues: 0

miipher

Unofficial implementation of miipher

Language: Python | License: MIT | Stargazers: 97 | Issues: 0

vits2_pytorch

unofficial vits2-TTS implementation in pytorch

Language: Python | License: MIT | Stargazers: 470 | Issues: 0

SpeechMOS

Easy-to-Use Speech MOS predictors

Language: Python | License: MIT | Stargazers: 190 | Issues: 0

codellama

Inference code for CodeLlama models

Language: Python | License: NOASSERTION | Stargazers: 15726 | Issues: 0

randomized_positional_encodings

Randomized Positional Encodings Boost Length Generalization of Transformers

Language: Python | License: Apache-2.0 | Stargazers: 75 | Issues: 0

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python | License: MIT | Stargazers: 1057 | Issues: 0

Speech-Prompts-Adapters

This repository surveys papers focusing on prompting and adapters for speech processing.

Stargazers: 102 | Issues: 0

vector-quantize-pytorch

Vector (and Scalar) Quantization, in Pytorch

Language: Python | License: MIT | Stargazers: 2260 | Issues: 0

zm-text-tts

[IJCAI'23] Learning to Speak from Text for Low-Resource TTS

Language: Python | License: Apache-2.0 | Stargazers: 63 | Issues: 0