yhzhouowo (Mortyzhou-Shef-BIT)



Location: UoS -> NUS & BIT

Home Page: https://mortyzaigc.netlify.app/


yhzhouowo's repositories

Awesome-Transformer-Attention

An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites

Stargazers: 1 | Issues: 0

ppg-vc

PPG-Based Voice Conversion

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0

Speech-Resources

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome

License: MIT | Stargazers: 1 | Issues: 0

AudioLDM

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

awesome-embodied-vision

Reading list for research topics in embodied vision

License: MIT | Stargazers: 0 | Issues: 0

Awesome-Multimodal-Research

A curated list of Multimodal Related Research.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

DYGANVC

source code for "DYGAN-VC: IMPROVING SPEECH CONTENT PRESERVATION FOR GAN VOICE CONVERSION USING DYNAMIC CONVOLUTION"

Language: Python | Stargazers: 0 | Issues: 0

speech-synthesis-paper

List of speech synthesis papers.

License: MIT | Stargazers: 0 | Issues: 0

Awesome-Cloud-Edge-AI

A curated list of research in Systems for Edge Intelligence and Computing (Edge MLSys), including frameworks, tools, repositories, etc. Paper notes are also provided.

License: MIT | Stargazers: 0 | Issues: 0

CMU-MultimodalSDK

CMU MultimodalSDK is a machine learning platform for development of advanced multimodal models as well as easily accessing and processing multimodal datasets.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

crank

A toolkit for non-parallel voice conversion based on vector-quantized variational autoencoder

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

dialog_evaluation_paper_list

Dialog Evaluation Paper List: include multiple different dialog tasks

Stargazers: 0 | Issues: 0

diffwave

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

espnet_model_zoo

ESPnet Model Zoo

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

FastVocoder

Include Basis-MelGAN, MelGAN, HifiGAN and Multiband-HifiGAN, maybe NHV in the future.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

gdown

Download a large file from Google Drive (curl/wget fails because of the security notice).

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

HiSD

Official pytorch implementation of paper "Image-to-image Translation via Hierarchical Style Disentanglement" (CVPR 2021 Oral).

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

Pytorch-MBNet

A pytorch implementation of MBNET: MOS PREDICTION FOR SYNTHESIZED SPEECH WITH MEAN-BIAS NETWORK

Language: Python | Stargazers: 0 | Issues: 0

s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

speechmetrics

A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR

Language: Python | Stargazers: 0 | Issues: 0

SpeechTransProgress

Tracking the progress in end-to-end speech translation

License: CC0-1.0 | Stargazers: 0 | Issues: 0

StarGANv2-VC

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

Talking-Face_PC-AVS

Code for Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021)

Language: Python | License: CC-BY-4.0 | Stargazers: 0 | Issues: 0

TalkNet-ASD

ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

tango

Codes and Model of the paper "Text-to-Audio Generation using Instruction Tuned LLM and Latent Diffusion Model"

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

transformers

🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | Stargazers: 0 | Issues: 0

VQMIVC

Official implementation of VQMIVC: One-shot Voice Conversion @ Interspeech 2021

Language: Python | License: MIT | Stargazers: 0 | Issues: 0