gdy1201's repositories

ACNet

ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

annotated_deep_learning_paper_implementations

🧑‍🏫 50! Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

awesome-asr-contextualization

A curated list of awesome papers on contextualizing E2E ASR outputs

License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Awesome-pytorch-list

A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.

Stargazers: 0 | Issues: 0 | Issues: 0

Awesome-Visual-Transformer

A collection of papers on transformers for vision. Awesome Transformer with Computer Vision (CV)

Stargazers: 0 | Issues: 0 | Issues: 0

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques, applied directly to the raw waveform, which further improve model performance and generalization.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

dgc

Dynamic Group Convolution for Accelerating Convolutional Neural Networks (ECCV 2020)

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

end2end-asr-pytorch

End-to-End Automatic Speech Recognition on PyTorch

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

External-Attention-pytorch

🍀 PyTorch implementation of various attention mechanisms, MLP, re-parameterization, and convolution modules, which is helpful for further understanding the papers.⭐⭐⭐

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0
Language: HTML | Stargazers: 0 | Issues: 0 | Issues: 0

LAS_Mandarin_PyTorch

Listen, Attend and Spell model and a pretrained Chinese Mandarin model (Chinese Mandarin ASR model)

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

LVCNet

LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

MQRNN

Multi-Quantile Recurrent Neural Network for Quantile Regression

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0

nlp-tutorial

Natural Language Processing Tutorial for Deep Learning Researchers

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with PyTorch

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

pointer_summarizer

PyTorch implementation of "Get To The Point: Summarization with Pointer-Generator Networks"

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

segan_pytorch

Speech Enhancement Generative Adversarial Network in PyTorch

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

seq2seq

Minimal Seq2Seq model with Attention for Neural Machine Translation in PyTorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

slot_filling_and_intent_detection_of_SLU

Slot filling, intent detection, joint training, ATIS & SNIPS datasets, Facebook's multilingual dataset, MIT corpus, E-commerce Shopping Assistant (ECSA) dataset, CoNLL2003 NER, ELMo, BERT, XLNet

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

SpecAugment

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

SpectralCluster

Python re-implementation of the spectral clustering algorithm in the paper "Speaker Diarization with LSTM"

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Time-Series-Library

A Library for Advanced Deep Time Series Models.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

VGG-Speaker-Recognition

Utterance-level Aggregation For Speaker Recognition In The Wild

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

Stargazers: 0 | Issues: 0 | Issues: 0

wav2vec

A simplified version of wav2vec (1.0, vq, 2.0) in fairseq

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0