gdy1201

User data from GitHub: https://github.com/gdy1201

followers

following

stars

gdy1201's repositories

ACNet

ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks

Language: Python | MIT | 0 | 0 | 0

annotated_deep_learning_paper_implementations

🧑‍🏫 50! Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Jupyter Notebook | MIT | 0 | 0 | 0

awesome-asr-contextualization

A curated list of awesome papers on contextualizing E2E ASR outputs

Apache-2.0 | 0 | 1 | 0

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

Apache-2.0 | 0 | 0 | 0

Awesome-pytorch-list

A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.

0 | 0 | 0

Awesome-Visual-Transformer

A collection of papers about transformers for vision. Awesome Transformer with Computer Vision (CV)

0 | 0 | 0

demand_forecast

0 | 0 | 0

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.

Language: Python | NOASSERTION | 0 | 1 | 0

dgc

Dynamic Group Convolution for Accelerating Convolutional Neural Networks (ECCV 2020)

Language: Python | MIT | 0 | 1 | 0

end2end-asr-pytorch

End-to-End Automatic Speech Recognition on PyTorch

MIT | 0 | 0 | 0

espnet

End-to-End Speech Processing Toolkit

Language: Python | Apache-2.0 | 0 | 1 | 0

External-Attention-pytorch

🍀 PyTorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

Language: Python | MIT | 0 | 1 | 0

FlightDelayPrediction

Language: HTML | 0 | 0 | 0

LAS_Mandarin_PyTorch

Listen, Attend and Spell model and a pretrained Chinese Mandarin model (Chinese Mandarin ASR model)

MIT | 0 | 0 | 0

LVCNet

LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation

Language: Python | Apache-2.0 | 0 | 1 | 0

MQRNN

Multi-Quantile Recurrent Neural Network for Quantile Regression

Language: Jupyter Notebook | 0 | 0 | 0

nlp-tutorial

Natural Language Processing Tutorial for Deep Learning Researchers

Language: Jupyter Notebook | MIT | 0 | 0 | 0

ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with PyTorch

MIT | 0 | 0 | 0

pointer_summarizer

PyTorch implementation of "Get To The Point: Summarization with Pointer-Generator Networks"

Language: Python | Apache-2.0 | 0 | 1 | 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding

MIT | 0 | 0 | 0

pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

MIT | 0 | 0 | 0

segan_pytorch

Speech Enhancement Generative Adversarial Network in PyTorch

MIT | 0 | 0 | 0

seq2seq

Minimal Seq2Seq model with Attention for Neural Machine Translation in PyTorch

Language: Python | MIT | 0 | 0 | 0

slot_filling_and_intent_detection_of_SLU

Slot filling, intent detection, joint training, ATIS & SNIPS datasets, Facebook's multilingual dataset, MIT corpus, E-commerce Shopping Assistant (ECSA) dataset, CoNLL2003 NER, ELMo, BERT, XLNet

Language: Python | Apache-2.0 | 0 | 1 | 0

SpecAugment

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

Language: Python | Apache-2.0 | 0 | 1 | 0

SpectralCluster

Python re-implementation of the spectral clustering algorithm in the paper "Speaker Diarization with LSTM"

Apache-2.0 | 0 | 0 | 0

Time-Series-Library

A Library for Advanced Deep Time Series Models.

MIT | 0 | 0 | 0

VGG-Speaker-Recognition

Utterance-level Aggregation For Speaker Recognition In The Wild

Language: Python | 0 | 1 | 0

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

0 | 0 | 0

wav2vec

A simplified version of wav2vec (1.0, vq, 2.0) in fairseq

Language: Python | 0 | 0 | 0