KiAlexander's starred repositories

google-research

Google Research

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 33143 · Issues: 749 · Issues: 1191

tinygrad

You like pytorch? You like micrograd? You love tinygrad! ❤️

Language: Python · License: MIT · Stargazers: 24533 · Issues: 265 · Issues: 619

speechbrain

A PyTorch-based Speech Toolkit

Language: Python · License: Apache-2.0 · Stargazers: 8045 · Issues: 128 · Issues: 1034

lip-reading-deeplearning

Lip Reading - Cross Audio-Visual Recognition using 3D Architectures

Language: Python · License: Apache-2.0 · Stargazers: 1813 · Issues: 55 · Issues: 38

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Language: Python · License: MIT · Stargazers: 1790 · Issues: 32 · Issues: 159

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.

Language: Python · License: NOASSERTION · Stargazers: 1579 · Issues: 37 · Issues: 149

packnet-sfm

TRI-ML Monocular Depth Estimation Repository

Language: Python · License: MIT · Stargazers: 1199 · Issues: 56 · Issues: 228

k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Language: Cuda · License: Apache-2.0 · Stargazers: 1058 · Issues: 77 · Issues: 369

Res2Net-PretrainedModels

(ImageNet pretrained models) The official PyTorch implementation of the TPAMI paper "Res2Net: A New Multi-scale Backbone Architecture"

audino

Open source audio annotation tool for humans

Language: JavaScript · License: MIT · Stargazers: 1027 · Issues: 24 · Issues: 56

lhotse

Tools for handling speech data in machine learning projects.

Language: Python · License: Apache-2.0 · Stargazers: 877 · Issues: 44 · Issues: 392

transformer

Implementation of the Transformer model (originally from "Attention is All You Need") applied to time series.

Language: Jupyter Notebook · License: GPL-3.0 · Stargazers: 819 · Issues: 15 · Issues: 58

Speech-Transformer

A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.

open-aff

Code and trained models for "Attentional Feature Fusion"

transformer

A PyTorch implementation of "Attention is All You Need" and "Weighted Transformer Network for Machine Translation"

Lipreading_using_Temporal_Convolutional_Networks

ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASSP'20 Lipreading using Temporal Convolutional Networks

Language: Python · License: NOASSERTION · Stargazers: 368 · Issues: 9 · Issues: 63

pika

A lightweight speech processing toolkit based on PyTorch and (Py)Kaldi

Language: Python · License: Apache-2.0 · Stargazers: 338 · Issues: 14 · Issues: 11

pystoi

Python implementation of the Short Term Objective Intelligibility measure

Language: MATLAB · License: MIT · Stargazers: 310 · Issues: 12 · Issues: 19

torchsummaryX

torchsummaryX: Improved visualization tool of torchsummary

MMAL-Net

This is a PyTorch implementation of the paper "Multi-branch and Multi-scale Attention Learning for Fine-Grained Visual Categorization (MMAL-Net)" (Fan Zhang, Meng Li, Guisheng Zhai, Yizhao Liu).

Multi-Scale-1D-ResNet

PyTorch code for a multi-scale 1D ResNet; we hope it will help your research.

Language: Python · License: MIT · Stargazers: 200 · Issues: 2 · Issues: 4

avobjects

Implementation of the ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"

Language: Python · License: MIT · Stargazers: 110 · Issues: 11 · Issues: 9

ConferencingSpeech2021

Conferencing Speech Challenge

Language: Python · License: Apache-2.0 · Stargazers: 87 · Issues: 8 · Issues: 11

active-speakers-context

Code for the Active Speakers in Context Paper (CVPR2020)

Language: Python · License: MIT · Stargazers: 46 · Issues: 3 · Issues: 1

pytorch_complex

A temporal module for PyTorch-ComplexTensor

gan-torch

gan_torch (cpu & gpu)

Language: Jupyter Notebook · Stargazers: 30 · Issues: 1 · Issues: 1

biased_separation

Code for the paper: Unified Gradient Reweighting for Model Biasing with Applications to Source Separation

Language: Python · License: AGPL-3.0 · Stargazers: 14 · Issues: 4 · Issues: 0
Language: Python · Stargazers: 1 · Issues: 0 · Issues: 0