Tanel Alumäe's starred repositories

pytorch-image-models

The largest collection of PyTorch image encoders / backbones, including train, eval, inference, and export scripts and pretrained weights: ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more.

Language: Python | License: Apache-2.0 | Stargazers: 30296 | Issues: 311 | Issues: 875

vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

Language: Python | License: MIT | Stargazers: 18441 | Issues: 144 | Issues: 257

cleanlab

The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Language: Python | License: AGPL-3.0 | Stargazers: 8896 | Issues: 85 | Issues: 344

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8062 | Issues: 128 | Issues: 1034

axolotl

Go ahead and axolotl questions

Language: Python | License: Apache-2.0 | Stargazers: 6456 | Issues: 48 | Issues: 579

AugLy

A data augmentations library for audio, image, text, and video.

Language: Python | License: NOASSERTION | Stargazers: 4911 | Issues: 73 | Issues: 74

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python | License: MIT | Stargazers: 3050 | Issues: 42 | Issues: 188

rebiber

A simple tool to update bib entries with their official information (e.g., DBLP or the ACL Anthology).

Language: Python | License: MIT | Stargazers: 2495 | Issues: 15 | Issues: 29

diart

A Python package to build AI-powered real-time audio applications

Language: Python | License: MIT | Stargazers: 850 | Issues: 20 | Issues: 139

asv-subtools

Open-source tools for speaker recognition

Language: Python | License: Apache-2.0 | Stargazers: 577 | Issues: 20 | Issues: 52

spotty

Training deep learning models on AWS and GCP instances

Language: Python | License: MIT | Stargazers: 493 | Issues: 9 | Issues: 83

Scrapera

A universal package of scraper scripts for humans

Language: Python | License: MIT | Stargazers: 307 | Issues: 10 | Issues: 5

edgedict

Working online speech recognition based on RNN Transducer. (A trained model is available in the releases.)

Language: Python | License: NOASSERTION | Stargazers: 285 | Issues: 27 | Issues: 14

VBx

Variational Bayes HMM over x-vectors diarization

whisper-finetuning

[WIP] Scripts for fine-tuning Whisper

Language: Python | License: MIT | Stargazers: 194 | Issues: 7 | Issues: 19

snowfall

Moved to https://github.com/k2-fsa/icefall

Language: Python | License: Apache-2.0 | Stargazers: 143 | Issues: 35 | Issues: 100

sepia-stt-server

SEPIA server to support open-source speech recognition via WebSocket connection.

Language: Python | License: MIT | Stargazers: 114 | Issues: 12 | Issues: 9

truecase

A Python truecasing utility that restores case information in text

Language: Python | License: Apache-2.0 | Stargazers: 86 | Issues: 5 | Issues: 9

pkwrap

A PyTorch wrapper for LF-MMI training and parallel training in Kaldi

Language: Python | License: NOASSERTION | Stargazers: 72 | Issues: 12 | Issues: 22

hyperion

Python toolkit for speech processing

Language: Python | License: Apache-2.0 | Stargazers: 62 | Issues: 15 | Issues: 2

DCA-PLDA

Discriminative Condition-Aware PLDA

Language: Python | License: NOASSERTION | Stargazers: 40 | Issues: 5 | Issues: 8

kaldi-model-server

Simple Kaldi model server for chain (nnet3) models in online recognition mode directly from a local microphone

Language: JavaScript | License: Apache-2.0 | Stargazers: 34 | Issues: 18 | Issues: 0

segments

Unicode Standard tokenization routines and orthography profile segmentation

Language: Python | License: Apache-2.0 | Stargazers: 29 | Issues: 8 | Issues: 29

bbb-live-subtitles

BBB plugin for automatic subtitles in conference calls

Language: Python | License: Apache-2.0 | Stargazers: 26 | Issues: 20 | Issues: 6

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | Stargazers: 19 | Issues: 1 | Issues: 0

tts_preprocess_et

Estonian text-to-speech transliteration pipeline

Language: Python | License: MIT | Stargazers: 8 | Issues: 6 | Issues: 0

wavemap

🌊 mmap massive audio files as numpy 🌊

Language: Python | License: MIT | Stargazers: 6 | Issues: 5 | Issues: 2

build-pynini-wheels

Build `manylinux2014_x86_64` Python wheels for `pynini`, wrapping all its dependencies. This is a ServiceNow Research project that was started at Element AI.

Language: Dockerfile | License: NOASSERTION | Stargazers: 6 | Issues: 8 | Issues: 4

dolly-fi

Finnish version of databricks-dolly-15k instruction dataset

Language: Python | Stargazers: 3 | Issues: 9 | Issues: 0