Tanel Alumäe's starred repositories

pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language: Python | License: Apache-2.0 | 30296 311 875

vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

Language: Python | License: MIT | 18441 144 257

cleanlab

The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Language: Python | License: AGPL-3.0 | 8896 85 344

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | License: Apache-2.0 | 8062 128 1034

axolotl

Go ahead and axolotl questions

Language: Python | License: Apache-2.0 | 6456 48 579

AugLy

A data augmentations library for audio, image, text, and video.

Language: Python | License: NOASSERTION | 4911 73 74

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python | License: MIT | 3050 42 188

rebiber

A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).

Language: Python | License: MIT | 2495 15 29

diart

A Python package to build AI-powered real-time audio applications

Language: Python | License: MIT | 850 20 139

asv-subtools

Open-source tools for speaker recognition

Language: Python | License: Apache-2.0 | 577 20 52

spotty

Training deep learning models on AWS and GCP instances

Language: Python | License: MIT | 493 9 83

Scrapera

A universal package of scraper scripts for humans

Language: Python | License: MIT | 307 10 5

edgedict

Working online speech recognition based on RNN Transducer. (A trained model is available in the releases.)

Language: Python | License: NOASSERTION | 285 27 14

VBx

Variational Bayes HMM over x-vectors diarization

Language: Python | 238 21 62

whisper-finetuning

[WIP] Scripts for fine-tuning Whisper

Language: Python | License: MIT | 194 7 19

snowfall

Moved to https://github.com/k2-fsa/icefall

Language: Python | License: Apache-2.0 | 143 35 100

sepia-stt-server

SEPIA server to support open-source speech recognition via WebSocket connection.

Language: Python | License: MIT | 114 12 9

truecase

A Python truecasing utility that restores case information for texts

Language: Python | License: Apache-2.0 | 86 5 9

pkwrap

A PyTorch wrapper for LF-MMI training and parallel training in Kaldi

Language: Python | License: NOASSERTION | 72 12 22

hyperion

Python toolkit for speech processing

Language: Python | License: Apache-2.0 | 62 15 2

DCA-PLDA

Discriminative Condition-Aware PLDA

Language: Python | License: NOASSERTION | 40 5 8

kaldi-model-server

Simple Kaldi model server for chain (nnet3) models in online recognition mode directly from a local microphone

Language: JavaScript | License: Apache-2.0 | 34 180

segments

Unicode Standard tokenization routines and orthography profile segmentation

Language: Python | License: Apache-2.0 | 29 8 29

bbb-live-subtitles

BBB plugin for automatic subtitles in conference calls

Language: Python | License: Apache-2.0 | 26 20 6

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | 19 10

S2N-release

Language: Python | 13 2 1

tts_preprocess_et

Estonian text-to-speech transliteration pipeline

Language: Python | License: MIT | 8 60

wavemap

🌊 mmap massive audio files as numpy 🌊

Language: Python | License: MIT | 6 5 2

build-pynini-wheels

Build `manylinux2014_x86_64` Python wheels for `pynini`, wrapping all its dependencies. This is a ServiceNow Research project that was started at Element AI.

Language: Dockerfile | License: NOASSERTION | 6 8 4

dolly-fi

Finnish version of databricks-dolly-15k instruction dataset

Language: Python | 3 90