t13m

t13m's repositories

kaldi-readers-for-tensorflow

Readers that enable reading Kaldi ark files in TensorFlow

Language: C++ 17 5 1

react-pixijs-renderer

Language: TypeScript 1 10

athena

An open-source implementation of a sequence-to-sequence based speech processing engine

Language: Python Apache-2.0 010

athena-decoder

Language: Python Apache-2.0 010

BookContainer

010

chakra-knob

Language: TypeScript 010

DawDreamer

Digital Audio Workstation with Python; VST instruments/effects, parameter automation, FAUST, JAX, Warp Markers, and JUCE processors

Language: C++ GPL-3.0 000

DeepSpeech

A TensorFlow implementation of Baidu's DeepSpeech architecture

Language: C++ MPL-2.0 020

electron-better-ipc

Simplified IPC communication for Electron apps

Language: JavaScript MIT 000

GoBigger

OpenDILab Multi-Agent Environment

Language: Python Apache-2.0 000

kaldi

This is the official location of the Kaldi project.

Language: Shell NOASSERTION 010

LabSound

🔬 🔊 graph-based audio engine

Language: C++ NOASSERTION 000

LatticeCtc

A TensorFlow extension to calculate CTC loss against lattices instead of linear sequences.

Language: C++ 020

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Python BSD-3-Clause 000

leva

🌋 React-first components GUI

Language: TypeScript MIT 000

naudiodon

Node.js stream bindings for PortAudio

Language: C++ Apache-2.0 000

NeuFA

Neural network-based forced alignment with bidirectional attention mechanism

Language: Python 000

node-audio

Graph-based audio API for Node.js based on LabSound and JUCE

Language: C++ 000

OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

Language: Python Apache-2.0 010

pydrobert-kaldi

SWIG bindings for Kaldi I/O, built with Conda

Language: C++ Apache-2.0 020

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector

Language: Python MIT 000

Soundpipe

A lightweight music DSP library.

Language: C MIT 000

tensorflow

Computation using data flow graphs for scalable machine learning

Language: C++ Apache-2.0 020

Text-to-sound-Synthesis

The source code of our paper "Diffsound: discrete diffusion model for text-to-sound generation"

Language: Python 000

The-Art-of-Linear-Algebra

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

Language: TeX CC0-1.0 000

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)

Apache-2.0 000

warp-ctc

Fast parallel CTC.

Language: Cuda Apache-2.0 020

waveform-playlist

Multitrack Web Audio editor and player with canvas waveform preview. Set cues, fades and shift multiple tracks in time. Record audio tracks or provide audio annotations. Export your mix to AudioBuffer or WAV! Project inspired by Audacity.

Language: JavaScript MIT 000

wenet

Transformer-based ASR engine.

Language: Python Apache-2.0 010

whisper.cpp

Port of OpenAI's Whisper model in C/C++

Language: C MIT 000