jkoprax

jkoprax's starred repositories

You-Only-Speak-Once

Deep Learning - one shot learning for speaker recognition using Filter Banks

Language: Jupyter Notebook | Stargazers: 146 | Issues: 0
Language: Python | License: MIT | Stargazers: 2 | Issues: 0

AVMFN-For-Person-Verification

Bimodal Adaptive Feature Fusion Network for Person Verification

Language: Python | Stargazers: 17 | Issues: 0

vue-audio-visual

VueJS audio visualization components

Language: TypeScript | License: MIT | Stargazers: 700 | Issues: 0
Language: TypeScript | License: MIT | Stargazers: 8 | Issues: 0

open-interpreter

A natural language interface for computers

Language: Python | License: AGPL-3.0 | Stargazers: 51237 | Issues: 0
Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7063 | Issues: 0

tf-kaldi-speaker

Neural speaker recognition/verification system based on Kaldi and Tensorflow

Language: Python | License: Apache-2.0 | Stargazers: 32 | Issues: 0

VBx

Variational Bayes HMM over x-vectors diarization

Language: Python | Stargazers: 246 | Issues: 0

awesome-kaldi

A list of features, scripts, blogs, and resources for making better use of Kaldi (http://kaldi-asr.org/)

License: MIT | Stargazers: 533 | Issues: 0

llama-fs

A self-organizing file system with Llama 3

Language: Jupyter Notebook | License: MIT | Stargazers: 4665 | Issues: 0

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 34039 | Issues: 0

suno-api

Use API to call the music generation AI of suno.ai, and easily integrate it into agents like GPTs.

Language: TypeScript | License: LGPL-3.0 | Stargazers: 1029 | Issues: 0

UdioWrapper

UdioWrapper is a Python package that enables the generation of music tracks using Udio's API through textual prompts. This package is based on the reverse engineering of the Udio API (https://www.udio.com/) and is not officially endorsed by Udio.

Language: Python | License: MIT | Stargazers: 89 | Issues: 0

EEND

End-to-End Neural Diarization

Language: Python | License: MIT | Stargazers: 360 | Issues: 0

SpectralCluster

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

Language: Python | License: Apache-2.0 | Stargazers: 502 | Issues: 0

transcriptionstream

Turnkey self-hosted offline transcription and diarization service with LLM summary

Language: Python | License: GPL-3.0 | Stargazers: 648 | Issues: 0

3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Language: Python | License: Apache-2.0 | Stargazers: 978 | Issues: 0

diart

A Python package to build AI-powered real-time audio applications

Language: Python | License: MIT | Stargazers: 932 | Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT | Stargazers: 5626 | Issues: 0

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8175 | Issues: 0

DeepSpeech

DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.

Language: C++ | License: MPL-2.0 | Stargazers: 24862 | Issues: 0

opensmile

The Munich Open-Source Large-Scale Multimedia Feature Extractor

Language: C++ | License: NOASSERTION | Stargazers: 553 | Issues: 0

pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.

Language: Python | Stargazers: 2359 | Issues: 0

kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Language: Shell | License: NOASSERTION | Stargazers: 13994 | Issues: 0

accent_rating

A collection of scripts and data I used when working on my dissertation

Language: Python | License: GPL-3.0 | Stargazers: 1 | Issues: 0

idvoice-gpt-android-demo

IDVoice + ChatGPT Android demo app

Language: Kotlin | License: MIT | Stargazers: 1 | Issues: 0

FishBoardMix

The FishBoardMix corpus is designed to explore speaker-age estimation technology.

Language: Shell | License: Apache-2.0 | Stargazers: 2 | Issues: 0
Language: Python | Stargazers: 3 | Issues: 0

idvoice-gpt-ios-demo

IDVoice + ChatGPT iOS demo app

Language: Swift | License: MIT | Stargazers: 6 | Issues: 0