demonstan

Location: US

demonstan's starred repositories

deepMAR-Lite

Multi-attribute recognition net in an updated and containerised PyTorch version

Language: Python, Stargazers: 7, Issues: 0

pedestrian-attribute-recognition-pytorch

A simple baseline for pedestrian attribute recognition in surveillance scenarios

Language: Python, Stargazers: 327, Issues: 0
Language: C++, Stargazers: 244, Issues: 0

speaker-id

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

Language: Python, License: Apache-2.0, Stargazers: 335, Issues: 0

speechbrain.github.io

The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.

Language: HTML, Stargazers: 360, Issues: 0

portaudio

PortAudio is a cross-platform, open-source C language library for real-time audio input and output.

Language: C, License: NOASSERTION, Stargazers: 1387, Issues: 0

vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 7489, Issues: 0

kfr

Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)

Language: C++, License: GPL-2.0, Stargazers: 1624, Issues: 0

Low-Latency-Android-iOS-Linux-Windows-tvOS-macOS-Interactive-Audio-Platform

🇸Superpowered Audio, Networking and Cryptographics SDKs. High performance and cross platform on Android, iOS, macOS, tvOS, Linux, Windows and modern web browsers.

Language: C++, Stargazers: 1329, Issues: 0

essentia

C++ library for audio and music analysis, description and synthesis, including Python bindings

Language: C++, License: AGPL-3.0, Stargazers: 2773, Issues: 0

libsndfile

A C library for reading and writing sound files containing sampled audio data.

Language: C, License: LGPL-2.1, Stargazers: 1389, Issues: 0

r8brain-free-src

High-quality pro audio resampler / sample rate converter C++ library. Very fast, for both audio resampling and time-series interpolation.

Language: C++, License: MIT, Stargazers: 550, Issues: 0

libsamplerate

An audio Sample Rate Conversion library

Language: C, License: BSD-2-Clause, Stargazers: 581, Issues: 0

SFML

Simple and Fast Multimedia Library

Language: C++, License: Zlib, Stargazers: 9864, Issues: 0

DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

Language: C++, License: MPL-2.0, Stargazers: 24847, Issues: 0

AudioFile

A simple C++ library for reading and writing audio files.

Language: C++, License: MIT, Stargazers: 933, Issues: 0

NumCpp

C++ implementation of the Python Numpy library

Language: C++, License: MIT, Stargazers: 3484, Issues: 0

nlpaug

Data augmentation for NLP

Language: Jupyter Notebook, License: MIT, Stargazers: 4366, Issues: 0

voicefilter

Unofficial PyTorch implementation of Google AI's VoiceFilter system

Language: Python, Stargazers: 1061, Issues: 0

VoiceIdentityBook

"Voiceprint Technology: From Core Algorithms to Engineering Practice" (《声纹技术:从核心算法到工程实践》)

Stargazers: 148, Issues: 0

uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

Language: Python, License: Apache-2.0, Stargazers: 1548, Issues: 0

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

License: Apache-2.0, Stargazers: 1523, Issues: 0

Resemblyzer

A python package to analyze and compare voices with deep learning

Language: Python, License: Apache-2.0, Stargazers: 2689, Issues: 0

DNS-Challenge

This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.

Language: Python, License: CC-BY-4.0, Stargazers: 1034, Issues: 0

vedadet

A single stage object detection toolbox based on PyTorch

Language: Python, License: Apache-2.0, Stargazers: 497, Issues: 0

py-webrtcvad

Python interface to the WebRTC Voice Activity Detector

Language: C, License: NOASSERTION, Stargazers: 1965, Issues: 0

MS-SNSD

The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) levels desired.

Language: HTML, License: MIT, Stargazers: 460, Issues: 0

deep-speaker

Deep Speaker: an End-to-End Neural Speaker Embedding System.

Language: Python, License: MIT, Stargazers: 897, Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook, License: MIT, Stargazers: 5607, Issues: 0

pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

Language: Python, License: MIT, Stargazers: 303, Issues: 0