shen912zzz's starred repositories
google-research
Google Research
VoiceprintRecognition-Pytorch
This project uses several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.
pyloudnorm
Flexible audio loudness meter in Python with implementation of ITU-R BS.1770-4 loudness algorithm
SAB-cnn-audio-denoiser
TensorFlow 2.0 implementation of the paper "A Fully Convolutional Neural Network for Speech Enhancement"
audio_dataset_screener
An auxiliary tool for manually screening audio datasets.
audio_dataset_vpr
A voiceprint recognition classifier for audio datasets
Urbansound8k
Sound classification using Librosa, ffmpeg, CNN, Keras, XGBoost, and Random Forest.
Audio-Classification-using-CNN-MLP
Multi-class audio classification using deep learning (MLP, CNN). The objective of this project is to build a multi-class classifier that identifies the sound of a bee, a cricket, or noise.
STgram-MFN
A spectro-temporal fusion feature, STgram, with MobileFaceNet for more stable anomalous sound detection
environmental-sound-classification
Environmental sound classification with convolutional neural networks and the UrbanSound8K dataset.
Soundscapy
A Python library for soundscape assessments
SciDataTool
SciDataTool is an open-source Python package for scientific data handling. The objective is to provide a user-friendly, unified, flexible module to postprocess any kind of signal. It is meant to be used by researchers, R&D engineers, and teachers in any scientific area. The package makes it possible to efficiently store data fields in the time/space domain or in the frequency domain, to easily perform Fourier transforms, to extract slices, to convert units, to compare several fields, and so on, which in turn simplifies plot commands.
human-voice-detection
Binary classification problem that aims to classify human voices from audio recordings. Implemented using PyTorch and Librosa.
pyAudioKits
Powerful Python audio workflow support based on librosa and other libraries
Add_noise_and_rir_to_speech
This code base adds noise from the MUSAN dataset to a clean speech signal at a specified signal-to-noise ratio, and generates far-field speech data using room impulse response data from the BUT Speech@FIT Reverb Database.
bird_audio_detection_challenge
DenseNets for the detection of singing birds in audio files
dcase2022
Submission for task 2 "Unsupervised Anomalous Sound Detection for Machine Condition Monitoring Applying Domain Generalization Techniques" of the DCASE challenge 2022 (https://dcase.community/challenge2022/task-unsupervised-anomalous-sound-detection-for-machine-condition-monitoring).
sub-cluster-AdaCos
Accompanying code for the paper Sub-Cluster AdaCos: Learning Representations for Anomalous Sound Detection.
multimodal-dl-framework
An extensible PyTorch framework for experimenting with neural-network-based deep learning algorithms on multiple data modalities for binary classification.
Human-Pose-Estimation---Motion-Capture-Device
Inertial Human Motion Capture Device - Submodule with GY-87 for Pose Data Acquisition, ESP32 for Pose Estimation, and UDP Connection to PC for Pose Reconstruction
Animal-Sound-Classifier-using-Watson-Studio
Build classification models using IBM Watson Studio to predict (identify) animal sounds. Learn how to best gather and prepare data, create and deploy models, deploy and test a signal processing application, create models with binary classifications, and display the predictions on a web page created using Node-RED.
AudioEventLabeller
EchoMarks: Dataset Annotation for Audio Event Detection