hp11223344

Geek Repo

0 followers

0 following

hp11223344's starred repositories

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8296 | Issues: 130 | Issues: 1054

Self-Attention-GAN

PyTorch implementation of Self-Attention Generative Adversarial Networks (SAGAN)

DeepFilterNet

Noise suppression using deep filtering

Language: Python | License: NOASSERTION | Stargazers: 2216 | Issues: 32 | Issues: 268

pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Language: Python | License: MIT | Stargazers: 1385 | Issues: 44 | Issues: 220

madmom

Python audio and music signal processing library

Language: Python | License: NOASSERTION | Stargazers: 1283 | Issues: 43 | Issues: 264

InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Language: Python | License: Apache-2.0 | Stargazers: 1151 | Issues: 29 | Issues: 135

awesome-speech-enhancement

Speech enhancement, speech separation, and sound source localization

FullSubNet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Language: Python | License: MIT | Stargazers: 527 | Issues: 10 | Issues: 60

Conv-TasNet

PyTorch implementation of "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation"

triplet-attention

Official PyTorch Implementation for "Rotate to Attend: Convolutional Triplet Attention Module." [WACV 2021]

Language: Jupyter Notebook | License: MIT | Stargazers: 396 | Issues: 10 | Issues: 26

ConditionalDETR

This repository is an official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence". (https://arxiv.org/abs/2108.06152)

Language: Python | License: Apache-2.0 | Stargazers: 352 | Issues: 8 | Issues: 33

phasen

An unofficial PyTorch implementation of Microsoft's PHASEN

DB-AIAT

The implementation of "Dual-branch Attention-In-Attention Transformer for single-channel speech enhancement"

Language: Python | License: MIT | Stargazers: 113 | Issues: 3 | Issues: 9

MECT4CNER

Code for ACL 2021 paper. MECT: Multi-Metadata Embedding based Cross-Transformer for Chinese Named Entity Recognition.

GaGNet

This repo provides the network code and the processed samples of the manuscript "Glance and Gaze: A Collaborative Learning Framework for Single-channel Speech Enhancement", which was accepted by Elsevier Applied Acoustics.

speech-emotion-recognition-using-self-attention

Implementation of the paper "Improved End-to-End Speech Emotion Recognition Using Self Attention Mechanism and Multitask Learning" from INTERSPEECH 2019

DSA2F

Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

unsup_speech_enh_adaptation

Unsupervised domain adaptation for conversational speech enhancement using RemixIT

Language: Jupyter Notebook | License: MIT | Stargazers: 51 | Issues: 3 | Issues: 5

MFNet

This repo provides the processed samples of the manuscript "A Mask Free Neural Network for Monaural Speech Enhancement", which was accepted by INTERSPEECH 2023.

DBT-Net

Audio demos for the paper "DBT-Net: Dual-branch federative magnitude and phase estimation with attention-in-attention transformer for monaural speech enhancement" (submitted to TASLP). The code will also be released soon.

NUNet-TLS

Nested U-Net with two-level skip connections for speech enhancement

Language: Python | License: MIT | Stargazers: 25 | Issues: 1 | Issues: 3

LSA

Ablation study of local spectral attention (LSA) for full-band speech enhancement (SE)

Language: Python | License: MIT | Stargazers: 24 | Issues: 1 | Issues: 3

Language: Python | License: CC0-1.0 | Stargazers: 10 | Issues: 0 | Issues: 0

DOA-estimation-with-a-stacked-self-attention-network

A stacked self-attention network for two-dimensional direction-of-arrival estimation in hands-free speech communication

Language: Python | Stargazers: 9 | Issues: 1 | Issues: 0

Speech-Enhancement-Using-Time-Domain-Loss

An adaptation of the paper "Two-Stage Deep Learning for Noisy-Reverberant Speech Enhancement". It adds Time Domain Reconstruction (TDR) as an extra loss function so that the clean phase is used in the enhancement process. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6519714/

Language: Python | Stargazers: 8 | Issues: 2 | Issues: 0

Language: Python | Stargazers: 8 | Issues: 0 | Issues: 0

Phase-Aware-Deep-Speech-Enhancement

Phase-aware deep speech enhancement in PyTorch

Language: Python | Stargazers: 6 | Issues: 1 | Issues: 0

Relative-Phase-Shift

Try RPS drawing.

Language: Jupyter Notebook | Stargazers: 1 | Issues: 1 | Issues: 0