Beast code in Giters

hp11223344's starred repositories

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | License: Apache-2.0 | 8296 130 1054

Self-Attention-GAN

PyTorch implementation of Self-Attention Generative Adversarial Networks (SAGAN)

Language: Python | 2494 35 65

DeepFilterNet

Noise suppression using deep filtering

Language: Python | License: NOASSERTION | 2216 32 268

pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Language: Python | License: MIT | 1385 44 220

madmom

Python audio and music signal processing library

Language: Python | License: NOASSERTION | 1283 43 264

InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Language: Python | License: Apache-2.0 | 1151 29 135

awesome-speech-enhancement

Speech enhancement, speech separation, sound source localization

License: GPL-2.0 | 964 43 1

FullSubNet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Language: Python | License: MIT | 527 10 60

Conv-TasNet

PyTorch implementation of "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation"

Language: Python | 402 6 54

triplet-attention

Official PyTorch Implementation for "Rotate to Attend: Convolutional Triplet Attention Module." [WACV 2021]

Language: Jupyter Notebook | License: MIT | 396 10 26

ConditionalDETR

This repository is an official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence". (https://arxiv.org/abs/2108.06152)

Language: Python | License: Apache-2.0 | 352 8 33

phasen

An unofficial PyTorch implementation of Microsoft's PHASEN

Language: Python | 217 9 13

DB-AIAT

The implementation of "Dual-branch Attention-In-Attention Transformer for single-channel speech enhancement"

Language: Python | License: MIT | 113 3 9

MECT4CNER

Code for ACL 2021 paper. MECT: Multi-Metadata Embedding based Cross-Transformer for Chinese Named Entity Recognition.

67 3 30

This repo provides the network code and the processed samples of the manuscript "Glance and Gaze: A Collaborative Learning Framework for Single-channel Speech Enhancement", which was accepted by Elsevier Applied Acoustics.

Language: Python | 61 5 3

speech-emotion-recognition-using-self-attention

Implementation of the paper "Improved End-to-End Speech Emotion Recognition Using Self Attention Mechanism and Multitask Learning" from INTERSPEECH 2019

Language: Python | 58 2 6

DSA2F

Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

Language: Python | 58 3 11

unsup_speech_enh_adaptation

Unsupervised domain adaptation for conversational speech enhancement using RemixIT

Language: Jupyter Notebook | License: MIT | 51 3 5

DDAEC

Language: Python | 39 2 3

MFNet

This repo provides the processed samples of the manuscript "A Mask Free Neural Network for Monaural Speech Enhancement", which was accepted by INTERSPEECH 2023.

License: MIT | 34 4 5

DBT-Net

Audio demos for the paper "DBT-Net: Dual-branch federative magnitude and phase estimation with attention-in-attention transformer for monaural speech enhancement" (submitted to TASLP). The code will also be released soon.

Language: Python | 28 1 1

NUNet-TLS

Nested U-Net with two-level skip connections for speech enhancement

Language: Python | License: MIT | 25 1 3

LSA

Ablation study of local spectral attention (LSA) for full-band speech enhancement (SE)

Language: Python | License: MIT | 24 1 3

WD-TCN

Language: Python | License: CC0-1.0 | 1000

DOA-estimation-with-a-stacked-self-attention-network

A stacked self-attention network for two-dimensional direction-of-arrival estimation in hands-free speech communication

Language: Python | 9 10

Speech-Enhancement-Using-Time-Domain-Loss

This is an adaptation of the paper "Two-Stage Deep Learning for Noisy-Reverberant Speech Enhancement". It uses Time Domain Reconstruction (TDR) as an additional loss function to make use of clean phase in the enhancement process. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6519714/

Language: Python | 8 20
