Nian (SaoYear)

Company: Westlake University

Location: Hangzhou, China

Home Page: saoyear.github.io

Organizations
Audio-WestlakeU

Nian's starred repositories

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 11902 | Issues: 0

RealMAN

A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization

Language: Python | Stargazers: 48 | Issues: 0

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: AGPL-3.0 | Stargazers: 28242 | Issues: 0

dcase2023_task4b_baseline

Baseline code for DCASE 2023 task 4 B

Language: Python | Stargazers: 12 | Issues: 0

SAR-SSL

A python implementation of “Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer”

Language: Python | License: MIT | Stargazers: 15 | Issues: 0

MetaAF

Control adaptive filters with neural networks.

Language: Python | Stargazers: 215 | Issues: 0

pytorch_misc

Code snippets created for the PyTorch discussion board

Language: Python | Stargazers: 540 | Issues: 0

LibMTL

A PyTorch Library for Multi-Task Learning

Language: Python | License: MIT | Stargazers: 1887 | Issues: 0

byol-a

BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation

Language: Python | License: NOASSERTION | Stargazers: 203 | Issues: 0

SaProt

[ICLR'24 spotlight] Saprot: Protein Language Model with Structural Alphabet

Language: Python | License: MIT | Stargazers: 289 | Issues: 0

SED_SoftLabel

Sound Event Classification With Soft Label

Language: Python | Stargazers: 3 | Issues: 0

pb_sed

Paderborn Sound Event Detection

Language: Python | License: MIT | Stargazers: 68 | Issues: 0

OI-wiki

🌟 Wiki of OI / ICPC for everyone. (An online strategy guide for a certain large-scale game, featuring cool arithmetic magic)

Language: TypeScript | Stargazers: 19828 | Issues: 0

Language: Jupyter Notebook | License: MIT | Stargazers: 27 | Issues: 0

HTS-Audio-Transformer

The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"

Language: Python | License: MIT | Stargazers: 335 | Issues: 0

FS-EEND

The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 2024]

Language: Python | License: MIT | Stargazers: 72 | Issues: 0

RVAE-EM

Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [ICASSP2024]

Language: Python | License: MIT | Stargazers: 35 | Issues: 0

ATST-RCT

ATST-RCT model for DCASE 2022 task4.

Language: Python | Stargazers: 2 | Issues: 0

RCT

This repo gives the code for the official implementation of RCT.

Language: Python | Stargazers: 12 | Issues: 0

ATST-SED

This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".

Language: Jupyter Notebook | License: MIT | Stargazers: 70 | Issues: 0

UMA-ASR

This repository is the official implementation of "Unimodal Aggregation for CTC-based Speech Recognition".

Language: Shell | Stargazers: 12 | Issues: 0

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 1077 | Issues: 0

Language: Python | License: MIT | Stargazers: 25 | Issues: 0

Language: Python | License: MIT | Stargazers: 75 | Issues: 0

FN-SSL

The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization

Language: Python | Stargazers: 68 | Issues: 0

ssast

Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".

Language: Python | License: BSD-3-Clause | Stargazers: 358 | Issues: 0

ontology

The Audio Set Ontology aims to provide a comprehensive set of categories to describe sound events.

Stargazers: 637 | Issues: 0

AudioSetOntologyTree

Tree visualization of the AudioSet Ontology - https://github.com/audioset/ontology

Language: HTML | Stargazers: 16 | Issues: 0

download_audioset

📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).

Language: Python | License: NOASSERTION | Stargazers: 97 | Issues: 0

McNet

The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023

Language: Python | Stargazers: 95 | Issues: 0