SteveTang (GitHub)

Hangzhou Huacheng Network Technology

Hangzhou, China

@SteveTa57657898

Steve Tang/Yuwu Tang's starred repositories

MaPPER

200

LaMI-DETR

[ECCV 2024] Official implementation of "LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction"

Language: Python | License: Apache-2.0 | 3700

ALBEF

Code for ALBEF: a new vision-language pre-training method

Language: Python | License: BSD-3-Clause | 154300

NormKD

The official implementation of NormKD: Normalized Logits for Knowledge Distillation

Language: Jupyter Notebook | License: NOASSERTION | 800

audioFlux

A library for audio and music analysis and feature extraction.

Language: C | License: MIT | 278600

PaSST

Efficient Training of Audio Transformers with Patchout

Language: Python | License: Apache-2.0 | 29900

bcresnet

Language: Python | License: BSD-3-Clause-Clear | 4600

cav-mae

Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".

Language: Python | License: BSD-2-Clause | 22700

PSL

Source code for ICASSP2022 "Pseudo Strong labels for large scale weakly supervised audio tagging"

Language: Python | License: GPL-3.0 | 3000

SAT

Streaming Audio Transformers for online audio tagging

Language: Python | License: GPL-3.0 | 4100

convit

Code for the Convolutional Vision Transformer (ConViT)

Language: Python | License: Apache-2.0 | 46100

NATTEN

Neighborhood Attention Extension. Bringing attention to a neighborhood near you!

Language: Cuda | License: NOASSERTION | 35100

AudioTaggingDoneRight

Experiments on AudioSet

Language: Jupyter Notebook | License: NOASSERTION | 4300

GCT

Language: Python | 200

UIT_Mobile

Source code for the paper "Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers"

Language: Python | License: GPL-3.0 | 2300

EfficientAT

This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.

Language: Python | License: MIT | 22400

gemmlab

Language: C++ | 1700

resampler

A Simple and Efficient Audio Resampler Implementation in C

Language: C | License: MIT | 13900

zita-resampler

Libzita-resampler is a C++ library for resampling audio signals. It is designed to be used within a real-time processing context, to be fast, and to provide high-quality sample rate conversion.

Language: C++ | License: GPL-3.0 | 2200

libfar

C/C++ fast audio resampling library

Language: C | License: MIT | 4000

r8brain-free-src

High-quality pro audio resampler / sample rate conversion C++ library. Very fast, for both audio resampling and time-series interpolation.

Language: C++ | License: MIT | 57200

LibrosaCpp

LibrosaCpp is a C++ implementation of librosa for computing short-time Fourier transform coefficients, mel spectrograms, or MFCCs

Language: C++ | License: Apache-2.0 | 18500

Spoken_language_identification

A TensorFlow-based spoken language identification

Language: Python | License: Apache-2.0 | 8100

TCN

Sequence modeling benchmarks and temporal convolutional networks

Language: Python | License: MIT | 415400

melspectrogram_c

A C++ implementation of the melspectrogram function

Language: C++ | License: GPL-3.0 | 400

librosapp

A C++ implementation of stft, melspectrogram and mel_to_stft

Language: C++ | License: Unlicense | 800

MFCC

mfcc, mel, pcen. (librosa)

Language: C++ | 3500

ODConv

The official project website of "Omni-Dimensional Dynamic Convolution" (ODConv for short, spotlight in ICLR 2022).

Language: Python | License: Apache-2.0 | 28700

RaDur

The source code of RaDur

Language: Python | 300

Infant-Crying-Detection

Language: Python | License: MIT | 2400