marmoi

Tampere University

Tampere

https://marmoi.github.io

Irene Martín Morató's starred repositories

macs-captioning-start-token

Language: Python, Stars: 1

ssl4birdsounds

Self-supervised representation learning for bird sounds (ICASSPW SASB 2024)

Language: Python, Stars: 9

ATST-SED

This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".

Language: Jupyter Notebook, License: MIT, Stars: 60

HTS-Audio-Transformer

The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"

Language: Python, License: MIT, Stars: 327

DCASE2021_task6_v2

Code for CVSSP submission to DCASE 2021 Task 6

Language: Jupyter Notebook, Stars: 34

a-mask-guided-transformer-with-topic-token-for-remote-sensing-image-captioning

Language: Python, Stars: 8

TextToAudioGrounding

The dataset and baseline code for Text-to-Audio Grounding (TAG)

Language: Python, License: MIT, Stars: 34

whisper-at

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"

Language: Python, License: BSD-2-Clause, Stars: 286

aac-datasets

Audio Captioning datasets for PyTorch.

Language: Python, License: MIT, Stars: 89

transformer_workshop

Code for the Transformer workshop

Language: Jupyter Notebook, Stars: 4

audio-and-speech-tech-2022

Audio and Speech Technologies Workshop 2022, code examples

Language: Shell, License: MIT, Stars: 4

kapre

kapre: Keras Audio Preprocessors

Language: Python, License: MIT, Stars: 916

netron

Visualizer for neural network, deep learning and machine learning models

Language: JavaScript, License: MIT, Stars: 26743

interpretable_predictions

Interpretable Neural Predictions with Differentiable Binary Variables

Language: Python, License: MIT, Stars: 85

byol-a

BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation

Language: Python, License: NOASSERTION, Stars: 202

wiki

This repo contains the source code for the deployment of the unofficial crowdsourced wiki for the Faculty of Information Technology and Communication Sciences at Tampere University.

License: Unlicense, Stars: 3

sed_eval

Evaluation toolbox for Sound Event Detection

Language: Python, License: MIT, Stars: 136

dcase_util

A collection of utilities for Detection and Classification of Acoustic Scenes and Events

Language: Python, License: MIT, Stars: 130

sed_vis

Visualization toolbox for Sound Event Detection

Language: Python, License: MIT, Stars: 106

fense

Fluency ENhanced Sentence-bert Evaluation (FENSE), a metric for audio caption evaluation, along with the benchmark datasets AudioCaps-Eval and Clotho-Eval.

Language: Python, Stars: 16

pytorchforaudio

Code for the "PyTorch for Audio + Music Processing" series on The Sound of AI YouTube channel.

Language: Python, License: MIT, Stars: 227

soundata

Python library for downloading, loading & working with sound datasets

Language: Python, License: BSD-3-Clause, Stars: 280

dcase_datalist

Collection of DCASE related datasets

Language: HTML, License: MIT, Stars: 13