Repositories under the icassp topic:
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
ICASSP 2022: "Text2Video: text-driven talking-head video synthesis with phonetic dictionary".
Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
Reading list for research topics in Sound AI
[ICASSP 2023] Official TensorFlow implementation of "Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition".
This repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data Augmentation of End-to-End Emotion Recognition".
The repository provides links to collections of influential and interesting research papers from top AI conferences, with open-source code to promote reproducibility and provide detailed implementation insights beyond the scope of the article. Stay up to date with the latest advances in AI research!
ICASSP2017: End-to-end joint learning of natural language understanding and dialogue manager
Face Recognition in real-world images [ICASSP 2017]
[ICASSP19] An Interaction-aware Attention Network for Speech Emotion Recognition in Spoken Dialogs
This repository is the implementation of the HiPAMA architecture, introduced in the paper "Hierarchical Pronunciation Assessment with Multi-Aspect Attention" (ICASSP 2023).
Official PyTorch implementation of A Quaternion-Valued Variational Autoencoder (QVAE).
LaTeX templates for papers. Select your conference or journal by switching branches.
ICASSP 2019 official LaTeX template
ICASSP 2021: Scene Completeness-Aware Lidar Depth Completion for Driving Scenario
Continual Learning Benchmark for Spoken Keyword Spotting
This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025.
The official implementation for IEEE-ICASSP 2024 paper "Flare-Free Vision: Empowering Uformer with Depth Insights"
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
A regularized version of RBM for unsupervised feature selection.
[ICASSP 2025 Oral] ImageFlowNet: Forecasting Multiscale Image-Level Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images
2D residual U-Net (ResUNet) and a lead combiner (LC) for 12-lead ECG Abnormality Classification
Official repo for "Audio-Visual Speech Recognition In-the-Wild: Multi-Angle Vehicle Cabin Corpus and Attention-based Method" in ICASSP 2024
[ICASSP 2025] ImageFlowNet: Forecasting Multiscale Image-Level Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images
Python implementation of Directional Sparse Filtering with TensorFlow/Keras
Code for the paper "Privacy-Preserving Deep Learning: Leveraging Deformable Operators for Secure Task Learning"
Code associated with the paper "Exploring Subgroup Performance in End-to-End Speech Models", accepted at ICASSP 2023
Show and Tell demonstration homepage
This repo contains the code for "Prioritizing Data Acquisition For End-to-End Speech Model Improvement", accepted at ICASSP 2024
This repo contains the code for "Towards Comprehensive Subgroup Performance Analysis in Speech Models"
Source code for negative sampling for contrastive audio-text retrieval (ICASSP 2023)
Jupyter Notebook associated with our ICASSP 2023 submission, "Sensor Selection for Angle of Arrival Estimation Based on the Two-Target Cramér-Rao Bound"
My implementation of "Generalized End-to-End Loss for Speaker Verification" (ICASSP 2018)