icassp

Repositories under the icassp topic:

gabrielmittag / NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
speech-quality deep-learning interspeech icassp tts pytorch voice-conversion text-to-speech speech-synthesis quality-of-experience
Language:Python 851
DmitryRyumin / ICASSP-2023-24-Papers
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
asr denoising domain-adaptation face-recognition icassp icassp2023 keyword-spotting language-modeling self-supervised-learning semantic-segmentation signal-processing signal-restoration speech-recognition vad generative-models image-generation music-generation spoken-language-understanding multimodal-learning icassp2024
Language:Python 503
sibozhang / Text2Video
ICASSP 2022: "Text2Video: text-driven talking-head video synthesis with phonetic dictionary".
vid2vid video gan metaverse deep-learning avatar virtual-humans aigc digital-humanities generative-ai speech-synthesis text-to-video tts talking talking-face-generation talking-head talking-heads icassp
Language:Python 436
IBM / TabFormer
Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
machine-learning artificial-intelligence credit-card-dataset fraud-detection gpt bert tabular-data prsa-dataset huggingface credit-card-transaction transformer pytorch icassp icassp2021
Language:Python 334
soham97 / awesome-sound_event_detection
Reading list for research topics in Sound AI
audio-processing icassp interspeech sound-event-detection acoustic-scene-classification audio-captioning audio-generation audio-retrieval representation-learning zero-shot-learning
180
Jiaxin-Ye / TIM-Net_SER
[ICASSP 2023] Official TensorFlow implementation of "Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition".
casia emodb emotion-recognition iemocap ravdess savee speech-emotion-recognition emovo bi-directional icassp
Language:Python 178
glam-imperial / EmotionalConversionStarGAN
This repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data Augmentation of End-to-End Emotion Recognition".
generative-adversarial-network stargan stargan-vc data-augmentation emotion-recognition speech-synthesis deep-learning deep-neural-networks icassp-2020 icassp imperial-college-london augsburg-university imperial-glam
Language:Python 132
DmitryRyumin / NewEraAI-Papers
The repository provides links to collections of influential and interesting research papers from top AI conferences, with open-source code to promote reproducibility and provide detailed implementation insights beyond the scope of the article. Stay up to date with the latest advances in AI research!
artificial-intelligence computer-vision cvpr deep-learning emnlp icassp iccv image-processing interspeech ismir mashine-learning natural-language-processing neural-networks signal-processing text-classification video-processing
Language:Python 112
XuesongYang / end2end_dialog
ICASSP2017: End-to-end joint learning of natural language understanding and dialogue manager
icassp
Language:Python 75
fonfonx / FaceRecognition
Face Recognition in real-world images [ICASSP 2017]
face-recognition rsc sparse-coding real-world-images landmarks python opencv lfw icassp
Language:Python 38
30stomercury / Interaction-Aware-Attention-Network
[ICASSP19] An Interaction-aware Attention Network for Speech Emotion Recognition in Spoken Dialogs
emotion-recognition icassp icassp-2019 speech-emotion-recognition tensorflow
Language:Python 35
doheejin / HiPAMA
This repository implements the HiPAMA architecture, introduced in the paper "Hierarchical Pronunciation Assessment with Multi-Aspect Attention" (ICASSP 2023).
assessment icassp2023 pronunciation pronunciation-scoring speech-processing automatic-pronunciation-assessment capt language-learning nlp apa icassp
Language:Python 34
eleGAN23 / QVAE
Official PyTorch implementation of A Quaternion-Valued Variational Autoencoder (QVAE).
vae vae-pytorch quaternions generative-models variational-autoencoder quaternion-neural-networks pytorch-vae variational-autoencoders entropy entropy-measures quaternion-domain vaes icassp
Language:Python 29
Neclow / SERAB
SERAB: a multi-lingual benchmark for speech emotion recognition
benchmark byol byol-a deep-learning emotion-recognition pytorch scikit-learn self-supervised-learning speech-processing icassp
Language:Python 28
monetjoe / latex_templates
LaTeX templates for papers, please select your conference or journal by switching branches.
icme csmt icassp ismir eurasip
Language:TeX 26
orbxball / icassp2019-latex-template
ICASSP 2019 official LaTeX template
icassp icassp-2019 ieee acoustics speech signal-processing conference latex
Language:TeX 24
choyingw / SCADC-DepthCompletion
ICASSP 2021: Scene Completeness-Aware Lidar Depth Completion for Driving Scenario
icassp2021 icassp 3d depth-completion depth-estimation stereo-vision lidar scene-reconstruction autonomous-driving autonomous-vehicles computer-vision deep-neural-networks
Language:Python 18
huangyz0918 / kws-continual-learning
Continual Learning Benchmark for Spoken Keyword Spotting
deep-learning keyword-extraction keyword-spotting icassp
Language:Python 16
koudounasalkis / voc2vec
This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025.
audio-classification foundation-models icassp non-verbal-vocalisation open-source pre-training
Language:Python 14
yousefkotp / Flare-Free-Vision-Empowering-Uformer-with-Depth-Insights
The official implementation for IEEE-ICASSP 2024 paper "Flare-Free Vision: Empowering Uformer with Depth Insights"
deep-learning depth-estimation depth-map flare-free flare-removal icassp icassp2024 ieee-icassp image-enhancement image-enhancing image-processing image-restoration neural-networks u-shaped-transformer
Language:Python 14
kjw11 / Speaker-Aware-CTC
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
asr ctc icassp sactc
Language:Python 13
meowoodie / Regularized-RBM
A regularized version of RBM for unsupervised feature selection.
machine-learning statistics python data-mining embeddings skipgram tensorflow rbm events l1 regularization variable-selection unsupervised-learning feature-selection icassp-2018 icassp
Language:Python 13
KrishnaswamyLab / ImageFlowNet
[ICASSP 2025 Oral] ImageFlowNet: Forecasting Multiscale Image-Level Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images
disease-progression icassp icassp2025 image-forecasting image-prediction medical-image-analysis neural-ode pytorch spatial-temporal time-series time-series-forecasting trajectory-prediction unet
Language:Python 12
seorim0 / ResUNet-LC
2D residual U-Net (ResUNet) and a lead combiner (LC) for 12-lead ECG Abnormality Classification
abnormal-detection abnormality-detection classification deep-learning deep-neural-networks dnn ecg ecg-classification electrocardiogram multi-label-classification pytorch resnet icassp icassp2024
Language:Python 11
SMIL-SPCRAS / DAVIS
Official repo for "Audio-Visual Speech Recognition In-the-Wild: Multi-Angle Vehicle Cabin Corpus and Attention-based Method" in ICASSP 2024
attention-mechanism audio-visual avsr corpus icassp icassp2024 in-the-wild multi-modal signal-processing spatio-temporal-features speech-recognition
Language:JavaScript 9
ChenLiu-1996 / ImageFlowNet
[ICASSP 2025] ImageFlowNet: Forecasting Multiscale Image-Level Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images
disease-progression icassp icassp2025 image-forecasting image-prediction neural-ode pytorch trajectory-prediction unet differential-equations latent-space medical-image-analysis medical-imaging time-series-forecasting
Language:Python 8
kwatcharasupat / directional-sparse-filtering-tf
Python implementation of Directional Sparse Filtering with TensorFlow/Keras
signal-processing dsp blind-source-separation unsupervised-learning speech-separation icassp icassp-2017 icassp-2021
Language:Python 8
Factral / PrivDL
Code for the paper "Privacy-Preserving Deep Learning: Leveraging Deformable Operators for Secure Task Learning"
deep-learning icassp privacy privacy-deep-learning privacy-preserving privacy-preserving-deep-learning shuffle
Language:Python 6
koudounasalkis / divergence-in-speech-systems
Code associated with the paper "Exploring Subgroup Performance in End-to-End Speech Models", accepted at ICASSP 2023
divergence explainable-ai icassp interpretable-machine-learning spoken-language-understanding subgroup-identification transformers
Language:Jupyter Notebook 6
sungjae-cho / ICASSP2020_STDemo
Show and Tell demonstration homepage
tts emotion icassp-2020 icassp show-and-tell text-to-speech transfer-learning deep-learning
Language:HTML 4
koudounasalkis / Data-Acquisition-for-Speech-Model-Improvement
This repo contains the code for "Prioritizing Data Acquisition For End-to-End Speech Model Improvement", accepted at ICASSP 2024
bias-mitigation data-acquisition data-selection explainable-ai icassp interpretable-machine-learning model-improvement multilingual spoken-language-understanding transformers
Language:Jupyter Notebook 3
koudounasalkis / Subgroup-Analysis-in-Speech-Models
This repo contains the code for "Towards Comprehensive Subgroup Performance Analysis in Speech Models"
automatic-speech-recognition emotion-recognition explainable-ai icassp interpretable-machine-learning model-bias-analysis spoken-language-understanding subgroup-identification taslp transformers
Language:Jupyter Notebook 3
xieh97 / contrastive-negative-sampling
Source code for negative sampling for contrastive audio-text retrieval (ICASSP 2023)
contrastive-learning icassp negative-sampling multimodal-learning audio-information-retrieval audio-language-learning
Language:Python 3
CostasAK / icassp2023
Jupyter Notebook associated with our submission for the 2023 ICASSP, "Sensor Selection for Angle of Arrival Estimation Based on the Two-Target Cramér-Rao Bound"
icassp icassp2023 jupyter jupyter-notebook notebook angle-of-arrival array-processing multi-target sensor-selection signal-processing sparse-sensing
Language:Jupyter Notebook 2
hahnec / stofnet
StofNet: Super-resolution Time of Flight Network (ICASSP 2024)
acoustic audio deep-learning icassp icassp2024 learning localization multilateration neural non-destructive-testing round-trip super-resolution time-of-arrival time-of-flight tof trilateration ultrasound
Language:Python 2
testzer0 / SpeakerVerification
My implementation of "Generalized End-to-End Loss for Speaker Verification" (ICASSP 2018)
bilstm contrastive-loss deep-learning icassp pytorch speaker-verification
Language:Jupyter Notebook 2
