Pingchuan Ma's starred repositories
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
faster-whisper
Faster Whisper transcription with CTranslate2
AgentVerse
🤖 AgentVerse 🪐 is designed to facilitate the deployment of multiple LLM-based agents in various applications. It primarily provides two frameworks: task-solving and simulation.
INTERSPEECH-2023-Papers
INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
CVPR-2023-24-Papers
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ Star the repository to support visual intelligence development!
ICASSP-2023-24-Papers
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
MegaPortraits
Supplementary materials for the paper MegaPortraits [ACMM22]
DSFD-Pytorch-Inference
A high-performance PyTorch implementation of face detection models, including RetinaFace and DSFD
Depth-Enhancement-and-Super-Resolution
Code for the paper "Towards Unpaired Depth Enhancement and Super-Resolution in the Wild"
Leaf-diseases-segmentation
Final project for the Deep Learning course
LipLearner
Research repository for LipLearner: Customizable Silent Speech Interactions on Mobile Devices (CHI 2023).
Lenta-Hackathon
Code and files for the Skoltech/Lenta hackathon, Sept. 2020
AV-RelScore
Audio-visual corruption modeling from our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring" (CVPR 2023)
Multi-head-Visual-Audio-Memory
PyTorch implementation of "Distinguishing Homophenes using Multi-Head Visual-Audio Memory" (AAAI2022)
CNVSRC2023Baseline
Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)
papers-to-read
Main articles I read or plan to read, as well as useful links.
skoltech_NLA
Numerical linear algebra course at Skoltech, 2020