UNBSQY's repositories
multimodal-audio-visual-speech-recognition
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
audio-visual-Speech-Enhancement-and-Separation
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
awesome-audio-visual
A curated list of different papers and datasets in various areas of audio-visual processing
awesome-audiovisual-learning
A curated list of audio-visual learning methods and datasets.
awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Awesome-Multimodal-Research
A curated list of multimodal-related research.
awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
awesome_multimodal_paper
Reading list for research topics in multimodal machine learning
jalammar.github.io
Build a Jekyll blog in minutes, without touching the command line.
chartgpt_Api_chatbox
Your Ultimate Copilot on the Desktop. Chatbox is a desktop app for GPT-4 / GPT-3.5 (OpenAI API) that supports Windows, Mac & Linux.
CVPR-with-Code
A collection of CVPR 2023 papers and open-source projects
OpenTransformer
A No-Recurrence Sequence-to-Sequence Model for Speech Recognition
pytorch-visualization
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
QySong.github.io
Homepage
seaborn_visual
Statistical data visualization in Python
transformers-tool
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.