CZ26's repositories
CycleTransGAN-EVC
CycleTransGAN-EVC: A CycleGAN-based Emotional Voice Conversion Model with Transformer
FaceSwapping
Face swapping based on the paper "Motion Representations for Articulated Animation"
AudioCLIP
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)
avatarface_implement
Face Swapping
character-mining
Mining individual characters in multiparty dialogue
controllable_evc_code
Code for a controllable EVC framework for seen and unseen emotion generation.
dataset_medical
A list of medical imaging datasets (An Index for Medical Imaging Datasets)
Depression_FAU-guided
Depression_FAU-guided
dl-for-emo-tts
A summary of our attempts at using deep learning approaches for emotional text-to-speech
facial-landmark-frontalization
Function to frontalize non-frontal 2D facial landmarks generated from the DLIB library
icassp2021-emotion-tts
Please visit: https://thuhcsi.github.io/icassp2021-emotion-tts/
ICE-Talk
Interface for Controllable Expressive Talking Machine
nonparaSeq2seqVC_code
Implementation code of non-parallel sequence-to-sequence VC
phonemizer
Simple text to phones converter for multiple languages
PythonPark
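Open-source Python project "The Road to Self-Taught Programming": step-by-step tutorials covering an AI lab, curated videos, data structures, study guides, hands-on machine learning, hands-on deep learning, web crawlers, interview experiences from major tech companies, programmer life, and resource sharing.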
Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
remote-opencv-streaming-live-video
A remote live video streaming connection with Flask
segmentation-kit
Speech Segmentation Toolkit using Julius
SKAIG-ERC
The code for "Past, Present, and Future: Conversational Emotion Recognition through Structural Modeling of Psychological Commonsense Knowledge" plus the code of models in "A Hierarchical Transformer with Speaker Modeling for Emotion Recognition in Conversations"
Transformer-TTS
A PyTorch implementation of "Neural Speech Synthesis with Transformer Network"
video_features
Extract video features from raw videos using multiple GPUs. We support RAFT and PWC flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, ResNet features.
VisualGLM-6B
Chinese and English multimodal conversational language model
XrayGLM
🩺 The first Chinese multimodal medical large model that can read chest X-rays and summarize chest radiographs.