Puyuan Peng's repositories
VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
PromptingWhisper
Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
syllable-discovery
Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
word-discovery
Word Discovery in Visually Grounded, Self-Supervised Speech Models
FaST-VGS-Family
Transformer-based visually grounded speech models
moment_detr
[NeurIPS 2021] Moment-DETR code and QVHighlights dataset
academicpages
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
HERO_Video_Feature_Extractor
Video Feature Extraction Code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
MAE-AST-Public
Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
para-nmt-50m
Pre-trained models, code, and data for training and using models from "Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations"
zerospeech2021_baseline
BERT and LSTM baseline models of the ZeroSpeech Challenge 2021