owuQQQ's repositories
pymarl_transformers
Official repository of the paper TransfQMix: Transformers for Leveraging the Graph Structure of Multi-Agent Reinforcement Learning Problems (AAMAS 2023)
FSDA
Flexible Statistics and Data Analysis (FSDA) extends MATLAB for a robust analysis of data sets affected by different sources of heterogeneity. It is open source software licensed under the European Union Public Licence (EUPL). FSDA is a joint project by the University of Parma and the Joint Research Centre of the European Commission.
Life-lessons
A dataset of first-person monologue videos, transcripts, and annotations about "life lessons" in various domains. Its main purpose is multi-modal language analysis and modeling.
word-discovery
Word Discovery in Visually Grounded, Self-Supervised Speech Models
cs-video-courses
List of Computer Science courses with video lectures.
deel-learning-course
code for deep learning courses
EGG
EGG: Emergence of lanGuage in Games
fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
GreedyCAS
code and data for EMNLP paper "Unsupervised Scientific Abstract Segmentation with Mutual Information"
HMAD
Head Movements (and facial movements) Automatic Detection
LLMs_interview_notes
This repository mainly collects interview questions for large language model (LLM) algorithm engineers.
Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
Multimodal-Aphasia-Type-Detection_EMNLP_2023
This codebase contains the Python scripts for the model from "Learning Co-Speech Gesture for Multimodal Aphasia Type Detection" (EMNLP 2023).
nova
NOVA is a tool for annotating and analyzing behaviours in social interactions. It supports annotators with machine learning during the coding process. It also features both discrete labels and continuous scores, as well as a visualization of streams recorded with the SSI Framework.
object-aware-gaze-target-detection
Official repo of the paper "Object-aware Gaze Target Detection" (ICCV 2023)
Only-Noisy-Training
A self-supervised speech denoising strategy named Only-Noisy Training (ONT), which solves the speech denoising problem with only noisy audio signals in audio space for the first time.
OptML_course
EPFL Course - Optimization for Machine Learning - CS-439
potato
potato: portable text annotation tool
pwesuite
Suite for phonetic word embeddings, especially their evaluation and baseline models.
speech2properties2gestures
We propose a new framework for gesture generation, aiming to allow data-driven approaches to produce more semantically rich gestures.
vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
vqvib_neurips2022
Codebase for the VQ-VIB implementation and color experiments based on "Trading off Utility, Informativeness, and Complexity in Emergent Communication" (NeurIPS 2022)
WhisperSeg
Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection