Anton Jeran Ratnarajah's starred repositories
Speech2RIR
Official implementation of a reverberant-speech-to-room-impulse-response estimator
ears_dataset
Expressive Anechoic Recordings of Speech (EARS)
EyefulTower
Official release of the Eyeful Tower dataset, a high-fidelity multi-view capture of 11 real-world scenes, from the paper “VR-NeRF: High-Fidelity Virtualized Walkable Spaces” (Xu et al., SIGGRAPH Asia 2023).
diffroomacoustics
A Differentiable Room Acoustics Simulator
real-acoustic-fields
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
languagecodec_tmp
Temporary anonymous version
LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
fcn.berkeleyvision.org
Fully Convolutional Networks for Semantic Segmentation by Jonathan Long*, Evan Shelhamer*, and Trevor Darrell. CVPR 2015 and PAMI 2016.
Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
stable-audio-tools
Generative models for conditional audio generation
Catch-A-Waveform
Official PyTorch implementation of the paper "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)
awesome-vision-language-pretraining-papers
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
Fast3DScattering-release
Repo for our research paper "Learning Acoustic Scattering Fields for Dynamic Interactive Sound Propagation"
audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio