anton-jeran

UNIVERSITY OF MARYLAND COLLEGE PARK

USA

https://www.linkedin.com/in/anton-jeran-ratnarajah-78663099/

Organizations

GAMMA-UMD

Anton Jeran Ratnarajah's starred repositories

ears_dataset

Expressive Anechoic Recordings of Speech (EARS)

Language: Python | License: NOASSERTION | 9500

see2sound

Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound

Language: Python | License: Apache-2.0 | 3800

MESH2IR

This is the official implementation of our mesh-based neural network (MESH2IR) to generate acoustic impulse responses (IRs) for indoor 3D scenes represented using a mesh.

Language: Python | 7100

FAST-RIR

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

Language: Python | License: AGPL-3.0 | 14200

EyefulTower

Official release of the Eyeful Tower dataset, a high-fidelity multi-view capture of 11 real-world scenes, from the paper “VR-NeRF: High-Fidelity Virtualized Walkable Spaces” (Xu et al., SIGGRAPH Asia 2023).

License: NOASSERTION | 13500

L2S

This is the official implementation of our end-to-end binaural audio rendering approach (Listen2Scene) for virtual reality (VR) and augmented reality (AR) applications.

Language: Python | 400

diffroomacoustics

A Differentiable Room Acoustics Simulator

Language: Python | License: MIT | 400

FastDiff

PyTorch Implementation of FastDiff (IJCAI'22)

Language: Python | 39600

lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Language: Python | License: Apache-2.0 | 588800

real-acoustic-fields

Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark

License: NOASSERTION | 3400

AVRIR

Language: JavaScript | 100

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-4-Clause | 995200

languagecodec_tmp

Temporary anonymous version

Language: Python | License: Apache-2.0 | 2300

jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Language: Python | License: NOASSERTION | 253600

LRV-Instruction

[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

Language: Python | License: BSD-3-Clause | 23300

fcn.berkeleyvision.org

Fully Convolutional Networks for Semantic Segmentation by Jonathan Long*, Evan Shelhamer*, and Trevor Darrell. CVPR 2015 and PAMI 2016.

Language: Python | 329500

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python | License: BSD-3-Clause | 258000

stable-audio-tools

Generative models for conditional audio generation

Language: Python | License: MIT | 225100

Catch-A-Waveform

Official PyTorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)

Language: Python | License: NOASSERTION | 18100

awesome-vision-language-pretraining-papers

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

audioseal

Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector

Language: Python | License: MIT | 35000

Fast3DScattering-release

Repo for our research paper "Learning Acoustic Scattering Fields for Dynamic Interactive Sound Propagation"

Language: MATLAB | 1400

audio2photoreal

Code and dataset for photorealistic Codec Avatars driven from audio

Language: Python | License: NOASSERTION | 260100

ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

License: NOASSERTION | 100

M2IR

Language: JavaScript | 100

S2IR

Language: SCSS | 100

Listen2Scene_Code

100

anton-jeran

100

M2SYN

Language: JavaScript | 100

MAD

Language: SCSS | 100