Anton Jeran Ratnarajah (anton-jeran)

Company: University of Maryland, College Park

Location: USA

Home Page: https://www.linkedin.com/in/anton-jeran-ratnarajah-78663099/

Twitter: @AntonJeran


Organizations
GAMMA-UMD

Anton Jeran Ratnarajah's starred repositories

ears_dataset

Expressive Anechoic Recordings of Speech (EARS)

Language: Python | License: NOASSERTION | Stargazers: 95 | Issues: 0

see2sound

Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound

Language: Python | License: Apache-2.0 | Stargazers: 38 | Issues: 0

MESH2IR

This is the official implementation of our mesh-based neural network (MESH2IR) to generate acoustic impulse responses (IRs) for indoor 3D scenes represented using a mesh.

Language: Python | Stargazers: 71 | Issues: 0

FAST-RIR

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

Language: Python | License: AGPL-3.0 | Stargazers: 142 | Issues: 0

EyefulTower

Official release of the Eyeful Tower dataset, a high-fidelity multi-view capture of 11 real-world scenes, from the paper “VR-NeRF: High-Fidelity Virtualized Walkable Spaces” (Xu et al., SIGGRAPH Asia 2023).

License: NOASSERTION | Stargazers: 135 | Issues: 0

L2S

This is the official implementation of our end-to-end binaural audio rendering approach (Listen2Scene) for virtual reality (VR) and augmented reality (AR) applications.

Language: Python | Stargazers: 4 | Issues: 0

diffroomacoustics

A Differentiable Room Acoustics Simulator

Language: Python | License: MIT | Stargazers: 4 | Issues: 0

FastDiff

PyTorch Implementation of FastDiff (IJCAI'22)

Language: Python | Stargazers: 396 | Issues: 0

lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Language: Python | License: Apache-2.0 | Stargazers: 5888 | Issues: 0

real-acoustic-fields

Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark

License: NOASSERTION | Stargazers: 34 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-4-Clause | Stargazers: 9952 | Issues: 0

languagecodec_tmp

Temporary anonymous version

Language: Python | License: Apache-2.0 | Stargazers: 23 | Issues: 0

jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Language: Python | License: NOASSERTION | Stargazers: 2536 | Issues: 0

LRV-Instruction

[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

Language: Python | License: BSD-3-Clause | Stargazers: 233 | Issues: 0

fcn.berkeleyvision.org

Fully Convolutional Networks for Semantic Segmentation by Jonathan Long*, Evan Shelhamer*, and Trevor Darrell. CVPR 2015 and PAMI 2016.

Language: Python | Stargazers: 3295 | Issues: 0

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python | License: BSD-3-Clause | Stargazers: 2580 | Issues: 0

stable-audio-tools

Generative models for conditional audio generation

Language: Python | License: MIT | Stargazers: 2251 | Issues: 0

Catch-A-Waveform

Official PyTorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)

Language: Python | License: NOASSERTION | Stargazers: 181 | Issues: 0

awesome-vision-language-pretraining-papers

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

Stargazers: 1131 | Issues: 0

audioseal

Localized watermarking for AI-generated speech audios, with SOTA on robustness and very fast detector

Language: Python | License: MIT | Stargazers: 350 | Issues: 0

Fast3DScattering-release

Repo for our research paper "Learning Acoustic Scattering Fields for Dynamic Interactive Sound Propagation"

Language: MATLAB | Stargazers: 14 | Issues: 0

audio2photoreal

Code and dataset for photorealistic Codec Avatars driven from audio

Language: Python | License: NOASSERTION | Stargazers: 2601 | Issues: 0

ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

License: NOASSERTION | Stargazers: 1 | Issues: 0