Varun Ganjigunte Prakash (Varun-GP)

Company: Frontera Health Inc.

Location: Bengaluru

Home Page: varun-gp.github.io

Varun Ganjigunte Prakash's starred repositories

MMA-DFER

This repository provides an official implementation for the paper MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild.

Language: Python | Stargazers: 7 | Issues: 0

2024-ICLR-Norton

Multi-granularity Correspondence Learning from Long-term Noisy Videos [ICLR 2024, Oral]

Language: Python | License: Apache-2.0 | Stargazers: 105 | Issues: 0

TimeChat

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Language: Python | License: BSD-3-Clause | Stargazers: 258 | Issues: 0

Awesome-MLLM-Hallucination

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

Stargazers: 335 | Issues: 0

MyVLM

Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024)

Language: Python | License: NOASSERTION | Stargazers: 136 | Issues: 0

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python | License: Apache-2.0 | Stargazers: 1891 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 37 | Issues: 0

EmotionCLIP

[CVPR 2023] Code for "Learning Emotion Representations from Verbal and Nonverbal Communication"

Language: Python | License: MIT | Stargazers: 33 | Issues: 0

PLLaVA

Official repository for the paper PLLaVA

Language: Python | Stargazers: 533 | Issues: 0

furuta_pendulum

LQR, MPC and DRL approaches to control the Furuta pendulum.

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 36 | Issues: 0

roomac_ros

ROS packages for roomac autonomous mobile manipulation robot

Language: Python | License: GPL-3.0 | Stargazers: 31 | Issues: 0

T3AL

Official PyTorch implementation of "Test-Time Zero-Shot Temporal Action Localization", CVPR 2024

Language: Python | Stargazers: 38 | Issues: 0

Language: Python | Stargazers: 88 | Issues: 0

MA-LMM

(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Language: Python | License: MIT | Stargazers: 208 | Issues: 0

PySceneDetect

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Language: Python | License: BSD-3-Clause | Stargazers: 3092 | Issues: 0

Language: Python | Stargazers: 238 | Issues: 0

LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 29691 | Issues: 0

Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Language: Python | License: Apache-2.0 | Stargazers: 525 | Issues: 0

Language: Jupyter Notebook | License: MIT | Stargazers: 123 | Issues: 0

laughter

Learning embeddings for laughter categorization

Language: Python | Stargazers: 34 | Issues: 0

portaudio

PortAudio is a cross-platform, open-source C language library for real-time audio input and output.

Language: C | License: NOASSERTION | Stargazers: 1418 | Issues: 0

ONE-PEACE

A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

Language: Python | License: Apache-2.0 | Stargazers: 920 | Issues: 0

Real-Time-Sound-Event-Detection

This repository contains a Python implementation of a Sound Event Detection system that works in real time.

Language: Python | Stargazers: 41 | Issues: 0

PromptingWhisper

Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation

Language: Python | Stargazers: 131 | Issues: 0

Language: Python | License: MIT | Stargazers: 199 | Issues: 0

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 25777 | Issues: 0

Caption-Anything

Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything

Language: Python | License: BSD-3-Clause | Stargazers: 1655 | Issues: 0

MemGPT

Create LLM agents with long-term memory and custom tools 📚🦙

Language: Python | License: Apache-2.0 | Stargazers: 11259 | Issues: 0

PCA-EVAL

[ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain

Language: Jupyter Notebook | Stargazers: 98 | Issues: 0

Awesome_Multimodel_LLM

Awesome_Multimodel is a curated GitHub repository that collects resources for Multimodal Large Language Models (MLLM). It covers datasets, tuning techniques, in-context learning, visual reasoning, foundational models, and more. Stay updated with the latest advancements.

Stargazers: 239 | Issues: 0