Hanoona Rasheed (hanoonaR)

Company: MBZUAI

Location: Dubai, UAE

Home Page: https://www.hanoonarasheed.com/

Organizations
mbzuai-oryx

Hanoona Rasheed's starred repositories

baple

[MICCAI 2024] Official code repository for the paper "BAPLe: Backdoor Attacks on Medical Foundation Models using Prompt Learning".

Language: Python | License: MIT | Stars: 43 | Issues: 0

segment-anything-2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stars: 9976 | Issues: 0

GroupMamba

Official implementation of the paper "GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model".

Language: Python | License: MIT | Stars: 54 | Issues: 0

VideoGPT-plus

Official repository of the paper "VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding".

Language: Python | License: CC-BY-4.0 | Stars: 177 | Issues: 0

corenet

CoreNet: A library for training deep neural networks

Language: Python | License: NOASSERTION | Stars: 6898 | Issues: 0

MobiLlama

MobiLlama: A small language model tailored for edge devices.

Language: Python | License: Apache-2.0 | Stars: 577 | Issues: 0

LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Language: Python | Stars: 783 | Issues: 0

MAVOS

Efficient Video Object Segmentation via Modulated Cross-Attention Memory

License: BSD-3-Clause | Stars: 45 | Issues: 0

ovsam

[ECCV 2024] The official code of paper "Open-Vocabulary SAM".

Language: Python | License: NOASSERTION | Stars: 887 | Issues: 0

Video-LLaVA

PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models

Language: Python | Stars: 233 | Issues: 0

groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Language: Python | Stars: 730 | Issues: 0

XM-GAN

[MICCAI 2023, Early Accept] Official code repository for the paper "Cross-modulated Few-shot Image Generation for Colorectal Tissue Classification".

Language: Python | Stars: 44 | Issues: 0

GoogleBard-VisUnderstand

How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges

Stars: 30 | Issues: 0

vafa

[MICCAI 2023] Official code repository for the paper "Frequency Domain Adversarial Training for Robust Volumetric Medical Segmentation".

Language: Python | License: MIT | Stars: 47 | Issues: 0

XrayGPT

[BIONLP@ACL 2024] XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

Language: Python | Stars: 455 | Issues: 0

ClimateGPT

[EMNLP'23] ClimateGPT: a specialized LLM for conversations on climate change and sustainability topics in both English and Arabic.

Language: Python | Stars: 73 | Issues: 0

Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversations about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous quantitative evaluation benchmark for video-based conversational models.

Language: Python | License: CC-BY-4.0 | Stars: 1130 | Issues: 0

XPretrain

Multi-modality pre-training

Language: Python | License: NOASSERTION | Stars: 466 | Issues: 0

SwiftFormer

[ICCV'23] Official repository of the paper "SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications".

Language: Python | Stars: 239 | Issues: 0

Transformer-MM-Explainability

[ICCV 2021, Oral] Official PyTorch implementation of "Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers", a novel method to visualize any Transformer-based network, including examples for DETR and VQA.

Language: Jupyter Notebook | License: MIT | Stars: 765 | Issues: 0

pointnet2

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

Language: Python | License: NOASSERTION | Stars: 3048 | Issues: 0

LanguageGroundedSemseg

Implementation of the ECCV 2022 paper "Language-Grounded Indoor 3D Semantic Segmentation in the Wild".

Language: Python | Stars: 97 | Issues: 0

DenseCLIP

[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

Language: Python | Stars: 505 | Issues: 0

CoOp

Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)

Language: Python | License: MIT | Stars: 1651 | Issues: 0

object-centric-ovd

[NeurIPS 2022] Official repository for the paper "Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection".

Language: Jupyter Notebook | License: Apache-2.0 | Stars: 285 | Issues: 0

PaLM-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the PaLM architecture. Basically ChatGPT, but with PaLM.

Language: Python | License: MIT | Stars: 7667 | Issues: 0

awesome-chatgpt-prompts

A curated collection of ChatGPT prompts to help you use ChatGPT more effectively.

Language: HTML | License: CC0-1.0 | Stars: 108430 | Issues: 0