colorful's starred repositories

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python | License: MIT | Stargazers: 1767 | Issues: 0

deep-seek

LLM powered retrieval engine designed to process a ton of sources to collect a comprehensive list of entities.

Language: TypeScript | License: MIT | Stargazers: 290 | Issues: 0

MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Language: Python | License: BSD-3-Clause | Stargazers: 25028 | Issues: 0

imp

A family of highly capable yet efficient large multimodal models

Language: Python | License: Apache-2.0 | Stargazers: 142 | Issues: 0

Bunny

A family of lightweight multimodal models.

Language: Python | License: Apache-2.0 | Stargazers: 707 | Issues: 0

FuseAI

FuseLLM & FuseChat Project

Language: Python | Stargazers: 348 | Issues: 0

Language: Python | License: MIT | Stargazers: 47 | Issues: 0

MemSAM

[CVPR 2024 Oral] MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation.

Language: Python | License: MIT | Stargazers: 44 | Issues: 0

mllm

Fast Multimodal LLM on Mobile Devices

Language: C++ | License: MIT | Stargazers: 166 | Issues: 0

LLMSpeculativeSampling

Fast inference from large language models via speculative decoding

Language: Python | Stargazers: 396 | Issues: 0

SwiftSage

SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks

Language: Python | Stargazers: 227 | Issues: 0

LLaVA-UHD

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

Language: Python | Stargazers: 216 | Issues: 0

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python | License: Apache-2.0 | Stargazers: 4428 | Issues: 0

MADTP

MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer

License: Apache-2.0 | Stargazers: 14 | Issues: 0

Language: Python | Stargazers: 12 | Issues: 0

PSALM

This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"

Language: Python | License: Apache-2.0 | Stargazers: 130 | Issues: 0

JiuTian

[CVPR 2024] JiuTian, a Multimodal Large Language Model from HITSZ

License: MIT | Stargazers: 102 | Issues: 0

LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Language: Python | License: Apache-2.0 | Stargazers: 1511 | Issues: 0

crg

PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"

Language: Python | License: MIT | Stargazers: 20 | Issues: 0

Language: Python | License: NOASSERTION | Stargazers: 696 | Issues: 0

annotated_deep_learning_paper_implementations

🧑‍🏫 60 implementations/tutorials of deep learning papers with side-by-side notes 📝, including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Jupyter Notebook | License: MIT | Stargazers: 49328 | Issues: 0

Language: Python | License: NOASSERTION | Stargazers: 7942 | Issues: 0

Agent-FLAN

[ACL2024 Findings] Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

License: Apache-2.0 | Stargazers: 285 | Issues: 0

MiniSora-DiT

minisora-DiT, a DiT reproduction based on XTuner from the open source community MiniSora

Language: Python | License: Apache-2.0 | Stargazers: 31 | Issues: 0

OpenAI-CLIP-Feature

Easy-to-use, efficient code for extracting OpenAI CLIP (Global/Grid) features from images and text.

Language: Python | License: MIT | Stargazers: 91 | Issues: 0

MaskFormer

Per-Pixel Classification is Not All You Need for Semantic Segmentation (NeurIPS 2021, spotlight)

Language: Python | License: NOASSERTION | Stargazers: 1300 | Issues: 0

LAMM

[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents

Language: Python | Stargazers: 272 | Issues: 0

Osprey

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

Language: Python | License: Apache-2.0 | Stargazers: 701 | Issues: 0

unify-parameter-efficient-tuning

Implementation of paper "Towards a Unified View of Parameter-Efficient Transfer Learning" (ICLR 2022)

Language: Python | License: Apache-2.0 | Stargazers: 493 | Issues: 0