mixiao-ai's starred repositories

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python · License: Apache-2.0 · Stars: 1603

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python · License: Apache-2.0 · Stars: 3109

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model approaching GPT-4o performance.

Language: Python · License: MIT · Stars: 4425

Monkey

[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Language: Python · License: MIT · Stars: 1596

LLaMA-VID

Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models

Language: Python · License: Apache-2.0 · Stars: 654

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python · License: Apache-2.0 · Stars: 8086

AnomalyGPT

[AAAI 2024 Oral] AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models

Language: Python · License: NOASSERTION · Stars: 700

industrial-surface-inspection-datasets

Datasets for industrial surface-inspection

Stars: 64

CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

Language: Python · License: Apache-2.0 · Stars: 1659

llama3-from-scratch

A llama3 implementation, built one matrix multiplication at a time

Language: Jupyter Notebook · License: MIT · Stars: 11522

autodistill

Images to inference with no labeling (use foundation models to train supervised models).

Language: Python · License: Apache-2.0 · Stars: 1723

VLM_survey

Collection of AWESOME vision-language models for vision tasks

Stars: 2048

trl

Train transformer language models with reinforcement learning.

Language: Python · License: Apache-2.0 · Stars: 8881

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python · License: MIT · Stars: 1904

Qwen

The official repo of the Qwen (通义千问) chat and pretrained large language models, proposed by Alibaba Cloud.

Language: Python · License: Apache-2.0 · Stars: 12821

open_clip

An open source implementation of CLIP.

Language: Python · License: NOASSERTION · Stars: 9342

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python · License: MIT · Stars: 4165

VILA

VILA: a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops).

Language: Python · License: Apache-2.0 · Stars: 1063

AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization, with a 2x speedup during inference.

Language: Python · License: MIT · Stars: 1495

llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Language: Python · License: MIT · Stars: 2176

Qwen-VL

The official repo of the Qwen-VL (通义千问-VL) chat and pretrained large vision-language models, proposed by Alibaba Cloud.

Language: Python · License: NOASSERTION · Stars: 4434

T-Rex

[ECCV 2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Language: Python · License: NOASSERTION · Stars: 2031

FastDeploy

⚡️ An easy-to-use, fast deep learning model deployment toolkit for ☁️ cloud, 📱 mobile, and 📹 edge. Covers 20+ mainstream image, video, text, and audio scenarios and 150+ SOTA models, with end-to-end optimization and multi-platform, multi-framework support.

Language: C++ · License: Apache-2.0 · Stars: 2845

Personalize-SAM

Personalize the Segment Anything Model (SAM) with one shot in 10 seconds

Language: Python · License: MIT · Stars: 1475

lang-segment-anything

SAM with text prompts

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 1414

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 45767

TensorRT-For-YOLO-Series

TensorRT for the YOLO series (YOLOv10, YOLOv9, YOLOv8, YOLOv7, YOLOv6, YOLOX, YOLOv5), with NMS plugin support

Language: Python · Stars: 853

YOLOv8-TensorRT

YOLOv8 accelerated with TensorRT

Language: C++ · License: MIT · Stars: 1239