coura's starred repositories
ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment, and Generate Anything
Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles: Latest Advances on Multimodal Large Language Models
ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM
Chinese-CLIP
A Chinese version of CLIP for Chinese cross-modal retrieval and representation generation.
img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20 hours on one machine.
VisualGLM-6B
A Chinese and English multimodal conversational language model
invisible-watermark
Python library for invisible image watermarking (blind image watermarking)
RegionCLIP
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
UniDetector
Code release for our CVPR 2023 paper "Detecting Everything in the Open World: Towards Universal Object Detection".
Chinese-LLaVA
An open-source, commercially usable multimodal model supporting bilingual (Chinese/English) vision-language dialogue.
mvits_for_class_agnostic_od
[ECCV'22] Official repository of paper titled "Class-agnostic Object Detection with Multi-modal Transformer".
object-centric-ovd
[NeurIPS 2022] Official repository of paper titled "Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection".
SimCSE-Chinese-Pytorch
A reproduction of SimCSE for Chinese, covering both supervised and unsupervised training.
ONNX-ImageNet-1K-Object-Detector
Python scripts for performing object detection with the 1,000 ImageNet labels in ONNX. The repository combines a class-agnostic object localizer, which first detects the objects in the image, with a ResNet50 model trained on ImageNet, which then labels each box.
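The two-stage detect-then-classify pipeline that the description outlines can be sketched as follows. This is a minimal, hypothetical illustration: the localizer and classifier below are plain stubs standing in for the repo's ONNX models (the class-agnostic localizer and the ImageNet-trained ResNet50), and the function names and box format are assumptions, not the repository's actual API.

```python
# Hypothetical sketch of a two-stage open-label detector:
# stage 1 proposes boxes with no class; stage 2 labels each box.
# Both stages are stubs, NOT the repo's ONNX models.

def localize(image):
    """Stage 1 stub: propose candidate boxes (x1, y1, x2, y2), class-agnostic."""
    h, w = image["height"], image["width"]
    # A real localizer would run an ONNX detection model here.
    return [(0, 0, w // 2, h // 2), (w // 2, h // 2, w, h)]

def classify_crop(image, box):
    """Stage 2 stub: assign an ImageNet-style label to a cropped box."""
    labels = ["tabby", "golden retriever", "sports car"]
    x1, y1, x2, y2 = box
    # A real classifier would crop the box and run ResNet50 on it.
    return labels[(x2 - x1 + y2 - y1) % len(labels)]

def detect(image):
    # Run the class-agnostic localizer, then label each box independently.
    return [(box, classify_crop(image, box)) for box in localize(image)]

if __name__ == "__main__":
    print(detect({"height": 100, "width": 100}))
```

The key design point is the decoupling: the localizer never needs class information, so the label set can be swapped (here, any ImageNet class) without retraining the detector.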
BiomedCLIP-LoRA
PyTorch implementation of the BiomedCLIP vision model with LoRA tuning
MoCo-v2-SupContrast
Supervised Contrastive Learning (SupContrast) based on MoCo-v2