coura's starred repositories
ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment, and Generate Anything
Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles: Latest Advances on Multimodal Large Language Models
ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM
Chinese-CLIP
A Chinese version of CLIP for Chinese cross-modal retrieval and representation generation.
img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20 hours on one machine.
VisualGLM-6B
A Chinese and English multimodal conversational language model
invisible-watermark
Python library for invisible image watermarking (blind image watermarking)
RegionCLIP
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
UniDetector
Code release for our CVPR 2023 paper "Detecting Everything in the Open World: Towards Universal Object Detection".
Chinese-LLaVA
An open-source, commercially usable multimodal model supporting bilingual (Chinese/English) vision-language dialogue.
mvits_for_class_agnostic_od
[ECCV'22] Official repository of paper titled "Class-agnostic Object Detection with Multi-modal Transformer".
object-centric-ovd
[NeurIPS 2022] Official repository of paper titled "Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection".
SimCSE-Chinese-Pytorch
A reproduction of SimCSE for Chinese, covering both supervised and unsupervised training.
ONNX-ImageNet-1K-Object-Detector
Python scripts for performing object detection with the 1,000 ImageNet labels in ONNX. The repository combines a class-agnostic object localizer, which first detects the objects in the image, with a ResNet50 model trained on ImageNet, which then labels each box.
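The two-stage detect-then-classify pipeline that the description outlines can be sketched as follows. This is a minimal, hypothetical illustration: the localizer and classifier below are plain stubs standing in for the repo's ONNX models (the class-agnostic localizer and the ImageNet-trained ResNet50), and the function names and box format are assumptions, not the repository's actual API.

```python
# Hypothetical sketch of a two-stage open-label detector:
# stage 1 proposes boxes with no class; stage 2 labels each box.
# Both stages are stubs, NOT the repo's ONNX models.

def localize(image):
    """Stage 1 stub: propose candidate boxes (x1, y1, x2, y2), class-agnostic."""
    h, w = image["height"], image["width"]
    # A real localizer would run an ONNX detection model here.
    return [(0, 0, w // 2, h // 2), (w // 2, h // 2, w, h)]

def classify_crop(image, box):
    """Stage 2 stub: assign an ImageNet-style label to a cropped box."""
    labels = ["tabby", "golden retriever", "sports car"]
    x1, y1, x2, y2 = box
    # A real classifier would crop the box and run ResNet50 on it.
    return labels[(x2 - x1 + y2 - y1) % len(labels)]

def detect(image):
    # Run the class-agnostic localizer, then label each box independently.
    return [(box, classify_crop(image, box)) for box in localize(image)]

if __name__ == "__main__":
    print(detect({"height": 100, "width": 100}))
```

The key design point is the decoupling: the localizer never needs class information, so the label set can be swapped (here, any ImageNet class) without retraining the detector.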
BiomedCLIP-LoRA
PyTorch implementation of the BiomedCLIP vision model with LoRA tuning
MoCo-v2-SupContrast
Supervised Contrastive Learning (SupContrast) based on MoCo-v2