clip

There are 6 repositories under the clip topic.

easychen / pushdeer
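Open-source push service that needs no app: on iOS 14+, scan a QR code and it works. Also supports Quick Apps, iOS and Mac clients, Android clients, and self-made devices.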
app push clip notification-service
Language:C 4438
marqo-ai / marqo
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
deep-learning information-retrieval machinelearning vector-search tensor-search clip multi-modal search-engine transformers vision-language machine-learning semantic-search visual-search natural-language-processing hnsw knn hacktoberfest chatgpt gpt large-language-models
Language:Python 4183
OFA-Sys / Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
chinese computer-vision multi-modal-learning nlp pytorch vision-and-language-pre-training image-text-retrieval clip pretrained-models vision-language deep-learning multi-modal contrastive-loss transformers coreml-models
Language:Python 3717
open-mmlab / mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
image-classification resnet mobilenet pytorch deep-learning swin-transformer beit clip constrastive-learning convnext mae masked-image-modeling moco pretrained-models self-supervised-learning vision-transformer multimodal
Language:Python 3197
CVHub520 / X-AnyLabeling
Effortless data labeling with AI support from Segment Anything and other awesome models.
labeling-tool paddle pytorch resnet sam yolo deep-learning deeplearning onnx clip llm
Language:Python 2593
pharmapsychotic / clip-interrogator
Image to prompt with BLIP and CLIP
clip pytorch
Language:Python 2511
yuanzhoulvpi2017 / zero_nlp
Chinese NLP solutions (large models, data, models, training, inference)
bert nlp transformers gpt2 chatglm-6b clip gpt gpt-neox pytorch text-generation dolly bloom huggingface-transformers falcon llama2 pipeline
Language:Python 2511
rom1504 / clip-retrieval
Easily compute clip embeddings and build a clip retrieval system with them
semantic-search deep-learning multimodal ai clip knn
Language:Jupyter Notebook 2163
RuffianZhong / RWidgetHelper
Rapid Android UI development that works around the many shortcomings of native widgets
circle circleimageview clip cliplayout corner drawableleft drawablewithtext gradient imageview ripper ripperdrawable roundedimageview selector shadow shadowdrawable shadowlayout shadowview shape state textview
Language:Java 1811
jingyi0000 / VLM_survey
Collection of AWESOME vision-language models for vision tasks
computer-vision deep-learning knowledge-distillation survey transfer-learning vision-language-model clip multi-modal-model
1787
roboflow / awesome-openai-vision-api-experiments
Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥
chatgpt computer-vision openai classification clip zero-shot grounding-dino open-vocabulary-detection open-vocabulary-segmentation segment-anything
Language:Python 1582
QIN2DIM / hcaptcha-challenger
🥂 Gracefully face hCaptcha challenge with MoE(ONNX) embedded solution.
yolov5 hcaptcha opencv-python onnx-models hcaptcha-solver solver onnx yolo onnxruntime playwright clip multi-modal zero-shot-classification multi-modal-learning computer-vision image-segmentation object-detection
Language:Python 1422
yzhuoning / Awesome-CLIP
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
clip contrastive-learning pre-training
1032
mbzuai-oryx / Video-ChatGPT
"Video-ChatGPT" is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
chatbot clip gpt-4 llama llava mulit-modal vicuna vision-language vision-language-pretraining video-chatboat video-conversation
Language:Python 959
EdVince / Stable-Diffusion-NCNN
Stable Diffusion in NCNN with C++, supporting txt2img and img2img
clip cpp diffusion mnn ncnn onnx stable-diffusion tensorrt tnn android executable img2img txt2img
Language:C++ 941
haltakov / natural-language-image-search
Search photos on Unsplash using natural language
unsplash clip machine-learning computer-vision image-search photos
Language:Jupyter Notebook 935
unum-cloud / uform
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
huggingface-transformers language-vision multimodal pytorch semantic-search transformer cross-attention vector-search bert neural-network pretrained-models multi-lingual clip openai openclip contrastive-learning representation-learning clustering image-search llava
Language:Python 904
haltakov / natural-language-youtube-search
Search inside YouTube videos using natural language
machine-learning computer-vision search youtube clip
Language:Jupyter Notebook 900
omerbt / Text2LIVE
Official PyTorch implementation of "Text2LIVE: Text-Driven Layered Image and Video Editing" (ECCV 2022 Oral)
eccv2022 image-editing text2live clip generative-model image-manipulation video-editing text-driven-editing single-image single-video
Language:Python 870
ArrowLuo / CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
video-text-retrieval multimodal-learning multimodality multimodal search ranking retrieval-model retrieval msrvtt lsmdc msvd activitynet didemo video-clip-retrieval clip
Language:Python 784
eps696 / aphantasia
CLIP + FFT/DWT/RGB = text to image/video
text-to-image clip text-to-video
Language:Python 771
hila-chefer / Transformer-MM-Explainability
[ICCV 2021- Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.
transformers transformer vqa detr visualization explainability explainable-ai interpretability lxmert visualbert clip
Language:Jupyter Notebook 714
SkyWorkAIGC / SkyPaint-AI-Diffusion
An optimized text-to-image model based on Stable Diffusion. It accepts both Chinese and English text inputs and can generate high-quality images in several modern art styles.
dreambooth machine-learning text-to-image bert clip cv latent-diffusion openai pytorch ai-painting generative-art aigc artificial-intelligence diffusion stable-diffusion text2image dalle2 midjourney
656
Sense-GVT / DeCLIP
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
big-model clip image-text multi-model self-supervised vision-language-pretraining zero-shot
Language:Python 606
pablosichert / react-truncate
React component for truncating multi-line spans and adding an ellipsis.
react truncate ellipsis clip
Language:JavaScript 579
leondgarse / keras_cv_attention_models
Keras beit,caformer,CMT,CoAtNet,convnext,davit,dino,efficientdet,edgenext,efficientformer,efficientnet,eva,fasternet,fastervit,fastvit,flexivit,gcvit,ghostnet,gpvit,hornet,hiera,iformer,inceptionnext,lcnet,levit,maxvit,mobilevit,moganet,nat,nfnets,pvt,swin,tinynet,tinyvit,uniformer,volo,vanillanet,yolor,yolov7,yolov8,yolox,gpt2,llama2, alias kecam
tensorflow visualizing keras attention model imagenet coco recognition detection tf tf2 clip stable-diffusion segment-anything ddpm
Language:Python 561
pengsongyou / openscene
[CVPR'23] OpenScene: 3D Scene Understanding with Open Vocabularies
3d-scene-understanding clip semantic-segmentation llm cvpr2023 point-cloud-segmentation point-clouds scannet matterport3d nuscenes
Language:Python 553
SkalskiP / awesome-foundation-and-multimodal-models
👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]
blip clip foundational-models grounding-dino llava multimodal segment-anything computer-vision nlp open-vocabulary-detection open-vocabulary-segmentation zero-shot-detection image-captioning
Language:Python 514
keshiim / ZMJImageEditor
ZMJImageEditor is an image editing component modeled on WeChat's. It is powerful and easy to integrate, supporting drawing, text, rotation, cropping, stickers, and other functions.
image editor wechat image-editor editor-helper imageeditor rotation clip draw testing text text-editor
Language:Objective-C 497
open-compass / VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4v, Gemini, QwenVLPlus, 40+ HF models, and 20+ benchmarks
gpt-4v large-language-models llava multi-modal openai vqa llm openai-api qwen gpt computer-vision pytorch gpt4 chatgpt clip vit evaluation claude gemini
Language:Python 445
v-iashin / video_features
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, ResNet features.
pytorch multi-gpu feature-extraction parallel video-features visual-features audio-features i3d vggish r2plus1d resnet raft optical-flow ig65m clip s3d laion swin timm vit
Language:Python 444
cliport / cliport
CLIPort: What and Where Pathways for Robotic Manipulation
clip robotics vision deep-learning natural-language-processing grounding vision-language manipulation pytorch rearrangement computer-vision
Language:Jupyter Notebook 422
monatis / clip.cpp
CLIP inference in plain C/C++ with no extra dependencies
c clip cpp ggml image-search multimodal
Language:C 394
Chrisvin / EasyReveal
Android Easy Reveal Library
android easy reveal easyreveal reveal-animations clip library android-library
Language:Kotlin 353
yangjianxin1 / CLIP-Chinese
Chinese CLIP pre-trained model
clip chinese
Language:Python 344
zcf0508 / autocut-client
AutoCut Client
autocut clip electron video vue
Language:TypeScript 312
