CLIPEncoder wraps the image and text embedding functionality of the CLIP model from Hugging Face Transformers. It embeds documents using either the CLIP text encoder or the CLIP vision encoder, depending on each document's content type.
For more information on GPU usage and volume mounting, please refer to the documentation.
For more information on the CLIP model, check out the blog post, the paper, and the Hugging Face documentation.
Via Docker image:

```python
from jina import Flow

f = Flow().add(uses='jinahub+docker://CLIPEncoder')
```
Via source code:

```python
from jina import Flow

f = Flow().add(uses='jinahub://CLIPEncoder')
```
- To override `__init__` args & kwargs, use `.add(..., uses_with={'key': 'value'})`
- To override class metas, use `.add(..., uses_metas={'key': 'value'})`