jina-ai / executor-text-clip-text-encoder

MOVED TO https://github.com/jina-ai/executors/tree/main/jinahub/encoders/text/TransformerTorchEncoder

✨ CLIPTextEncoder

CLIPTextEncoder is a class that wraps the text embedding functionality from the CLIP model.

The CLIP model was originally proposed in Learning Transferable Visual Models From Natural Language Supervision.

CLIPTextEncoder encodes a np.ndarray of strings and returns a np.ndarray of floating-point values.

  • Input shape: BatchSize

  • Output shape: BatchSize x EmbeddingDimension
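The shape contract above can be illustrated with a small sketch. `fake_encode` is a hypothetical stand-in that returns random vectors rather than CLIP embeddings; it exists only to show how a batch of strings maps to a `BatchSize x EmbeddingDimension` array, assuming the 512-dimensional float32 output described in the Returns section.

```python
import numpy as np

EMBEDDING_DIM = 512  # embedding size stated in the Returns section of this README


def fake_encode(texts: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the encoder: random vectors, not CLIP embeddings."""
    texts = np.asarray(texts, dtype=str)
    if texts.ndim != 1:
        raise ValueError("expected a 1-D array of strings (shape: BatchSize)")
    rng = np.random.default_rng(seed=0)
    # One float32 row of length EMBEDDING_DIM per input string
    return rng.standard_normal((texts.shape[0], EMBEDDING_DIM)).astype(np.float32)


batch = np.array(["a photo of a dog", "a photo of a cat", "a diagram"])
embeddings = fake_encode(batch)
print(batch.shape, embeddings.shape, embeddings.dtype)
```

Only the shapes and dtype are meaningful here; the values carry no semantic information.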

🌱 Prerequisites

No prerequisites are required to run this executor.

🚀 Usages

🚚 Via JinaHub

Use the prebuilt image from JinaHub in your Python code:

from jina import Flow

f = Flow().add(
    uses='jinahub+docker://CLIPTextEncoder',
    volumes='/your_home_folder/.cache/clip:/root/.cache/clip'
)

or in the .yml config:

jtype: Flow
pods:
  - name: encoder
    uses: 'jinahub+docker://CLIPTextEncoder'
    volumes: '/your_home_folder/.cache/clip:/root/.cache/clip'
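The `volumes` value in both snippets has the form `host_path:container_path`, with `/your_home_folder` as a placeholder. As a minimal sketch, the string could be built from the current user's home directory instead of being hard-coded. The helper name `clip_cache_volume` is hypothetical, and the assumption that the mount is there to keep downloaded model files between runs is not stated in this README.

```python
from pathlib import Path

# Container-side path taken from the snippets in this README
CONTAINER_CACHE = "/root/.cache/clip"


def clip_cache_volume(home=None):
    """Return a 'host_path:container_path' string for the volumes argument.

    Hypothetical helper: defaults to the current user's home directory.
    """
    home_dir = Path(home) if home is not None else Path.home()
    host_cache = home_dir / ".cache" / "clip"
    return f"{host_cache.as_posix()}:{CONTAINER_CACHE}"


print(clip_cache_volume("/home/alice"))
```

The returned string can then be passed as `volumes=clip_cache_volume()` in place of the hard-coded path.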

πŸ“¦οΈ Via Pypi

  1. Install the jinahub-text-clip-text-encoder package

    pip install git+https://github.com/jina-ai/executor-text-clip-text-encoder.git
  2. Use jinahub-text-clip-text-encoder in your code

    from jinahub.encoder.clip_text import CLIPTextEncoder
    from jina import Flow
    
    f = Flow().add(uses=CLIPTextEncoder)

🐳 Via Docker

  1. Clone the repo and build the docker image

    git clone https://github.com/jina-ai/executor-text-clip-text-encoder.git
    cd executor-text-clip-text-encoder
    docker build -t jinahub-clip-text .
  2. Use jinahub-clip-text in your code

    from jina import Flow
    
    f = Flow().add(
        uses='docker://jinahub-clip-text:latest',
        volumes='/your_home_folder/.cache/clip:/root/.cache/clip'
    )

πŸŽ‰οΈ Example

from jina import Flow, Document

f = Flow().add(
    uses='jinahub+docker://CLIPTextEncoder',
    volumes='/your_home_folder/.cache/clip:/root/.cache/clip'
)


def check_emb(resp):
    # Each returned Document should carry a 512-dimensional embedding
    for doc in resp.data.docs:
        # Compare against None: truth-testing an ndarray raises an error
        if doc.embedding is not None:
            assert doc.embedding.shape == (512,)


with f:
    f.post(
        on='/foo',
        inputs=Document(text='your text'),
        on_done=check_emb
    )

Inputs

Documents with the text attribute.

Returns

Documents with the embedding attribute filled with an ndarray of shape (512,) and dtype=float32.
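One common way to use such embeddings is to compare two of them by cosine similarity. The sketch below is not part of the executor; `cosine_similarity` is an illustrative helper, and the vectors are random placeholders with the documented shape and dtype rather than real CLIP output.

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two 1-D vectors."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)


# Placeholder vectors with shape (512,) and dtype float32, not real embeddings
rng = np.random.default_rng(0)
u = rng.standard_normal(512).astype(np.float32)
v = rng.standard_normal(512).astype(np.float32)

print(cosine_similarity(u, u))  # a vector compared with itself: close to 1.0
print(cosine_similarity(u, v))  # some value between -1 and 1
```

In practice, `u` and `v` would be the `embedding` arrays of two returned Documents.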

πŸ”οΈ Reference

About

License: Apache License 2.0

