marella / ctransformers

Python bindings for the Transformer models implemented in C/C++ using the GGML library.

Inputting embeddings directly

liechtym opened this issue

I have a use case where I need to modify input embeddings before they are passed to the model for generation. Ideally it would look something like this:

from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7b-Chat-GGUF",
    model_file="llama-2-7b-chat.Q4_K_M.gguf",
    model_type="llama",
    gpu_layers=50,
)

embeddings = llm.embed('Some text')

# Code to modify embeddings
modified_embeddings = some_function(embeddings)

# Desired usage: generate directly from the modified embeddings
res = llm.generate(inputs_embeds=modified_embeddings)

Is there any way to do this?
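
To make the "modify embeddings" step concrete, here is a minimal sketch of what some_function might look like. It assumes the embeddings arrive as a flat sequence of floats, and the shift-and-rescale logic (and the shift parameter) is purely hypothetical, chosen only for illustration:

```python
import math
from typing import List, Sequence


def some_function(embeddings: Sequence[float], shift: float = 0.1) -> List[float]:
    """Hypothetical modification: shift every component, then rescale
    so the result keeps the L2 norm of the original vector."""
    shifted = [x + shift for x in embeddings]
    original_norm = math.sqrt(sum(x * x for x in embeddings))
    shifted_norm = math.sqrt(sum(x * x for x in shifted))
    # If either norm is zero there is nothing meaningful to rescale.
    if original_norm == 0.0 or shifted_norm == 0.0:
        return shifted
    factor = original_norm / shifted_norm
    return [x * factor for x in shifted]


# Stand-in for the output of llm.embed('Some text')
modified_embeddings = some_function([3.0, 4.0], shift=1.0)
```

The missing piece is still a way to hand modified_embeddings back to the model for generation.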