SummaryExtractor extract failed
yuvraj123-verma opened this issue · comments
I want to use a custom LLM (TheBloke/Airoboros-L2-70B-GPT4-m2.0-AWQ) for summary extraction and intent extraction instead of an OpenAI model.
I am running the modified files below and getting the following error:
Error: level=error msg="SummaryExtractor extract failed: extractor error: SummaryExtractor summarize failed (original error: API returned unexpected status code: 404: )"
Please provide a solution for this issue.
######### Config.yaml #########
llm:
  # openai or anthropic
  service: "openai"
  # OpenAI: gpt-3.5-turbo, gpt-4, gpt-3.5-turbo-16k, gpt-4-32k; Anthropic: claude-instant-1 or claude-2
  model: "TheBloke/Airoboros-L2-70B-GPT4-m2.0-AWQ"
  ## OpenAI-specific settings
  # Only used for Azure OpenAI API
  azure_openai_endpoint:
  # For Azure OpenAI API deployments, the model may be deployed with custom deployment names.
  # Set the deployment names if you encounter HTTP 404 errors in the logs:
  # "The API deployment for this resource does not exist."
  azure_openai:
    # llm.model name is used as the deployment name as a reasonable default if not set,
    # assuming the base model is deployed with a deployment name matching the model name
    llm_deployment: "gpt-3.5-turbo-TheBloke/Airoboros-L2-70B-GPT4-m2.0-AWQ"
    # embeddings deployment is required when Zep is configured to use OpenAI embeddings
    # embedding_deployment: "text-embedding-ada-002-customname"
  # Use only with an alternate OpenAI-compatible API endpoint
  openai_endpoint: "https://wm4o3bvyqlaf5c-5000.proxy.runpod.net/v1"
  # openai_endpoint:
  openai_org_id:
nlp:
  server_url: "http://localhost:5557"
memory:
  message_window: 2
extractors:
  documents:
    embeddings:
      enabled: true
      max_procs: 2
      chunk_size: 1000
      buffer_size: 1000
      dimensions: 384
      service: "local"
      # dimensions: 1536
      # service: "openai"
  messages:
    summarizer:
      enabled: true
    entities:
      enabled: true
    intent:
      enabled: true
    embeddings:
      enabled: true
      dimensions: 384
      service: "local"
      # dimensions: 1536
      # service: "openai"
store:
  type: "postgres"
  postgres:
    dsn: "postgres://postgres:postgres@localhost:5432/?sslmode=disable"
server:
  # Specify the host to listen on. Defaults to 0.0.0.0
  host: 0.0.0.0
  port: 8000
  # Is the Web UI enabled?
  # Warning: The Web UI is not secured by authentication and should not be enabled if
  # Zep is exposed to the public internet.
  web_enabled: true
  # The maximum size of a request body, in bytes. Defaults to 5MB.
  max_request_size: 5242880
auth:
  # Set to true to enable authentication
  required: false
  # Do not use this secret in production. The ZEP_AUTH_SECRET environment variable should be
  # set to a cryptographically secure secret. See the Zep docs for details.
  secret: "do-not-use-this-secret-in-production"
data:
  # PurgeEvery is the period between hard deletes, in minutes.
  # If set to 0 or undefined, hard deletes will not be performed.
  purge_every: 60
log:
  level: "info"
# Custom Prompts Configuration
# Allows customization of extractor prompts.
custom_prompts:
  summarizer_prompts:
    # Anthropic Guidelines:
    # - Use XML-style tags like <current_summary> as element identifiers.
    # - Include {{.PrevSummary}} and {{.MessagesJoined}} as template variables.
    # - Clearly explain model instructions, e.g., "Review content inside <current_summary></current_summary> tags".
    # - Provide a clear example within the prompt.
    #
    # Example format:
    # anthropic: |
    #   <YOUR INSTRUCTIONS HERE>
    #   <example>
    #     <PROVIDE AN EXAMPLE>
    #   </example>
    #   <current_summary>{{.PrevSummary}}</current_summary>
    #   <new_lines>{{.MessagesJoined}}</new_lines>
    #   Response without preamble.
    #
    # If left empty, the default Anthropic summary prompt from zep/pkg/extractors/prompts.go will be used.
    anthropic: |
    # OpenAI summarizer prompt configuration.
    # Guidelines:
    # - Include {{.PrevSummary}} and {{.MessagesJoined}} as template variables.
    # - Provide a clear example within the prompt.
    #
    # Example format:
    # openai: |
    #   <YOUR INSTRUCTIONS HERE>
    #   Example:
    #     <PROVIDE AN EXAMPLE>
    #   Current summary: {{.PrevSummary}}
    #   New lines of conversation: {{.MessagesJoined}}
    #   New summary:
    #
    # If left empty, the default OpenAI summary prompt from zep/pkg/extractors/prompts.go will be used.
    openai: |
      Review the Current Content, if there is one, and the New Lines of the provided conversation. Create a concise summary
      of the conversation, adding from the New Lines to the Current summary.
      If the New Lines are meaningless, return the Current Content.
      EXAMPLE
      Current summary:
      The human inquires about Led Zeppelin's lead singer and other band members. The AI identifies Robert Plant as the
      lead singer.
      New lines of conversation:
      Human: Who were the other members of Led Zeppelin?
      AI: The other founding members of Led Zeppelin were Jimmy Page (guitar), John Paul Jones (bass, keyboards), and
      John Bonham (drums).
      New summary:
      The human inquires about Led Zeppelin's lead singer and other band members. The AI identifies Robert Plant as the lead
      singer and lists the founding members as Jimmy Page, John Paul Jones, and John Bonham.
      EXAMPLE END
      Current summary:
      {{.PrevSummary}}
      New lines of conversation:
      {{.MessagesJoined}}
      New summary:
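For `openai_endpoint` to work, the backend behind it must implement the OpenAI REST routes: an OpenAI-compatible client joins the configured base URL with the standard route path, so the server in this config must answer `POST .../v1/chat/completions`. A minimal sketch of that URL derivation (the helper name is hypothetical; the route name follows the OpenAI API convention, and the exact client internals are an assumption):

```python
# Hypothetical helper showing which URL an OpenAI-compatible client will call
# for chat completions, given the openai_endpoint configured above.

def chat_completions_url(openai_endpoint: str) -> str:
    """Join the configured base endpoint with the standard chat-completions route."""
    return openai_endpoint.rstrip("/") + "/chat/completions"

base = "https://wm4o3bvyqlaf5c-5000.proxy.runpod.net/v1"
print(chat_completions_url(base))
```

If the server behind the proxy does not serve a POST handler at that path (for example, if it only exposes a UI or a non-OpenAI API), the client surfaces exactly the "API returned unexpected status code: 404" error seen in the logs.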
######### docker-compose.yaml #########
version: "3.7"
services:
  db:
    image: ghcr.io/getzep/postgres:latest
    container_name: zep-postgres
    restart: on-failure
    shm_size: "128mb" # Increase this if vacuuming fails with a "no space left on device" error
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
    build:
      context: .
      dockerfile: Dockerfile.postgres
    networks:
      - zep-network
    volumes:
      - zep-db:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-q", "-d", "postgres", "-U", "postgres"]
      interval: 5s
      timeout: 5s
      retries: 5
    ports:
      - "5432:5432"
  nlp:
    image: ghcr.io/getzep/zep-nlp-server:latest
    container_name: zep-nlp
    env_file:
      - .env # You can set your embedding-related variables here
    restart: on-failure
    networks:
      - zep-network
    healthcheck:
      test: timeout 10s bash -c ':> /dev/tcp/127.0.0.1/5557' || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 45s
    ports:
      - "5557:5557"
  zep:
    image: ghcr.io/getzep/zep:latest
    container_name: zep
    restart: on-failure
    depends_on:
      db:
        condition: service_healthy
      nlp:
        condition: service_healthy
    ports:
      - "8000:8000"
    volumes:
      - ./config.yaml:/app/config.yaml
    environment:
      - ZEP_STORE_POSTGRES_DSN=postgres://postgres:postgres@db:5432/postgres?sslmode=disable
      - ZEP_NLP_SERVER_URL=http://nlp:5557
      - ZEP_EXTRACTORS_DOCUMENTS_EMBEDDINGS_SERVICE=local
      - ZEP_EXTRACTORS_DOCUMENTS_EMBEDDINGS_DIMENSIONS=384
      - ZEP_EXTRACTORS_MESSAGES_EMBEDDINGS_SERVICE=local
      - ZEP_EXTRACTORS_MESSAGES_EMBEDDINGS_DIMENSIONS=384
    env_file:
      - .env # Store your OpenAI API key here as ZEP_OPENAI_API_KEY
    build:
      context: .
      dockerfile: Dockerfile
    healthcheck:
      test: timeout 10s bash -c ':> /dev/tcp/127.0.0.1/8000' || exit 1
      interval: 5s
      timeout: 10s
      retries: 3
      start_period: 40s
    networks:
      - zep-network
networks:
  zep-network:
    driver: bridge
volumes:
  zep-db:
####### .env ##########
ZEP_OPENAI_API_KEY='sk-****'
ZEP_EXTRACTORS_DOCUMENTS_EMBEDDINGS_SERVICE=local
ZEP_EXTRACTORS_DOCUMENTS_EMBEDDINGS_DIMENSIONS=384
ZEP_EXTRACTORS_MESSAGES_EMBEDDINGS_SERVICE=local
ZEP_EXTRACTORS_MESSAGES_EMBEDDINGS_DIMENSIONS=384
ZEP_EMBEDDINGS_MESSAGES_ENABLED=true
ZEP_EMBEDDINGS_MESSAGES_MODEL="thenlper/gte-small"
ZEP_EMBEDDINGS_DOCUMENTS_ENABLED=true
ZEP_EMBEDDINGS_DOCUMENTS_MODEL="thenlper/gte-small"
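The `ZEP_*` variables in this `.env` and in `docker-compose.yaml` line up with the dotted keys in `config.yaml` (e.g. `store.postgres.dsn` becomes `ZEP_STORE_POSTGRES_DSN`), which suggests a simple prefix-and-uppercase mapping. A small sketch of that apparent mapping (the helper name is hypothetical, inferred from the variables shown above rather than from Zep's source):

```python
# Hypothetical helper illustrating the apparent mapping between Zep config
# keys and the ZEP_* environment variables used to override them.

def zep_env_var(config_key: str) -> str:
    """Map a dotted config key like 'store.postgres.dsn' to 'ZEP_STORE_POSTGRES_DSN'."""
    return "ZEP_" + config_key.replace(".", "_").upper()

print(zep_env_var("extractors.messages.embeddings.service"))
```

This makes it easy to double-check that an override in `.env` actually targets the config key you intend, rather than silently being ignored.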
Does RunPod offer an OpenAI-compatible API? The 404 error below indicates that the API endpoint Zep expects is not available:
SummaryExtractor summarize failed (original error: API returned unexpected status code: 404: