getzep / zep

Zep: Long-Term Memory for AI Assistants.

Home page: https://docs.getzep.com


SummaryExtractor extract failed

yuvraj123-verma opened this issue · comments

I want to use a custom LLM (TheBloke/Airoboros-L2-70B-GPT4-m2.0-AWQ) for summary extraction and intent extraction instead of an OpenAI model.

I am running the custom model with the modified files below, and I am getting this error:
Error: level=error msg="SummaryExtractor extract failed: extractor error: SummaryExtractor summarize failed (original error: API returned unexpected status code: 404: )"
Please provide a solution for this issue.

######### Config.yaml #########

llm:
  # openai or anthropic
  service: "openai"
  # OpenAI: gpt-3.5-turbo, gpt-4, gpt-3.5-turbo-16k, gpt-4-32k; Anthropic: claude-instant-1 or claude-2
  model: "TheBloke/Airoboros-L2-70B-GPT4-m2.0-AWQ"
  ## OpenAI-specific settings
  # Only used for Azure OpenAI API
  azure_openai_endpoint:
  # for Azure OpenAI API deployment, the model may be deployed with custom deployment names
  # set the deployment names if you encounter HTTP 404 errors in the logs:
  #   "The API deployment for this resource does not exist."
  azure_openai:
  # llm.model name is used as deployment name as reasonable default if not set
  # assuming base model is deployed with deployment name matching model name
  llm_deployment: "gpt-3.5-turbo-TheBloke/Airoboros-L2-70B-GPT4-m2.0-AWQ"
  # embeddings deployment is required when Zep is configured to use OpenAI embeddings
  #   embedding_deployment: "text-embedding-ada-002-customname"
  # Use only with an alternate OpenAI-compatible API endpoint
  openai_endpoint: "https://wm4o3bvyqlaf5c-5000.proxy.runpod.net/v1"
  # openai_endpoint:
  openai_org_id:
nlp:
  server_url: "http://localhost:5557"
memory:
  message_window: 2

extractors:
  documents:
    embeddings:
      enabled: true
      max_procs: 2
      chunk_size: 1000
      buffer_size: 1000
      dimensions: 384
      service: "local"
  #      dimensions: 1536
  # service: "openai"
  messages:
    summarizer:
      enabled: true
    entities:
      enabled: true
    intent:
      enabled: true
    embeddings:
      enabled: true
      dimensions: 384
      service: "local"
#      dimensions: 1536
#  service: "openai"

store:
  type: "postgres"
  postgres:
    dsn: "postgres://postgres:postgres@localhost:5432/?sslmode=disable"
server:
  # Specify the host to listen on. Defaults to 0.0.0.0
  host: 0.0.0.0
  port: 8000
  # Is the Web UI enabled?
  # Warning: The Web UI is not secured by authentication and should not be enabled if
  # Zep is exposed to the public internet.
  web_enabled: true
  # The maximum size of a request body, in bytes. Defaults to 5MB.
  max_request_size: 5242880
auth:
  # Set to true to enable authentication
  required: false
  # Do not use this secret in production. The ZEP_AUTH_SECRET environment variable should be
  # set to a cryptographically secure secret. See the Zep docs for details.
  secret: "do-not-use-this-secret-in-production"
data:
  #  PurgeEvery is the period between hard deletes, in minutes.
  #  If set to 0 or undefined, hard deletes will not be performed.
  purge_every: 60
log:
  level: "info"
# Custom Prompts Configuration
# Allows customization of extractor prompts.
custom_prompts:
  summarizer_prompts:
    # Anthropic Guidelines:
    # - Use XML-style tags like <current_summary> as element identifiers.
    # - Include {{.PrevSummary}} and {{.MessagesJoined}} as template variables.
    # - Clearly explain model instructions, e.g., "Review content inside <current_summary></current_summary> tags".
    # - Provide a clear example within the prompt.
    #
    # Example format:
    # anthropic: |
    #   <YOUR INSTRUCTIONS HERE>
    #   <example>
    #     <PROVIDE AN EXAMPLE>
    #   </example>
    #   <current_summary>{{.PrevSummary}}</current_summary>
    #   <new_lines>{{.MessagesJoined}}</new_lines>
    #   Response without preamble.
    #
    # If left empty, the default Anthropic summary prompt from zep/pkg/extractors/prompts.go will be used.
    anthropic: |

    # OpenAI summarizer prompt configuration.
    # Guidelines:
    # - Include {{.PrevSummary}} and {{.MessagesJoined}} as template variables.
    # - Provide a clear example within the prompt.
    #
    # Example format:
    # openai: |
    #   <YOUR INSTRUCTIONS HERE>
    #   Example:
    #     <PROVIDE AN EXAMPLE>
    #   Current summary: {{.PrevSummary}}
    #   New lines of conversation: {{.MessagesJoined}}
    #   New summary:
    #
    # If left empty, the default OpenAI summary prompt from zep/pkg/extractors/prompts.go will be used.
    openai: |
      Review the Current Content, if there is one, and the New Lines of the provided conversation. Create a concise summary 
      of the conversation, adding from the New Lines to the Current summary.
      If the New Lines are meaningless, return the Current Content.
      EXAMPLE
      Current summary:
      The human inquires about Led Zeppelin's lead singer and other band members. The AI identifies Robert Plant as the 
      lead singer.
      New lines of conversation:
      Human: Who were the other members of Led Zeppelin?
      AI: The other founding members of Led Zeppelin were Jimmy Page (guitar), John Paul Jones (bass, keyboards), and 
      John Bonham (drums).
      New summary:
      The human inquires about Led Zeppelin's lead singer and other band members. The AI identifies Robert Plant as the lead
      singer and lists the founding members as Jimmy Page, John Paul Jones, and John Bonham.
      EXAMPLE END
      Current summary:
      {{.PrevSummary}}
      New lines of conversation:
      {{.MessagesJoined}}
      New summary:

######### docker-compose.yaml #########

version: "3.7"
services:
  db:
    image: ghcr.io/getzep/postgres:latest
    container_name: zep-postgres
    restart: on-failure
    shm_size: "128mb" # Increase this if vacuuming fails with a "no space left on device" error
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
    build:
      context: .
      dockerfile: Dockerfile.postgres
    networks:
      - zep-network
    volumes:
      - zep-db:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-q", "-d", "postgres", "-U", "postgres"]
      interval: 5s
      timeout: 5s
      retries: 5
    ports:
      - "5432:5432"
  nlp:
    image: ghcr.io/getzep/zep-nlp-server:latest
    container_name: zep-nlp
    env_file:
      - .env # You can set your embedding-related variables here
    restart: on-failure
    networks:
      - zep-network
    healthcheck:
      test: timeout 10s bash -c ':> /dev/tcp/127.0.0.1/5557' || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 45s
    ports:
      - "5557:5557"
  zep:
    image: ghcr.io/getzep/zep:latest
    container_name: zep
    restart: on-failure
    depends_on:
      db:
        condition: service_healthy
      nlp:
        condition: service_healthy
    ports:
      - "8000:8000"
    volumes:
      - ./config.yaml:/app/config.yaml
    environment:
      - ZEP_STORE_POSTGRES_DSN=postgres://postgres:postgres@db:5432/postgres?sslmode=disable
      - ZEP_NLP_SERVER_URL=http://nlp:5557
      - ZEP_EXTRACTORS_DOCUMENTS_EMBEDDINGS_SERVICE=local
      - ZEP_EXTRACTORS_DOCUMENTS_EMBEDDINGS_DIMENSIONS=384
      - ZEP_EXTRACTORS_MESSAGES_EMBEDDINGS_SERVICE=local
      - ZEP_EXTRACTORS_MESSAGES_EMBEDDINGS_DIMENSIONS=384

    env_file:
      - .env # Store your OpenAI API key here as ZEP_OPENAI_API_KEY
    build:
      context: .
      dockerfile: Dockerfile
    healthcheck:
      test: timeout 10s bash -c ':> /dev/tcp/127.0.0.1/8000' || exit 1
      interval: 5s
      timeout: 10s
      retries: 3
      start_period: 40s
    networks:
      - zep-network
networks:
  zep-network:
    driver: bridge
volumes:
  zep-db:

####### .env ##########

ZEP_OPENAI_API_KEY='sk-****'
ZEP_EXTRACTORS_DOCUMENTS_EMBEDDINGS_SERVICE=local
ZEP_EXTRACTORS_DOCUMENTS_EMBEDDINGS_DIMENSIONS=384
ZEP_EXTRACTORS_MESSAGES_EMBEDDINGS_SERVICE=local
ZEP_EXTRACTORS_MESSAGES_EMBEDDINGS_DIMENSIONS=384
ZEP_EMBEDDINGS_MESSAGES_ENABLED=true
ZEP_EMBEDDINGS_MESSAGES_MODEL="thenlper/gte-small"
ZEP_EMBEDDINGS_DOCUMENTS_ENABLED=true
ZEP_EMBEDDINGS_DOCUMENTS_MODEL="thenlper/gte-small"

Does RunPod offer an OpenAI-compatible API? The 404 error below indicates that the API endpoint Zep expects is not available at that URL.

SummaryExtractor summarize failed (original error: API returned unexpected status code: 404:
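One way to check this directly is to send a request to the chat-completions route that an OpenAI-compatible client will POST to under the configured `openai_endpoint`. Below is a minimal stdlib-only Python sketch; the URL and model name are copied from the config.yaml above, while the probe itself (and the placeholder bearer token) are assumptions about how the RunPod backend is exposed, not part of Zep:

```python
import json
import urllib.error
import urllib.request


def chat_completions_url(openai_endpoint: str) -> str:
    """Route an OpenAI-compatible client POSTs chat requests to."""
    return openai_endpoint.rstrip("/") + "/chat/completions"


def probe(openai_endpoint: str, model: str) -> int:
    """Send a minimal chat-completion request; return the HTTP status code."""
    req = urllib.request.Request(
        chat_completions_url(openai_endpoint),
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": "ping"}],
            "max_tokens": 8,
        }).encode(),
        headers={
            "Content-Type": "application/json",
            # Many self-hosted servers accept an arbitrary bearer token.
            "Authorization": "Bearer sk-anything",
        },
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.status          # 200: the route exists
    except urllib.error.HTTPError as e:
        return e.code                   # 404: no OpenAI-compatible route here


# Live check against the endpoint from config.yaml (commented out so this
# file can be imported without making a network call):
# print(probe("https://wm4o3bvyqlaf5c-5000.proxy.runpod.net/v1",
#             "TheBloke/Airoboros-L2-70B-GPT4-m2.0-AWQ"))
```

If this probe returns 404, the fix is on the serving side rather than in Zep: expose the model behind an OpenAI-compatible server (for example, vLLM's OpenAI-compatible server or text-generation-webui's openai extension) and point `openai_endpoint` at its `/v1` base.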