vespa demo

Vespa in a high-availability (HA) setup, running in a local Kubernetes cluster.

Getting started

Prerequisites:

k3d (for creating a local kubernetes cluster)
kubectl
macOS: brew
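
As a quick sanity check before running the install step, the prerequisites listed above can be verified with a few lines of POSIX shell. This is only a sketch: the function names `missing_tools` and `check_prerequisites` are made up for illustration and are not part of this repo's Makefile.

```shell
# Print each named tool that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Fail with a message on stderr if k3d or kubectl is missing.
check_prerequisites() {
  missing="$(missing_tools k3d kubectl)"
  if [ -n "$missing" ]; then
    echo "missing prerequisites: $missing" >&2
    return 1
  fi
}
```

Running `check_prerequisites && make install` would then stop early if a tool is absent.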

Install the Vespa CLI, create the k3d cluster, and deploy Vespa:

make install

Deploy the album recommendation app and feed data:

make deploy-app feed

Usage

Basic query:

make query
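
What the `query` target runs is not shown here. As a rough sketch of an equivalent manual query, the snippet below builds a YQL string and passes it to `vespa query`. It assumes the deployed app is the upstream album recommendation sample with its `music` schema and `album` field, and that the Vespa CLI is already pointed at the query endpoint. The helper names are hypothetical.

```shell
# Build a YQL statement that matches albums containing the given word.
# Assumes the `music` schema from the upstream album recommendation sample.
# The word must not contain single quotes; no escaping is done here.
album_yql() {
  printf "select * from music where album contains '%s'" "$1"
}

# Send the query through the Vespa CLI (requires a reachable query endpoint).
run_album_query() {
  vespa query "$(album_yql "$1")"
}
```

For example, `run_album_query head` would search for albums with "head" in the title.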

Endpoints:

TODO

Failover

TODO

Resources

Config server:

configmap/vespa-config
service/vespa-internal
statefulset.apps/vespa-configserver

Vespa:

service/vespa-feed
service/vespa-query
statefulset.apps/vespa-admin
statefulset.apps/vespa-feed-container
statefulset.apps/vespa-query-container
statefulset.apps/vespa-content

The configserver and vespa-content statefulsets have volume claims.
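
The resources listed above can be inspected with `kubectl`. The sketch below wraps the calls in two hypothetical helper functions and assumes everything was deployed into the namespace of the current kubectl context.

```shell
# Show the ConfigMap, Services, and StatefulSets named in this README.
list_vespa_resources() {
  kubectl get \
    configmap/vespa-config \
    service/vespa-internal service/vespa-feed service/vespa-query \
    statefulset.apps/vespa-configserver statefulset.apps/vespa-admin \
    statefulset.apps/vespa-feed-container statefulset.apps/vespa-query-container \
    statefulset.apps/vespa-content
}

# Show the volume claims created for the config server and content pods.
list_vespa_volume_claims() {
  kubectl get persistentvolumeclaims
}
```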

Architecture

See Vespa Overview

Admin services

Config servers

Vespa Configuration Servers host the endpoint where application packages are deployed and serve generated configuration to all services. Without Config Servers, Vespa cannot be configured, and services cannot run. The config servers hold the node configuration which determines which services will run on which nodes.

They use an embedded Apache ZooKeeper for data storage. Config servers must be started before the other Vespa nodes, because those nodes depend on the config servers at startup.
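
One way to respect this startup order when scripting against the cluster is to wait for the config server StatefulSet before the others. The sketch below is an assumption about how that could look, not what this repo's Makefile does. It uses the StatefulSet names from the Resources section, a hypothetical `wait_for_vespa` helper, and an arbitrary order for everything after the config servers. `kubectl rollout status` only works on StatefulSets that use the default RollingUpdate strategy.

```shell
# Wait for each StatefulSet to finish rolling out, config servers first.
# Stops at the first StatefulSet that does not become ready in time.
wait_for_vespa() {
  for sts in vespa-configserver vespa-admin vespa-content \
             vespa-feed-container vespa-query-container; do
    kubectl rollout status "statefulset/$sts" --timeout=300s || {
      echo "rollout of $sts did not complete" >&2
      return 1
    }
  done
}
```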

Cluster controllers

Maintain the state of the nodes in the content cluster in order to provide elasticity and failure detection. The cluster state is generated by polling nodes for their unit state (e.g. up, down, or stopping) and merging that with any user-provided state that marks nodes as up, down, under maintenance, or retired.

Admin server

The default admin node. This isn't its own process, just the default node for various administrative services like log server, configuration server (configserver), and slobrok unless you specify otherwise in your configuration.

Service location brokers (slobrok)

Clients and the cluster controller use a slobrok to locate services.

Container cluster

Stateless Java services that receive and process incoming data (feed) and/or queries from clients, before passing them to the content cluster. Can be configured as a single cluster for all types of requests, or as multiple clusters.

A container service is a hosting environment for components of different types.

Includes the following for documents:

And the following for querying:

Searcher for the query endpoint
Hugging Face Embedder to run ONNX embedding models
LLM Clients

Content cluster

Responsible for storing data and for executing queries and inference over that data. Distributors automatically rebalance documents to maintain a balanced distribution at the configured redundancy level, including when nodes fail. See Content Cluster Elasticity.

Application packages

maps services to nodes in services.xml
defines aliases for nodes in hosts.xml
defines document schemas
contains machine-learned models and Java components

The application package can be deployed to any node in the config server cluster.
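
To make the package contents above concrete, the following sketch checks a directory for the files a package like this would normally contain: `services.xml` at the root, at least one schema under `schemas/` with the `.sd` extension, and `hosts.xml` for the node aliases. The `check_app_package` helper is hypothetical and only inspects file names, not their contents.

```shell
# Check that a directory looks like a Vespa application package.
check_app_package() {
  dir="$1"
  [ -f "$dir/services.xml" ] || { echo "missing services.xml" >&2; return 1; }
  # hosts.xml defines node aliases; warn but do not fail if it is absent.
  [ -f "$dir/hosts.xml" ] || echo "note: no hosts.xml in $dir" >&2
  # Expand the schema glob; if nothing matches, $1 stays the literal pattern.
  set -- "$dir"/schemas/*.sd
  [ -f "$1" ] || { echo "no schemas/*.sd found" >&2; return 1; }
  echo "ok: $dir"
}
```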

See

Consistency

AP in CAP terms (availability and partition tolerance are favored over strict consistency); see Vespa Consistency Model.

Vespa has support for conditional writes for individual documents through test-and-set operations. Multi-document transactions are not supported.
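
As an illustration of a test-and-set write, the sketch below prints a single update operation in Vespa's JSON feed format with a `condition` attached, so the update is applied only if the stored document still matches. The document ID namespace (`mynamespace`), the `music` document type, and the `year` field are assumptions taken from the upstream album recommendation sample. The `conditional_update` helper is hypothetical.

```shell
# Print an update that sets `year` to $3 only if the stored year equals $2.
# $1 is the user-specified part of the document ID.
conditional_update() {
  cat <<EOF
{"update": "id:mynamespace:music::$1", "condition": "music.year==$2", "fields": {"year": {"assign": $3}}}
EOF
}
```

The output could be written to a file and fed with the Vespa CLI, for example `conditional_update a-head-full-of-dreams 2015 2016 > update.jsonl` followed by `vespa feed update.jsonl`. If the condition does not hold, the operation is expected to be rejected instead of applied.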

After a successful response, changes to the search indexes are immediately visible by default.

References

Troubleshooting

See Troubleshooting.

TODO

Add basic auth

About

MIT License