Vespa HA running in a local Kubernetes cluster.
Prerequisites:
- k3d (for creating a local Kubernetes cluster)
- kubectl
- macOS: brew
Install the Vespa CLI, create a k3d cluster, and deploy Vespa:
make install
Deploy the album recommendation app and feed data:
make deploy-app feed
Basic query:
make query
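For reference, a basic query can also be sent by hand. The sketch below is not taken from the Makefile: the helper name `vespa_query`, the `http://localhost:8080` endpoint (which assumes something like `kubectl port-forward service/vespa-query 8080:8080` is running), and the `music` schema name from the album recommendation sample are all assumptions.

```shell
# Hypothetical helper: send a YQL query to the Vespa query API with curl.
# $1: YQL string
# $2: optional endpoint; the default is an assumption (port-forwarded service)
vespa_query() {
  yql="$1"
  endpoint="${2:-http://localhost:8080}"
  # --get moves the URL-encoded yql parameter into the query string
  curl -s --get --data-urlencode "yql=${yql}" "${endpoint}/search/"
}
```

Usage, assuming the port-forward above is active: `vespa_query 'select * from music where true'`.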
Endpoints:
- TODO
TODO
Config server:
- configmap/vespa-config
- service/vespa-internal
- statefulset.apps/vespa-configserver
Vespa:
- service/vespa-feed
- service/vespa-query
- statefulset.apps/vespa-admin
- statefulset.apps/vespa-feed-container
- statefulset.apps/vespa-query-container
- statefulset.apps/vespa-content
The vespa-configserver and vespa-content StatefulSets have persistent volume claims.
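The resources listed above can be inspected with kubectl. This is a minimal sketch; the helper name `show_vespa_resources` is hypothetical, and it assumes the current kubectl context points at the k3d cluster and that everything lives in the current namespace.

```shell
# Hypothetical helper: list the resources named above and their volume claims.
# Assumes the resources are in the current namespace of the current context.
show_vespa_resources() {
  kubectl get \
    configmap/vespa-config \
    service/vespa-internal service/vespa-feed service/vespa-query \
    statefulset.apps/vespa-configserver statefulset.apps/vespa-admin \
    statefulset.apps/vespa-feed-container \
    statefulset.apps/vespa-query-container \
    statefulset.apps/vespa-content
  # Volume claims created for the config server and content StatefulSets
  kubectl get persistentvolumeclaims
}
```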
See Vespa Overview
Vespa Configuration Servers host the endpoint where application packages are deployed and serve generated configuration to all services. Without Config Servers, Vespa cannot be configured, and services cannot run. The config servers hold the node configuration which determines which services will run on which nodes.
They use embedded Apache ZooKeeper for data storage. Config Servers must be started before other Vespa nodes, as the other nodes depend on Config Servers at startup.
The cluster controller maintains the state of the nodes in the content cluster in order to provide elasticity and failure detection. The cluster state is generated by polling nodes for their unit state (e.g. up, down, or stopping) and merging that with any user-provided state that marks nodes as up, down, under maintenance, or retired.
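One way to respect this startup order when scripting against the cluster is to wait for the config server StatefulSet before the others. The sketch below is an assumption about how that could be done, not what `make install` does; the helper name `wait_for_vespa` and the default timeout are hypothetical, and the StatefulSet names are the ones listed earlier.

```shell
# Hypothetical helper: wait for the config servers first, then everything else.
# $1: optional timeout passed to kubectl (default 300s is an assumption)
wait_for_vespa() {
  timeout="${1:-300s}"
  # Config servers must be up before the other nodes can fetch their config
  kubectl rollout status statefulset/vespa-configserver \
    --timeout="${timeout}" || return 1
  for sts in vespa-admin vespa-content vespa-feed-container vespa-query-container; do
    kubectl rollout status "statefulset/${sts}" --timeout="${timeout}" || return 1
  done
}
```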
The default admin node. It is not a separate process; it is the node that hosts administrative services such as the log server, configuration server (configserver), and slobrok unless you specify otherwise in your configuration.
Clients and the cluster controller use a slobrok to locate services.
Stateless Java services that receive and process incoming data (feed) and/or queries from clients, before passing them to the content cluster. Can be configured as a single cluster for all types of requests, or as multiple clusters.
A container service is a hosting environment for components of different types.
Includes the following for documents:
And the following for querying:
- Searcher for the query endpoint
- Hugging Face Embedder to run ONNX embedding models
- LLM Clients
The content cluster is responsible for storing data and executing queries and inferences over the data. Distributors automatically rebalance documents to maintain a balanced distribution at the configured redundancy level, including when nodes fail. See Content Cluster Elasticity.
- maps services to nodes in services.xml
- defines aliases for nodes in hosts.xml
- defines document schemas
- contains machine-learned models and Java components
The application package can be deployed to any node in the config server cluster.
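As an illustration of deploying to a config server directly, the sketch below posts a zipped application package to the config server deploy API. The helper name `deploy_app_zip` and the `http://localhost:19071` default (which assumes a config server has been port-forwarded to that port) are assumptions; `make deploy-app` may work differently.

```shell
# Hypothetical helper: deploy a zipped application package to a config server.
# $1: path to the application package zip
# $2: optional config server URL; the default is an assumption
deploy_app_zip() {
  zip_file="$1"
  configserver="${2:-http://localhost:19071}"
  # prepareandactivate prepares the package and activates it in one request
  curl -s --header "Content-Type: application/zip" \
    --data-binary "@${zip_file}" \
    "${configserver}/application/v2/tenant/default/prepareandactivate"
}
```

Usage: `deploy_app_zip application.zip`, where `application.zip` contains services.xml and the schemas at its root.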
See
AP (in CAP terms); see Vespa Consistency Model.
Vespa has support for conditional writes for individual documents through test-and-set operations. Multi-document transactions are not supported.
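A test-and-set write can be sketched with the Document v1 API by passing a `condition` parameter. The helper name `conditional_update_year`, the `http://localhost:8080` endpoint, and the `mynamespace` namespace, `music` document type, and `year` field (borrowed from the album recommendation sample) are assumptions about this deployment.

```shell
# Hypothetical helper: set music.year only if it currently has the expected value.
# $1: document id   $2: expected current year   $3: new year
# $4: optional endpoint; the default is an assumption
conditional_update_year() {
  doc_id="$1"
  expected="$2"
  new="$3"
  endpoint="${4:-http://localhost:8080}"
  body=$(printf '{"fields":{"year":{"assign":%s}}}' "${new}")
  # %3D%3D is the URL-encoded "==" in the condition music.year==<expected>
  curl -s -X PUT -H "Content-Type: application/json" \
    --data "${body}" \
    "${endpoint}/document/v1/mynamespace/music/docid/${doc_id}?condition=music.year%3D%3D${expected}"
}
```

Usage: `conditional_update_year a-head-full-of-dreams 2015 2016`. If the condition does not hold, the write is rejected (HTTP 412) and the document is left unchanged.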
After a successful response, changes to the search indexes are immediately visible by default.
- Vespa CLI (see also its source code)
- Multinode systems
- Using Kubernetes with Vespa
- Multinode-HA sample application (GKE)
- Models hot swap
- Convergence
- Batch delete
- Modifying schemas
See Troubleshooting.
- Add basic auth