slixurd/vespa

The big data serving engine - Store, search, rank and organize big data at user serving time. Vespa is an engine for low-latency computation over large data sets. It stores and indexes your data and executes distributed queries including evaluation of machine-learned models over many data points in real time.

Background

Use cases such as search, recommendation and personalization need to select a subset of data in a large corpus, evaluate machine-learned models over the selected data, organize and aggregate it and return it, typically in less than 100 milliseconds, all while the data corpus is continuously changing.

This is hard to do, especially with large corpora that need to be distributed over multiple nodes and evaluated in parallel. Vespa is a platform that performs these operations for you. It has been in development for many years and is used by a number of large internet services and apps, which serve hundreds of thousands of queries from Vespa per second.

Install

To get started using Vespa pick one of the quick start documents:

Usage

The application created in the quickstart is fully functional and production-ready, but you may want to add more nodes for redundancy.
Try the Blog search and recommendation tutorial to learn more about using Vespa.
See Developing applications for how to add your own Java components to your Vespa application.
See Vespa APIs to understand how to interface with Vespa.
Explore the sample applications.

Full documentation is available at https://docs.vespa.ai.

Contribute

We welcome contributions! See CONTRIBUTING.md to learn how to contribute.

If you want to contribute to the documentation, see https://github.com/vespa-engine/documentation

Building

You do not need to build Vespa to use it, but if you want to contribute you need to be able to build the code. This section explains how to build and test Vespa. To understand where to make changes, see Code-map.md. Some suggested improvements with pointers to code are in TODO.md.

Set up the build environment

Building the C++ and Java code is supported on CentOS 7. The Java source can also be built on any platform that has Java 11 and Maven installed. We recommend using the following environment: Create C++ / Java dev environment on CentOS using VirtualBox and Vagrant. You can also set up CentOS 7 natively and install the following build dependencies:

sudo yum -y install epel-release centos-release-scl yum-utils
sudo yum-config-manager --add-repo https://copr.fedorainfracloud.org/coprs/g/vespa/vespa/repo/epel-7/group_vespa-vespa-epel-7.repo
sudo yum -y install ccache \
    rpm-build
yum-builddep -y <vespa-source>/dist/vespa.spec
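
For the Java-only build on other platforms mentioned at the start of this section, it can help to confirm that Java and Maven are on the PATH before building. The following is a minimal sketch and not part of the Vespa repository: the function names `require_tool` and `check_java_build_tools` are hypothetical, and the check only tests that the commands exist, not that the Java version is 11.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the Vespa source tree):
# succeed if the named command is on PATH, otherwise report it and fail.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing required tool: $1" >&2
    return 1
}

# Check the two tools the Java build relies on.
# Returns 0 only if both java and mvn were found.
check_java_build_tools() {
    local missing=0 tool
    for tool in java mvn; do
        require_tool "$tool" || missing=1
    done
    return "$missing"
}
```

Calling `check_java_build_tools` returns success only when both commands are found. Running `java -version` afterwards shows which Java release is active.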

Build Java modules

export MAVEN_OPTS="-Xms128m -Xmx1024m"
source /opt/rh/rh-maven35/enable
bash bootstrap.sh java
mvn -T <num-threads> install

Build C++ modules

Replace <build-dir> with the name of the directory in which you'd like to build Vespa. Replace <source-dir> with the directory in which you've cloned/unpacked the source tree.

bash bootstrap-cpp.sh <source-dir> <build-dir>
cd <build-dir>
make -j <num-threads>
ctest3 -j <num-threads>
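
The `<num-threads>` placeholder in the Maven, make and ctest3 commands above is often set to the number of available CPU cores. As a sketch, assuming either `nproc` (GNU coreutils) or `getconf _NPROCESSORS_ONLN` is available on the build host; the function name `detect_threads` is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a thread count for the -T / -j options.
# Prefers nproc, falls back to getconf, and finally to a single thread.
detect_threads() {
    local n
    n="$(nproc 2>/dev/null)" \
        || n="$(getconf _NPROCESSORS_ONLN 2>/dev/null)" \
        || n=1
    echo "$n"
}

NUM_THREADS="$(detect_threads)"
echo "using ${NUM_THREADS} threads"

# The build commands from this section would then read, for example:
#   mvn -T "$NUM_THREADS" install
#   make -j "$NUM_THREADS"
#   ctest3 -j "$NUM_THREADS"
```

Using fewer threads than cores may be preferable on machines with limited memory, since parallel C++ compilation can be memory-intensive.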

Create RPM packages

sh dist.sh VERSION && rpmbuild -ba ~/rpmbuild/SPECS/vespa-VERSION.spec
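
In the command above, VERSION stands for the version string being packaged, and it appears twice. A small sketch that keeps the two occurrences in sync is shown here. The function names `spec_path` and `build_rpms` are hypothetical, and the version number in the example call is only a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the spec file path for a given version,
# following the ~/rpmbuild/SPECS/vespa-VERSION.spec pattern used above.
spec_path() {
    echo "${HOME}/rpmbuild/SPECS/vespa-$1.spec"
}

# Hypothetical wrapper: run dist.sh and then rpmbuild for one version string.
# rpmbuild only runs if dist.sh succeeds, as in the original one-liner.
build_rpms() {
    local version="$1"
    sh dist.sh "$version" && rpmbuild -ba "$(spec_path "$version")"
}

# Example call (placeholder version, run from the root of the source tree):
#   build_rpms 7.0.0
```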

License

Code licensed under the Apache 2.0 license. See LICENSE for terms.
