DataFabricRus / scylla-rdf-benchmark

Benchmarks for Scylla-RDF

Ingestion: Google Dataflow 109M triples

The ingestion pipelines from scylla-beam-pipelines were used. Ingestion consisted of bulk-loading the RDF data into all indexes in ScyllaDB; the full-text index was not considered here.

ScyllaDB had 2 nodes with the following characteristics:

  • n1-standard-16,
  • 2 * Local SSD with NVMe (375 GB each)

The pipelines were run on Google Dataflow:

  • 20 machines (n1-standard-1),
  • 1 * connection to ScyllaDB per machine,
  • 8192 * parallel requests per connection.

The ingestion pipelines ran for ~16 min and loaded 109,836,664 RDF triples, which gives ~114k triples/sec.
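
The throughput figure can be reproduced from the numbers above. The following is a minimal sketch of that arithmetic, not part of the benchmark code; the constant and function names are made up for illustration, and the in-flight figure is only an upper bound that assumes every worker keeps all of its parallel requests busy at once.

```python
# Figures taken from the ingestion setup described above.
TRIPLES_LOADED = 109_836_664
RUN_MINUTES = 16                      # approximate run time (~16 min)
WORKERS = 20                          # Dataflow machines (n1-standard-1)
CONNECTIONS_PER_WORKER = 1
PARALLEL_REQUESTS_PER_CONNECTION = 8192


def throughput_per_sec(triples: int, minutes: float) -> float:
    """Average number of triples loaded per second over the whole run."""
    return triples / (minutes * 60)


def max_in_flight_requests(workers: int, connections: int, requests: int) -> int:
    """Upper bound on concurrent requests (assumes all slots are in use)."""
    return workers * connections * requests


rate = throughput_per_sec(TRIPLES_LOADED, RUN_MINUTES)
per_worker = rate / WORKERS
in_flight = max_in_flight_requests(
    WORKERS, CONNECTIONS_PER_WORKER, PARALLEL_REQUESTS_PER_CONNECTION
)

print(f"overall:    {rate:,.0f} triples/sec")
print(f"per worker: {per_worker:,.0f} triples/sec")
print(f"in flight:  {in_flight:,} requests at most")
```

Because the run time is only known to the nearest minute, the result should be read as "roughly 114k triples/sec" rather than as an exact rate.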

Queries: WatDiv 109M triples

The queries are executed by the query-executor. Example command:

java -jar query-executor-1.0-SNAPSHOT-jar-with-dependencies.jar http://graph-worker-vis:3001/server/repositories/watdiv ./queries ./results

Dataset & Queries

  • dataset (~109M triples) at gs://scylla-rdf-benchmark/dataset.nt,
  • queries at gs://scylla-rdf-benchmark/queries/.

Metrics

  • ttfb - time to first byte
  • ttlb - time to last byte
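
As an illustration of how the two metrics relate, here is a minimal sketch of timing a streamed response. It is not the query-executor's implementation; `measure_ttfb_ttlb` is a hypothetical helper, and it assumes that ttlb means the time until the whole response body has been read.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def measure_ttfb_ttlb(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], float, int]:
    """Consume a streamed response body and time it.

    Returns (ttfb, ttlb, total_bytes) in the clock's units (seconds by default).
    ttfb is None if the body turns out to be empty.
    """
    start = clock()
    ttfb = None
    total = 0
    for chunk in chunks:
        # The first non-empty chunk marks the arrival of the first byte.
        if chunk and ttfb is None:
            ttfb = clock() - start
        total += len(chunk)
    # The stream is exhausted, so the whole body has been read.
    ttlb = clock() - start
    return ttfb, ttlb, total


if __name__ == "__main__":
    # Stand-in for a real response stream: two chunks with artificial delays.
    def fake_stream():
        time.sleep(0.05)
        yield b"first chunk"
        time.sleep(0.05)
        yield b"second chunk"

    ttfb, ttlb, size = measure_ttfb_ttlb(fake_stream())
    print(f"ttfb={ttfb:.3f}s ttlb={ttlb:.3f}s bytes={size}")
```

For a query whose results are streamed, ttfb reflects how quickly the first result is produced, while ttlb reflects the cost of producing and transferring the full result set; the start of the timer here also has to be placed before the request is sent for the numbers to be meaningful.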

Results
