weso / RemoteHDT

HDT implementation using Zarr

Installation

Instructions

Official releases

You can download a binary from the latest release page.

Compiling from source

RemoteHDT is implemented in Rust and built with cargo. The command cargo run compiles the code and runs it locally.

Docker

TBD

Usage

Validate an example

The examples folder contains several example files.

Serialize a concrete RDF dataset
wget https://github.com/weso/RemoteHDT/raw/master/resources/1-lubm.ttl
remote-hdt --rdf 1-lubm.ttl --serialize

This project explores ways to replicate HDT using Zarr.

  1. We have to be able to import data from RDF dumps
  2. Then we have to load the data into the application
  3. Lastly, the loaded data should be serialized into Zarr
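
The import and load steps above can be pictured with a minimal sketch. It assumes line-based N-Triples input with terms separated by single spaces, which is far simpler than the Turtle file used in the usage example. The names Triple, parse_ntriples_line and load_triples are hypothetical and are not taken from RemoteHDT or rdf-rs. Serialization to Zarr is left out.

```rust
/// One RDF statement, with each term kept as its raw N-Triples text.
#[derive(Debug, Clone, PartialEq)]
struct Triple {
    subject: String,
    predicate: String,
    object: String,
}

/// Parses one N-Triples line such as `<s> <p> <o> .`.
/// Returns None for blank lines, comments and malformed input.
/// Assumption: terms are separated by single spaces.
fn parse_ntriples_line(line: &str) -> Option<Triple> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    // Every statement must end with the terminating dot.
    let body = line.strip_suffix('.')?.trim_end();
    // Subject and predicate never contain spaces; a literal object may,
    // so everything after the second space is treated as the object.
    let mut parts = body.splitn(3, ' ');
    let subject = parts.next()?;
    let predicate = parts.next()?;
    let object = parts.next()?.trim();
    if subject.is_empty() || predicate.is_empty() || object.is_empty() {
        return None;
    }
    Some(Triple {
        subject: subject.to_string(),
        predicate: predicate.to_string(),
        object: object.to_string(),
    })
}

/// Loads every well-formed statement of an N-Triples document into memory.
fn load_triples(input: &str) -> Vec<Triple> {
    input.lines().filter_map(parse_ntriples_line).collect()
}
```

A real importer would rely on a proper RDF parser. The sketch only shows the shape of the data handed to the serialization step.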

This project can be divided into two main crates:

  1. RemoteHDT --> The HDT fork using Zarr
  2. rdf-rs --> Utilities for importing RDF dumps using Rust

.
├── *.zarr # Resulting Zarr project
├── rdf-rs # Crate for importing the RDF dumps into the system
├── examples
├── src
│ ├── zarr # All the Zarr utilities
│ └── main.rs # Main application for creating the Zarr project
└── ...

X axis --> subjects
Y axis --> predicates
Z axis --> objects

Caveat: each axis should store only unique values.

For each triple (s, p, o), the cell at (X, Y, Z) = (s, p, o) is set to 1. Every other cell is set to -1.
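
The encoding described above can be sketched in plain Rust. This is only an illustration under stated assumptions: the TripleTensor type and its helpers are hypothetical names, the array is held densely in memory, and no Zarr library is involved. Each axis holds the sorted unique terms, and a cell is 1 when the triple exists and -1 otherwise.

```rust
use std::collections::BTreeSet;

/// Sorted, deduplicated terms for one axis.
fn unique_terms<'a>(terms: impl Iterator<Item = &'a str>) -> Vec<String> {
    terms
        .collect::<BTreeSet<&str>>()
        .into_iter()
        .map(String::from)
        .collect()
}

/// Position of a term on a sorted axis, if present.
fn position(axis: &[String], term: &str) -> Option<usize> {
    axis.binary_search_by(|v| v.as_str().cmp(term)).ok()
}

/// Dense 3D array: X = subjects, Y = predicates, Z = objects.
struct TripleTensor {
    subjects: Vec<String>,
    predicates: Vec<String>,
    objects: Vec<String>,
    /// Row-major cells: 1 where a triple exists, -1 elsewhere.
    cells: Vec<i8>,
}

impl TripleTensor {
    fn from_triples(triples: &[(&str, &str, &str)]) -> Self {
        // Each axis stores only unique values.
        let subjects = unique_terms(triples.iter().map(|t| t.0));
        let predicates = unique_terms(triples.iter().map(|t| t.1));
        let objects = unique_terms(triples.iter().map(|t| t.2));
        // Start with every cell at -1, then mark the existing triples.
        let cells = vec![-1i8; subjects.len() * predicates.len() * objects.len()];
        let mut tensor = TripleTensor { subjects, predicates, objects, cells };
        for &(s, p, o) in triples {
            let i = tensor.index(s, p, o).expect("term is on its axis");
            tensor.cells[i] = 1;
        }
        tensor
    }

    /// Flat index of (s, p, o), or None if a term is missing from its axis.
    fn index(&self, s: &str, p: &str, o: &str) -> Option<usize> {
        let x = position(&self.subjects, s)?;
        let y = position(&self.predicates, p)?;
        let z = position(&self.objects, o)?;
        Some((x * self.predicates.len() + y) * self.objects.len() + z)
    }

    /// Cell value for (s, p, o): Some(1), Some(-1), or None for unknown terms.
    fn get(&self, s: &str, p: &str, o: &str) -> Option<i8> {
        self.index(s, p, o).map(|i| self.cells[i])
    }
}
```

A dense array grows with the product of the three axis sizes, so the sketch is only practical for very small datasets.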


Sprint 1

  1. Support several systems of reference; namely, SPO, POS, OSP...
  2. Explore the Linked Data Fragments project
  3. Streaming + Filtering = Larger than RAM?
  4. Quality attributes: synchronization, size of the original dump...
  5. HDTCat --> Larger than RAM HDT, while RemoteHDT --> Remote HDTCat?
  6. Serverless Linked Data Fragments?
  7. Benchmarking
  8. Store the HashSet inside the Zarr directory (somewhere)

Sprint 2

  1. Work on the quality attributes and features that we are good at

Sprint 4

  1. LUBM benchmarks

Sprint 5

  1. Create a Shape for the LUBM benchmarks.
