seralf / daf-semantic-validator

semantic_validator

The Semantic Validator is a component (based on RDF4J) that provides a simple way to validate an RDF metadata dataset against a specific ontology on an underlying triplestore.

The triplestore is in-memory. The rules are SPARQL queries, executed in a distinct repository for each single validation request, so that validations are isolated from one another.

[Figure: semantic_repository component inside the semantic_manager architecture]

HTTP API

The validator is currently based on a set of queries (about 150 for DCAT-AP_IT), each returning a record of information about a rule broken by the dataset. The most important fields are:

  • Class name: the class involved in the rule (ex: Organization for DCAT-AP_IT)
  • Rule ID: the broken rule id (ex: 207 for DCAT-AP_IT)
  • Error description: the problem description (ex: "vcard:hasURL should be a resource" for DCAT-AP_IT)
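A record with these three fields could be modeled as in the sketch below. It is only an illustration: the type name `RuleViolation`, its field names, the `Int` type for the rule id and the `summary` helper are all hypothetical and not taken from the project's code. The sample values are the DCAT-AP_IT examples quoted in the list above.

```scala
// Hypothetical model of one validation result; all names are illustrative only.
final case class RuleViolation(
  className: String,   // class involved in the rule, e.g. "Organization"
  ruleId: Int,         // id of the broken rule, e.g. 207 (Int is an assumption)
  description: String  // human-readable description of the problem
)

// Renders a violation as a single, log-friendly line.
def summary(v: RuleViolation): String =
  s"[rule ${v.ruleId}] ${v.className}: ${v.description}"

// Sample built from the examples given in the list above.
val example = RuleViolation("Organization", 207, "vcard:hasURL should be a resource")
```

Keeping the rule id as a separate field, rather than embedding it in the message, would make it easy to group or filter violations by rule.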

There are two endpoints:

  • /validate: validates a document
  • /validators: returns the list of available validators
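As a rough sketch, a request for the list of validators could be built with the JDK 11 HTTP client as shown below. Several things here are assumptions and are not confirmed by this README: that the endpoint answers a plain GET, that it sits at the root of the service, and that the service is reachable on localhost port 9000 (the port used in the Docker commands later in this document). The helper name `validatorsRequest` is hypothetical.

```scala
import java.net.URI
import java.net.http.HttpRequest

// Assumption: the service is reachable on localhost:9000,
// the port mapped in the Docker examples of this README.
val baseUrl = "http://localhost:9000"

// Assumption: the list of validators is exposed through a plain GET.
// This only builds the request; it does not contact the service.
def validatorsRequest(base: String): HttpRequest =
  HttpRequest.newBuilder(URI.create(s"$base/validators")).GET().build()

val request = validatorsRequest(baseUrl)
```

With the service running, the request could then be sent with `java.net.http.HttpClient.newHttpClient().send(request, java.net.http.HttpResponse.BodyHandlers.ofString())`. The request format expected by /validate is not described here, so it is left out of the sketch.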

instructions

  1. compile / package
$ sbt clean package
  2. run
$ sbt run

Validation rules

The validator is now configured with three vocabularies:

The directory structure is modular, so that new validators, rules and validation methods can be added.

The validators have to be configured in the validator.conf file under the conf directory.

Semantic_Validator
│ README.md
│ build.sbt
│ ...
│
├─conf
│   application.conf
│   validator.conf
│   ...
└─dist
  └─data
    └─ontologies
      └─agid
        └─DCAT-AP_IT
          │ DCAT-AP_IT.owl
          │ Licenze.ttl
          │ vcard-ns.ttl
          │ ...
          └─validators
            └─sparql
                rule-0.rq
                rule-1.rq
                rule-2.rq
                ...

The OWL and TTL files in the specific ontology directory (ex: dist/data/ontologies/agid/DCAT-AP_IT) and the dataset sent to the service are loaded together into the in-memory repository, so that RDFS inference can be used during the validation.
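Given the layout above, the rule files of a vocabulary could be discovered with a small helper like the one sketched here. It is not the project's actual loading code: the function name `listRuleFiles` is hypothetical, and ordering the rules by their numeric suffix is an assumption made for illustration.

```scala
import java.nio.file.{Files, Path}

// Hypothetical helper: returns the rule-N.rq files found directly in
// `sparqlDir`, ordered by the number N (so rule-2.rq comes before rule-10.rq).
def listRuleFiles(sparqlDir: Path): List[Path] = {
  val RuleFile = """rule-(\d+)\.rq""".r
  val stream = Files.list(sparqlDir)
  try {
    val it = stream.iterator()
    var found = List.empty[(Int, Path)]
    while (it.hasNext) {
      val p = it.next()
      p.getFileName.toString match {
        case RuleFile(n) => found = (n.toInt, p) :: found
        case _           => () // ignore files that are not rules
      }
    }
    found.sortBy(_._1).map(_._2)
  } finally stream.close() // Files.list keeps a directory handle open
}
```

For the DCAT-AP_IT vocabulary it would be called with `java.nio.file.Paths.get("dist/data/ontologies/agid/DCAT-AP_IT/validators/sparql")`. Sorting numerically rather than by file name avoids the lexicographic order in which rule-10.rq would precede rule-2.rq.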


running / testing the microservice

The simplest way to test the application locally is to run it directly from sbt:

$ sbt clean compile
$ sbt run

Another option is to build the distribution with sbt, then unpack and run it. For example:

$ sbt clean dist
$ unzip -o -d  target/universal/ target/universal/semantic_validator-1.0.1.zip
$ cd target/universal/semantic_validator-1.0.1
$ bin/semantic_validator -Dconfig.file=./conf/production.conf

Preparing a Docker image is as straightforward as:

$ sbt docker:stage
$ sudo docker build target/docker/stage/
$ sudo docker run -p 9000:9000 {image_id}

Finally, we can build the Docker image, publish it locally and run it:

$ sbt clean compile
$ sudo sbt docker:publishLocal
$ sudo docker run -p 9000:9000 {image_id}

TODO

  • add an implementation for SHACL validations
  • add a CPSV-AP_IT rule set for the validator
  • ...

SEE ALSO: teamdigitale/daf

[README last updated: 2018-05-22]
