DLR-SC / prov-db-connector

PROV Database Connector

Remove document based saving mechanism

B-Stefan opened this issue

We plan to change the graph model to a non-document-based model.

Features:

  • One big graph per database
  • Ability to get parts of the graph (e.g. "give me all nodes connected to ex:Bob")
  • Maybe: Backwards compatibility to create_document and get_document
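The "connected nodes" query could be sketched as a breadth-first search over the single graph. This is only an illustration: the edge list, the connected_nodes helper and the decision to follow edges in both directions and transitively are assumptions, not part of the connector.

```python
from collections import defaultdict, deque


def connected_nodes(edges, start):
    """Return every node reachable from `start`, ignoring edge direction."""
    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for other in neighbours[node]:
            if other not in seen:
                seen.add(other)
                queue.append(other)

    seen.discard(start)  # the start node itself is not part of the result
    return seen


# Hypothetical sample graph as (source, target) pairs.
edges = [
    ("ex:article", "ex:Bob"),
    ("ex:compose", "ex:Bob"),
    ("ex:compose", "ex:regionList"),
    ("ex:Alice", "ex:chart"),
]
print(sorted(connected_nodes(edges, "ex:Bob")))
```

In a graph database the same traversal would normally be delegated to the database's own query language; the in-memory version only shows the intended result set.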

Tasks:

  1. Create test cases for merge mechanism
  2. Create merge mechanism for neo4j adapter (including exception handling)
  3. Add backwards compatibility for create_document and get_document
  4. Add query function like get_records_by_identifier(identifier)

Merge Mechanism

We discussed several ways to merge nodes and relations and agreed that different cases have different demands, so we plan to implement several merge modes.

Merge modes

  • NO_MERGE - Nodes are not merged; the program raises an exception
  • SOFT_MERGE (later) - Nodes are merged if this is possible without conflicts
  • OVERRIDE_MERGE (later) - If two properties have the same label but different values, the new value overrides the old one
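A minimal sketch of how the three modes could behave when a record with the same identifier already exists. The names MergeMode, MergeError and merge_properties are hypothetical, and the exact conditions under which NO_MERGE and SOFT_MERGE raise are assumptions drawn from the descriptions above.

```python
from enum import Enum


class MergeMode(Enum):
    NO_MERGE = "no_merge"
    SOFT_MERGE = "soft_merge"
    OVERRIDE_MERGE = "override_merge"


class MergeError(Exception):
    """Raised when two records cannot be merged in the chosen mode."""


def merge_properties(old, new, mode):
    """Merge property dict `new` into a copy of `old` according to `mode`.

    Assumed to be called only when a record with the same identifier
    already exists in the database.
    """
    if mode is MergeMode.NO_MERGE:
        raise MergeError("record already exists and merging is disabled")

    merged = dict(old)
    for label, value in new.items():
        conflict = label in merged and merged[label] != value
        if conflict and mode is MergeMode.SOFT_MERGE:
            raise MergeError(f"conflicting values for {label!r}")
        merged[label] = value  # OVERRIDE_MERGE: the new value wins
    return merged


# Hypothetical properties of the same node, stored and incoming.
old = {"prov:type": "prov:Person", "ex:age": 30}
new = {"ex:age": 31, "ex:name": "Bob"}
print(merge_properties(old, new, MergeMode.OVERRIDE_MERGE))
```

With SOFT_MERGE the same call would raise MergeError because ex:age differs, while merging only {"ex:name": "Bob"} would succeed.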

Merge assumptions

  • Nodes will be merged if all required arguments are the same
  • The identifier of a node (e.g. "http://example.com#Bob") is unique.
  • Relations are the same if all required arguments are the same

I found a method that compares two edges:

sameEdge() method

It treats two edges as the same if all their formal attributes are the same. The formal attributes differ for each relation type, see here. I assume the same behaviour for comparing nodes.
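The comparison rule can be sketched in Python as follows. This is not the sameEdge() implementation itself: the dict-based edge representation, the formal_key and same_edge helpers, and the simplified attribute names are assumptions made for illustration.

```python
# Stand-ins for the per-type formal attributes. The names are simplified
# strings, not the constants used by the prov library.
FORMAL_ATTRIBUTES = {
    "used": ("prov:activity", "prov:entity", "prov:time"),
    "wasGeneratedBy": ("prov:entity", "prov:activity", "prov:time"),
}


def formal_key(edge):
    """Build a comparison key from the relation type and its formal attributes."""
    names = FORMAL_ATTRIBUTES[edge["type"]]
    return (edge["type"],) + tuple(edge["attributes"].get(name) for name in names)


def same_edge(a, b):
    """Two edges are the same if their type and all formal attributes match."""
    return formal_key(a) == formal_key(b)


# Hypothetical edges: `first` and `second` differ only in a non-formal attribute.
first = {"type": "used",
         "attributes": {"prov:activity": "ex:a1", "prov:entity": "ex:e1"}}
second = {"type": "used",
          "attributes": {"prov:activity": "ex:a1", "prov:entity": "ex:e1",
                         "ex:note": "extra"}}
third = {"type": "wasGeneratedBy",
         "attributes": {"prov:activity": "ex:a1", "prov:entity": "ex:e1"}}

print(same_edge(first, second))
print(same_edge(first, third))
```

A missing formal attribute is treated as None here, so two edges without a time compare as equal on that position.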

I found that the primary example is a valid PROV document (according to the PROV validator), but it does not respect the merge rules.

In detail: The example contains the edges:

used(ex:compose,ex:regionList,-)
used(ex:compose,ex:regionList,-,[prov:role = 'ex:regionsToAggregateBy'])

The problem is that the FORMAL_ATTRIBUTES for used are: PROV_ATTR_ACTIVITY, PROV_ATTR_ENTITY, PROV_ATTR_TIME. The two edges above differ only in prov:role, which is not a formal attribute, so the merge process would merge them into one.
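The collision can be reproduced with a small sketch that groups the two example edges by their formal attributes. The dict layout and the field names activity, entity, time and other are hypothetical; only the values come from the example.

```python
from collections import defaultdict

# The two `used` edges from the example, as plain dicts.
# "-" in PROV-N means the time is not given, modelled here as None.
edges = [
    {"activity": "ex:compose", "entity": "ex:regionList", "time": None,
     "other": {}},
    {"activity": "ex:compose", "entity": "ex:regionList", "time": None,
     "other": {"prov:role": "ex:regionsToAggregateBy"}},
]

# Group edges by the formal attributes of `used`: activity, entity, time.
groups = defaultdict(list)
for edge in edges:
    key = (edge["activity"], edge["entity"], edge["time"])
    groups[key].append(edge)

for key, members in groups.items():
    print(key, "->", len(members), "edge(s)")
```

Both edges end up under the same key, so a merge based only on formal attributes cannot keep them apart, even though only one of them carries a prov:role.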