casework / CASE

Cyber-investigation Analysis Standard Expression (CASE) Ontology

Home Page: https://caseontology.org

Must an `InvestigativeAction` always have a `ProvenanceRecord` among its outputs?

ajnelson-nist opened this issue

The CASE Ontology Committee had discussed in the past whether InvestigativeActions would always have a ProvenanceRecord among their outputs. I recall, informally, we had said yes, this is a requirement. However, we had not encoded this in SHACL or OWL.

I have found a few logistical issues with requiring a ProvenanceRecord as output. While I don't think these are necessarily counter-arguments, they seem to need clarification if we move towards encoding the generated-ProvenanceRecord expectation.

  1. A ProvenanceRecord must have at least one member. This requirement is inherited from uco-core:ContextualCompilation, ProvenanceRecord's superclass. By my understanding of what CASE had intended but not formally encoded, that ProvenanceRecord should have members that are either (1) inputs to the InvestigativeAction, or (2) other results of the InvestigativeAction.
  2. I do not think uco-action:subaction was considered as part of the discussion, because it had not been exercised in CASE-Examples or the CASE website.[1] It is not quite clear how that property is supposed to be used with InvestigativeAction, namely whether any sub-action of an InvestigativeAction is also an InvestigativeAction. The answer to that question might complicate requiring a ProvenanceRecord as output.

Let's take this example graph, which renders an action that takes a JPEG file as input and uses a (made-up) tool, "ExampleJpegAnalyzer," to analyze the JPEG's contents in a couple of ways. The tool unconditionally calls multiple independent, tool-internal functions as part of its execution: look_up_location, ocr, and others. The ocr function yields a file. This graph omits some triples for the sake of discussion.

kb:tool-1
	a uco-tool:AnalyticTool ;
	uco-core:name "ExampleJpegAnalyzer" ;
	.

kb:jpeg-i1
	a uco-observable:RasterPicture ;
	.
kb:provenance-record-i1
	a case-investigation:ProvenanceRecord ;
	uco-core:object kb:jpeg-i1 ;
	.

kb:action-1
	a case-investigation:InvestigativeAction ;
	uco-action:instrument kb:tool-1 ;
	uco-action:object
		kb:jpeg-i1 ,
		kb:provenance-record-i1
		;
	uco-action:subaction kb:action-2 ;
	uco-action:result kb:provenance-record-o1 ;
	.

kb:action-2
	a uco-action:Action ;
	uco-core:description "Store any OCR-recognized text in a file." ;
	uco-action:object kb:jpeg-i1 ;
	uco-action:result kb:ocr-text-results-file-1 ;
	.
kb:ocr-text-results-file-1
	a uco-observable:File ;
	.

Question 1: Is kb:action-2 an InvestigativeAction? If so, and if all InvestigativeActions need to generate a ProvenanceRecord, how do the members of its ProvenanceRecord relate to the parent action's ProvenanceRecord?

Question 2: What are the members of the output ProvenanceRecord, kb:provenance-record-o1?

Question 2.1: Is kb:jpeg-i1 a member, recording that it was seen and/or handled?

Question 2.2: Is kb:ocr-text-results-file-1 in kb:provenance-record-o1? Is the answer to this influenced by whether kb:action-2 is or is not an InvestigativeAction?

I intend to take responses to these questions and propose OWL and SHACL encodings to capture the consensus.
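As a rough illustration of what an OWL encoding of the generated-ProvenanceRecord expectation might look like, here is a minimal sketch. It is not a committee-approved axiom. The prefix IRIs are assumptions based on the UCO 1.x and CASE 1.x namespace pattern and should be checked against the release in use.

```turtle
# Assumed namespace IRIs; verify against the ontology release in use.
@prefix case-investigation: <https://ontology.caseontology.org/case/investigation/> .
@prefix uco-action: <https://ontology.unifiedcyberontology.org/uco/action/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Sketch: every InvestigativeAction has some result that is a ProvenanceRecord.
case-investigation:InvestigativeAction
	rdfs:subClassOf [
		a owl:Restriction ;
		owl:onProperty uco-action:result ;
		owl:someValuesFrom case-investigation:ProvenanceRecord ;
	] ;
	.
```

Because OWL reasons under the open-world assumption, an axiom like this lets a reasoner infer that such a result exists; it does not by itself flag a graph where the ProvenanceRecord is missing. Reporting the omission as a validation error would need a closed-world check such as SHACL.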

Footnotes

  1. To date, subaction still has not been exercised in either of those repositories. It is exercised in CASE-Corpora, and a recent update in testing infrastructure triggered a data validation error in a sketch of mine, which led to this Question post.

> The CASE Ontology Committee had discussed in the past whether InvestigativeActions would always have a ProvenanceRecord among their outputs. I recall, informally, we had said yes, this is a requirement. However, we had not encoded this in SHACL or OWL.

Part of the design rationale for InvestigativeAction, and the reason it was created as something distinct from a plain Action, was that it is paired with ProvenanceRecord to form the mechanism for tracking the provenance of objects through an investigative process.
It was definitely intended that an InvestigativeAction would always have a ProvenanceRecord among its action:result values. I agree that this was never formally codified in the OWL or SHACL.
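One possible way to codify that intent in SHACL is sketched below. This is an illustration under stated assumptions, not an adopted CASE shape: the `ex:` shape name and message text are hypothetical, and the CASE and UCO prefix IRIs are assumed from the 1.x namespace pattern.

```turtle
# Assumed namespace IRIs; verify against the ontology release in use.
@prefix case-investigation: <https://ontology.caseontology.org/case/investigation/> .
@prefix uco-action: <https://ontology.unifiedcyberontology.org/uco/action/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/shapes/> .

# Hypothetical shape: at least one value of uco-action:result
# must be an instance of ProvenanceRecord. Other results are unconstrained.
ex:InvestigativeActionProvenanceShape
	a sh:NodeShape ;
	sh:targetClass case-investigation:InvestigativeAction ;
	sh:property [
		sh:path uco-action:result ;
		sh:qualifiedValueShape [
			sh:class case-investigation:ProvenanceRecord ;
		] ;
		sh:qualifiedMinCount 1 ;
		sh:message "An InvestigativeAction should have a ProvenanceRecord among its results." ;
	] ;
	.
```

The qualified form is used because an action's results can include objects that are not ProvenanceRecords; a plain `sh:class` constraint on the path would require every result to be one.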

> A ProvenanceRecord must have at least one member. This requirement is inherited from uco-core:ContextualCompilation, ProvenanceRecord's superclass. By my understanding of what CASE had intended but not formally encoded, that ProvenanceRecord should have members that are either (1) inputs to the InvestigativeAction, or (2) other results of the InvestigativeAction.

A ProvenanceRecord would never contain or reference objects that were inputs to the InvestigativeAction that produced it. It would contain only objects resulting from that InvestigativeAction. InvestigativeAction1 could have ProvenanceRecord1 among its results, and objects referenced within ProvenanceRecord1 (that resulted from InvestigativeAction1) could then be used as inputs to InvestigativeAction2. This lets you chain inputs to actions to results, which may in turn be inputs to other actions.
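The chaining described above might look like the following sketch. All `kb:` node names are hypothetical, `uco-observable:File` is used only as a placeholder type, and the prefix IRIs are assumed from the UCO 1.x and CASE 1.x namespace pattern.

```turtle
# Assumed namespace IRIs; verify against the ontology release in use.
@prefix case-investigation: <https://ontology.caseontology.org/case/investigation/> .
@prefix uco-action: <https://ontology.unifiedcyberontology.org/uco/action/> .
@prefix uco-core: <https://ontology.unifiedcyberontology.org/uco/core/> .
@prefix uco-observable: <https://ontology.unifiedcyberontology.org/uco/observable/> .
@prefix kb: <http://example.org/kb/> .

# First action: one input, one new object, one ProvenanceRecord.
kb:investigative-action-1
	a case-investigation:InvestigativeAction ;
	uco-action:object kb:input-file-1 ;
	uco-action:result
		kb:derived-file-1 ,
		kb:provenance-record-1
		;
	.

# The record lists only what the action produced, not its input.
kb:provenance-record-1
	a case-investigation:ProvenanceRecord ;
	uco-core:object kb:derived-file-1 ;
	.

kb:derived-file-1
	a uco-observable:File ;
	.

# Second action: consumes a result of the first action.
kb:investigative-action-2
	a case-investigation:InvestigativeAction ;
	uco-action:object
		kb:derived-file-1 ,
		kb:provenance-record-1
		;
	uco-action:result
		kb:derived-file-2 ,
		kb:provenance-record-2
		;
	.

kb:provenance-record-2
	a case-investigation:ProvenanceRecord ;
	uco-core:object kb:derived-file-2 ;
	.
```

Passing `kb:provenance-record-1` as an input alongside the file follows the pattern of the example graph in the original post, where `kb:provenance-record-i1` accompanies `kb:jpeg-i1`.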

> I do not think uco-action:subaction was considered as part of the discussion, because it had not been exercised in CASE-Examples or the CASE website.[1] It is not quite clear how that property is supposed to be used with InvestigativeAction, namely whether any sub-action of an InvestigativeAction is also an InvestigativeAction. The answer to that question might complicate requiring a ProvenanceRecord as output.

subaction was considered as part of the discussion.
subaction allows complex actions to be described as a single overall Action made up of multiple more atomic subactions.
Consider an Action such as "Start the car". In some contexts this could be thought of as one overall action, but it likely consists of multiple subactions such as "insert key", "turn key to auxiliary position", and "turn key to start position". Each of those subactions is an independent action in its own right and can have performer, object, instrument, result, and other property values. The overall "Start the car" Action would reference each of the separate subactions using its action:subaction property, and its action:result property could be either a union of the results of the subactions or just a subset of them, depending on context.
All of this would also be true for InvestigativeAction except that the ProvenanceRecord for the overall action should likely always be a full union of the contents of the ProvenanceRecords for all of the subactions.
It should be pretty straightforward.
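The "Start the car" decomposition could be sketched roughly as follows. The `kb:` node names are hypothetical, results and performers are left out for brevity, and the prefix IRIs are assumed from the UCO 1.x namespace pattern.

```turtle
# Assumed namespace IRIs; verify against the ontology release in use.
@prefix uco-action: <https://ontology.unifiedcyberontology.org/uco/action/> .
@prefix uco-core: <https://ontology.unifiedcyberontology.org/uco/core/> .
@prefix kb: <http://example.org/kb/> .

# Overall action, referencing its more atomic subactions.
kb:start-car
	a uco-action:Action ;
	uco-core:name "Start the car" ;
	uco-action:subaction
		kb:insert-key ,
		kb:turn-key-auxiliary ,
		kb:turn-key-start
		;
	.

kb:insert-key
	a uco-action:Action ;
	uco-core:name "Insert key" ;
	.

kb:turn-key-auxiliary
	a uco-action:Action ;
	uco-core:name "Turn key to auxiliary position" ;
	.

kb:turn-key-start
	a uco-action:Action ;
	uco-core:name "Turn key to start position" ;
	.
```

Note that the comma-separated values of `uco-action:subaction` are an unordered set in RDF, so this sketch does not record the sequence of the steps; ordering would have to come from other properties such as timestamps.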

I find it a bit hard to interpret and comment on the example graph, since neither action-1 nor action-2 specifies a core:name for the action (that is, what action is being performed).
Typically, an Action that contains subactions would contain more than a single subaction, but I guess that is not a hard requirement. It would just be more common and easier to understand.

> Question 1: Is kb:action-2 an InvestigativeAction? If so, and if all InvestigativeActions need to generate a ProvenanceRecord, how do the members of its ProvenanceRecord relate to the parent action's ProvenanceRecord?

Given that InvestigativeAction is defined as "An investigative action is something that may be done or performed within the context of an investigation, typically to examine or analyze evidence or other data.", it seems logical that any subaction of an InvestigativeAction should likely also be an InvestigativeAction.
The ProvenanceRecords of the subactions would relate to the ProvenanceRecord of the overall action as described in my comment above (overall ProvenanceRecord would be a union of contents of subaction ProvenanceRecords and likely should include the subaction ProvenanceRecords themselves).

> Question 2: What are the members of the output ProvenanceRecord, kb:provenance-record-o1?

That would depend entirely on what the actual actions were here, which is not clearly specified.
In a simple case like the example, where the overall action contains only a single subaction (which, as described above, should likely be an InvestigativeAction with its own ProvenanceRecord) and does nothing else besides that subaction, the members of kb:provenance-record-o1 would likely be the ProvenanceRecord resulting from action-2 along with its contents.

> Question 2.1: Is kb:jpeg-i1 a member, recording that it was seen and/or handled?

Again, I think that depends on the nature of the actual action in action-1, which is not specified.
If it did not change kb:jpeg-i1 in any way then it should not be included in kb:provenance-record-o1 as it is only an input to action-1 and not an output.
If it did change kb:jpeg-i1 in any way then it should be included in kb:provenance-record-o1 as it is not only an input but also an output.

> Question 2.2: Is kb:ocr-text-results-file-1 in kb:provenance-record-o1? Is the answer to this influenced by whether kb:action-2 is or is not an InvestigativeAction?

As described in the above comments, yes, it would be in kb:provenance-record-o1, because action-2 should be an InvestigativeAction with its own ProvenanceRecord (let's call it kb:provenance-record-o2). kb:provenance-record-o2 would contain kb:ocr-text-results-file-1, and kb:provenance-record-o1 would contain both kb:provenance-record-o2 and kb:ocr-text-results-file-1.
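Applied to the example graph, that answer could be rendered roughly as below. This is a sketch of the reading given in this comment, not an agreed encoding. It assumes kb:jpeg-i1 is unchanged by the actions (so it is left out of both records), and the prefix IRIs are assumed from the UCO 1.x and CASE 1.x namespace pattern.

```turtle
# Assumed namespace IRIs; verify against the ontology release in use.
@prefix case-investigation: <https://ontology.caseontology.org/case/investigation/> .
@prefix uco-action: <https://ontology.unifiedcyberontology.org/uco/action/> .
@prefix uco-core: <https://ontology.unifiedcyberontology.org/uco/core/> .
@prefix kb: <http://example.org/kb/> .

# The subaction, retyped as an InvestigativeAction with its own record.
kb:action-2
	a case-investigation:InvestigativeAction ;
	uco-core:description "Store any OCR-recognized text in a file." ;
	uco-action:object kb:jpeg-i1 ;
	uco-action:result
		kb:ocr-text-results-file-1 ,
		kb:provenance-record-o2
		;
	.

kb:provenance-record-o2
	a case-investigation:ProvenanceRecord ;
	uco-core:object kb:ocr-text-results-file-1 ;
	.

# The parent action's record: the subaction's record plus its contents.
kb:provenance-record-o1
	a case-investigation:ProvenanceRecord ;
	uco-core:object
		kb:ocr-text-results-file-1 ,
		kb:provenance-record-o2
		;
	.
```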