How to Model Semi-Persistent Facts

Question

PhilipLukertWork opened this issue 2 years ago · comments

Semi-Persistent Facts

I would like to share some knowledge on how to model semi-persistent facts. Maybe they find their way into the manual at some point or (ideally) they are implemented directly in Tamarin.

Use Case

Persistent facts are read-only and never change, while linear facts are consumed when they are read. We often need something in between: for example, a storage cell with an identifier ~id whose value we sometimes want to read and sometimes want to change.

Implementation with Linear Facts (not recommended)

With linear facts, the Read rule has to re-insert the exact same fact so that reading does not destroy it:

rule Init:  [Fr(~id), In(val)]           --> [Cell(~id, val)]
rule Read:  [Cell(~id, val)]             --> [Cell(~id, val), Out(val)]
rule Write: [Cell(~id, oldVal), In(val)] --> [Cell(~id, val)]

This has one big disadvantage: in proofs, we often need to find the source of a value, for example "where did that secret key come from?". If there are Read steps between the write and the current point, Tamarin is easily trapped in a loop or has to unfold many Read steps before finding the source, i.e. the rule where the value was actually written.

Implementation with Persistent Facts and Restrictions

The other idea is to use persistent facts as follows:

rule Init:
    [Fr(~id), In(val)]
    --[WriteCell(~id, val)]->
    [!Cell(~id, val)]

rule Read:
    [!Cell(~id, val)]
    --[ReadCell(~id, val)]->
    [Out(val)]

rule Write:
    [In(~id), In(val)]
    --[WriteCell(~id, val)]->
    [!Cell(~id, val)]

restriction StorageCell:
    "∀ id  c #i.  ReadCell(id, c)@i
    ==>  ∃   #j. WriteCell(id, c)@j ∧ j<i
      ∧ ¬∃ d #k. WriteCell(id, d)@k ∧ j<k ∧ k<i"

Note that we now have to add action facts to the rules so that the restriction can talk about them.
The restriction then ensures that when reading from a cell, there has to be a WriteCell with the same value which was not overwritten.

That approach works but has some caveats. Notably, the restriction will spawn a WriteCell goal for each ReadCell. This can cause looping problems if ReadCell and WriteCell are in the same rule. But even without that, it creates additional goals which are not needed as there is already a goal that aims to satisfy the source of the persistent fact !Cell(...). A better way to write what we want in a restriction is actually:

restriction StorageCellBetter:
    "∀ id c d #i #j #k. WriteCell(id, c)@i
                      ∧ WriteCell(id, d)@j ∧ i<j
                      ∧  ReadCell(id, c)@k ∧ j<k
    ==> ∃ #l.           WriteCell(id, c)@l ∧ ¬l<j ∧ l<k"

Here we let the persistent fact goals for !Cell(...) do their job until a "bad" situation occurs, where a second WriteCell@j lies in between a WriteCell@i and a ReadCell@k. In that case we require a third WriteCell@l that is not before the second WriteCell@j (it may coincide with it in case of c=d) and that is before the ReadCell@k.

This way of writing the restriction is much less invasive as it triggers only if we have a "bad" situation to "heal" it. Thus, it does not pollute our goals.
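
The pieces above can be assembled into a small theory for experimenting. This is a minimal sketch rather than a tested model: the rules and the restriction are copied from above, while the theory name and the two sanity lemmas (CanReadAfterOverwrite, NoStaleRead) are assumptions added for illustration. The first lemma asks for a trace in which a cell is overwritten and the new value is read. The second states that an overwritten value can only be read if it was written again.

```spthy
theory StorageCellSketch
begin

// Rules and restriction as given above.
rule Init:
    [Fr(~id), In(val)]
    --[WriteCell(~id, val)]->
    [!Cell(~id, val)]

rule Read:
    [!Cell(~id, val)]
    --[ReadCell(~id, val)]->
    [Out(val)]

rule Write:
    [In(~id), In(val)]
    --[WriteCell(~id, val)]->
    [!Cell(~id, val)]

restriction StorageCellBetter:
    "∀ id c d #i #j #k. WriteCell(id, c)@i
                      ∧ WriteCell(id, d)@j ∧ i<j
                      ∧  ReadCell(id, c)@k ∧ j<k
    ==> ∃ #l.           WriteCell(id, c)@l ∧ ¬(l<j) ∧ l<k"

// Hypothetical sanity check: the restriction should not rule out
// overwriting, so a trace "write c, write d, read d" is expected to exist.
lemma CanReadAfterOverwrite:
    exists-trace
    "∃ id c d #i #j #k. WriteCell(id, c)@i
                      ∧ WriteCell(id, d)@j ∧ i<j
                      ∧  ReadCell(id, d)@k ∧ j<k
                      ∧ ¬(c = d)"

// Hypothetical sanity check: if c was overwritten by d and c is read
// afterwards, then either c = d or c was written again after d.
lemma NoStaleRead:
    "∀ id c d #i #j #k. WriteCell(id, c)@i
                      ∧ WriteCell(id, d)@j ∧ i<j
                      ∧  ReadCell(id, c)@k ∧ j<k
    ==> (c = d) ∨ (∃ #l. WriteCell(id, c)@l ∧ j<l ∧ l<k)"

end
```

NoStaleRead relies on every rule carrying at most one WriteCell action per cell; if a rule wrote two different values to the same cell in one step, the case l = j in the restriction would no longer imply c = d.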

Separating cells

Quite often, we want to store multiple values grouped together. For example, on a device ~id we store a long-term key ltk and an ephemeral key eph which is changed in each protocol iteration. We can model this as Cell(~id, ltk, eph). However, if we want to find the source of ltk, we have to trace it back through all changes of eph, which usually requires writing (possibly multiple) loop breaker lemmas.

Instead we can model the same behaviour with separate facts LtkCell(~id, ltk) and EphCell(~id, eph), deciding as needed which one is persistent, semi-persistent or linear. That has the advantage that it is now irrelevant how often we change eph between writing and reading ltk because we directly jump to the source of ltk. Note that you then need a restriction for each Cell name which would be StorageLtkCell and StorageEphCell in this case.
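
A minimal sketch of this separation, under the assumption that ltk is written exactly once (so !LtkCell is an ordinary persistent fact and needs no restriction here) and only eph is semi-persistent. The theory name, the rule names, the action facts WriteEphCell/ReadEphCell and the senc output are all hypothetical and only serve the illustration.

```spthy
theory SeparatedCellsSketch
begin

builtins: symmetric-encryption

// Device setup: ltk is written once, eph gets its first value.
rule InitDevice:
    [Fr(~id), Fr(~ltk), Fr(~eph)]
    --[WriteEphCell(~id, ~eph)]->
    [!LtkCell(~id, ~ltk), !EphCell(~id, ~eph)]

// A new protocol iteration overwrites eph; ltk is not touched.
// The !LtkCell premise only ties the rule to an existing device.
rule RotateEph:
    [!LtkCell(~id, ltk), Fr(~eph)]
    --[WriteEphCell(~id, ~eph)]->
    [!EphCell(~id, ~eph)]

// Reading both cells: the source of ltk is InitDevice directly,
// no matter how often RotateEph fired in between.
rule UseKeys:
    [!LtkCell(~id, ltk), !EphCell(~id, eph)]
    --[ReadEphCell(~id, eph)]->
    [Out(senc(eph, ltk))]

// Same shape as StorageCellBetter, with the EphCell action facts.
restriction StorageEphCell:
    "∀ id c d #i #j #k. WriteEphCell(id, c)@i
                      ∧ WriteEphCell(id, d)@j ∧ i<j
                      ∧  ReadEphCell(id, c)@k ∧ j<k
    ==> ∃ #l.           WriteEphCell(id, c)@l ∧ ¬(l<j) ∧ l<k"

end
```

If ltk could also be overwritten, the LtkCell rules would need their own WriteLtkCell/ReadLtkCell action facts and a StorageLtkCell restriction of the same shape.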

Be aware that this separation has disadvantages as well - especially when trying to prove connections between two separated values. Writing a fitting oracle might also be harder because you have to deal with more goals. However, I found that it helps especially for more complex protocols where I had to write oracles anyway.

A Direct Implementation in Tamarin

There are multiple ways of dealing with the problems of semi-persistence and separation and I'm thinking about ways to implement them. Things that I thought about:

- Adding the action facts is quite nasty and a source of error if they get out of sync with the persistent facts when changing the model.
- Letting the user write the restrictions is a source of error.
- Thus, it would be nice to have a new type of fact (maybe called storage-fact) which does effectively what we do with persistent facts and restrictions.
  - If we use a constraint rewriting rule for that (instead of a restriction), we can directly refer to the rule that satisfied the premise fact instead of using an existential quantification as a hack to refer to it.
- We can automatically detect linear facts that should be rewritten as storage facts and hint the user.
- Automatically (internally) splitting facts is harder because it is not always beneficial, though it is definitely worth looking into. One thought was to write Cell(~id, a, b, _, _, _) if only a and b are read and the others are unimportant for that specific rule.

Cas Cremers · Answer 1 · Fri May 27 2022 00:07:57 GMT+0800 (China Standard Time)

Thanks @PhilipLukertWork !
Two questions:
Q1 - For the "better" version, why not use something like

for all x y #i #j . ( write(x)@i 
                  and read(y)@j and i<j 
                  and x!=y ) 
=> exist y #k . ( write(y)@k and i<k and k<j )

(or even possibly with the negation version for not k<i), but at least dropping the first write.

Q2 - Should we add a "cell name" constant to the example, to help people wanting to model multiple cells? (I realize it can be model-dependent if this is beneficial.)
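
One possible reading of Q2, as a sketch only: the cell name becomes a public constant in the first argument of both the fact and the action facts, so that a single restriction covers all cells. The theory name, the rule names and the constants 'ltk' and 'eph' are assumptions for illustration and do not come from the issue.

```spthy
theory NamedCellsSketch
begin

// Two cells per device, distinguished by a constant cell name.
rule Init:
    [Fr(~id), Fr(~ltk), In(eph)]
    --[WriteCell('ltk', ~id, ~ltk), WriteCell('eph', ~id, eph)]->
    [!Cell('ltk', ~id, ~ltk), !Cell('eph', ~id, eph)]

// Overwrite only the 'eph' cell of an existing device.
rule WriteEph:
    [!Cell('ltk', ~id, ltk), In(eph)]
    --[WriteCell('eph', ~id, eph)]->
    [!Cell('eph', ~id, eph)]

rule ReadEph:
    [!Cell('eph', ~id, eph)]
    --[ReadCell('eph', ~id, eph)]->
    [Out(eph)]

// One restriction for all cell names: reads and writes only interact
// when both the name and the id agree.
restriction StorageCellNamed:
    "∀ name id c d #i #j #k. WriteCell(name, id, c)@i
                           ∧ WriteCell(name, id, d)@j ∧ i<j
                           ∧  ReadCell(name, id, c)@k ∧ j<k
    ==> ∃ #l.                WriteCell(name, id, c)@l ∧ ¬(l<j) ∧ l<k"

end
```

Whether this is preferable to separate fact names with one restriction each is model-dependent, as noted in the question.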

rkunnema · Answer 2 · Fri May 27 2022 16:02:11 GMT+0800 (China Standard Time)

Hi @PhilipLukertWork

that's super interesting stuff. The restriction StorageCell is basically the axiom SAPIC is using when no Deletes are present. If you want a quick & dirty way of benchmarking StorageCellBetter, you could just edit the restriction called resSetIn and the ones following it in lib/sapic/src/Sapic/Basetranslation.hs; they are string literals. It would allow you to rapidly see how much more efficient this is. If you don't do it, I will definitely give it a try when I get around to it!

On Q2: restriction StorageCell plus separating cells is what SAPIC does (plus deletes). If we want this as a source-to-source translation in Tamarin/MSR we should see:

- whether SAPIC's facilities can be reused,
- that SAPIC can reuse this translation, instead of implementing it again,
- how to integrate the internal translation step properly. The current state is that we have translation being called all over Batch.hs and TheoryLoader.hs, see this issue: tamarin-prover/tamarin-prover#458 . I plan to systematize the way we do internal translations, but I could use some help.

Philip Lukert · Answer 3 · Fri May 27 2022 18:07:56 GMT+0800 (China Standard Time)

@cascremers
Q1: We want the restriction to trigger as few times as possible. Your restriction would be triggered even if we have the trace Write("a") → Write("b") → Read("b") which is not yet a "bad" behaviour.

To be honest, my restriction is also triggered sometimes in "non-bad" behaviours like Write("b") → Write("a") → Write("b") → Read("b"). We cannot avoid that unless we have a means to directly refer to the write exactly before the read (which is why implementing it in constraint rewriting rules gives benefits)
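
To make the comparison concrete, the Q1 suggestion can be spelled against the WriteCell/ReadCell action facts roughly as follows. This is a hedged sketch: the id argument and the name StorageCellQ1 are assumptions, the existential re-binding of y is dropped so that the conclusion refers to the value that was read, and x!=y is moved to the right-hand side as the logically equivalent disjunct (x = y). It is a fragment meant to replace StorageCellBetter in a theory that uses these action facts.

```spthy
// Q1 suggestion in Tamarin syntax (hypothetical name, id argument assumed).
restriction StorageCellQ1:
    "∀ id x y #i #j. WriteCell(id, x)@i
                   ∧  ReadCell(id, y)@j ∧ i<j
    ==> (x = y) ∨ (∃ #k. WriteCell(id, y)@k ∧ i<k ∧ k<j)"

// Trace Write("a") → Write("b") → Read("b"):
//   StorageCellQ1     fires with x="a", y="b"; Write("b") discharges it.
//   StorageCellBetter does not fire: no write follows the Write("b")
//                     that matches the value read.
//
// Trace Write("b") → Write("a") → Write("b") → Read("b"):
//   StorageCellBetter fires with c="b", d="a"; the second Write("b")
//                     discharges it.
```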

Q2: I added a clarifying sentence "Note that you then need a restriction for each Cell name [...]"

@rkunnema
I'll definitely talk to you when I start to implement this (which I plan to do some time in the future). If you didn't already do the benchmarking by then, I'll do that.

Cas Cremers · Answer 4 · Fri May 27 2022 18:21:02 GMT+0800 (China Standard Time)

@PhilipLukertWork I think your example also triggers your rule (for lack of c!=d)

rkunnema · Answer 5 · Mon Jan 23 2023 22:17:22 GMT+0800 (China Standard Time)

This discussion continues in the PR tamarin-prover/tamarin-prover#481