Updating relations gets gradually slower

Question

Updating relations gets gradually slower

SignIn-SignUp opened this issue 4 years ago · comments

I could reproduce this in Haskell Code as well as in tutd.
When updating a relvar like the the one created by the following commands:

products :: {name Text, age Integer}
products := relation{tuple{name "Mike",age 6},tuple{name "Sam",age 10}}
key products_key_constraint {name} products

And then updating it with that command alternating the value we want to set age to:

update products where name="Sam" (age := 11)

The updates get gradually slower after a few updates. In my usecase the same result without the performance decrease can be achieved (at least when using it in code) by deleting the tuple and then inserting the "updated" tuple.

When profiling the majority of time is spent in ProjectM36.Relation.mkRelation.duplicateAttrNames and ProjectM36.Attribute.attributesDifference. But I am guessing that this is normal and maybe unrelated although in some extreme cases the percentage was about 40%. The slowdown feels almost exponential or at least of a high polynomial complexity.

Is this intentional or am I just using something wrong ?
Does this issue arise for others as well ?

AgentM · Answer 1 · Mon Apr 19 2021 02:06:48 GMT+0800 (China Standard Time)

Indeed! You have discovered a major architectural difference of Project:M36 when compared to other DBMS products. This slowdown is a consequence of Project:M36's novel architecture which maximizes laziness and data independence. You can read more about the laziness in this documentation.

The architecture calls for a caching system based on memoization to mitigate the laziness, but it is not yet implemented. In the meantime, the mental model for this slowdown is that the idempotent database updates "stack" on each other, i.e. all previous updates necessary to generate the answer to the query are re-executed.

SignIn-SignUp · Answer 2 · Mon Apr 19 2021 17:02:26 GMT+0800 (China Standard Time)

Ok, thank you! Thats eye opening.
Is this comparable to the logical logging in some other systems wehere the operation is saved but not the corresponding values?
Is deleting and inserting a valid alternative at the moment if performance is somewhat needed?

AgentM · Answer 3 · Tue Apr 20 2021 00:03:41 GMT+0800 (China Standard Time)

Yes, that's a similar concept except that Project:M36 logs the actual updates and only materializes the necessary tuples (often a subset of tuples needed only to service the query-at-hand) on demand. This allows Project:M36 to advertise commits in constant time O(1) instead of O(n) where n is the number of tuples altered.

As a consequence of this choice, however, all updates must be idempotent (repeatable) and this is strongly enforced which should make replication also much easier to reason about.

SignIn-SignUp · Answer 4 · Thu Apr 22 2021 18:44:57 GMT+0800 (China Standard Time)

Thank you! I will close the issue then.

AgentM · Answer 5 · Thu Apr 22 2021 23:56:28 GMT+0800 (China Standard Time)

No problem. The current non-cached version is not ideal, so we hope to improve this in an upcoming release shortly. Feel free to join in IRC at chat.freenode.net #project-m36 as well