semagrow / semagrow

A SPARQL query federator of heterogeneous data sources

Home Page: https://semagrow.github.io

Execution plan reproducibility

stasinos opened this issue

In a situation like this:

BindJoin
  Plan@local-semagrow[costs [5020.05257,0] 2005257 tuples]
    ...
  Plan@local-semagrow[costs [5020.05257,0] 2005257 tuples]
    ...

where both subqueries have exactly the same cost, the optimizer can order the sub-trees of the BindJoin differently in different runs. Strictly speaking this is not a mistake, but it would be preferable for the planner to produce identical plans for the same inputs every time it is executed.

We observed this behaviour while looking into Issue #49. That issue was resolved by correcting the cost estimator, but the reproducibility problem remains whenever two sub-plans really do have the same cost.
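
One possible way to get reproducible plans is a deterministic tie-break when costs are equal. The sketch below is only an illustration and is not taken from the Semagrow code base: `CandidatePlan`, `canonicalForm` and `STABLE_ORDER` are hypothetical names, and the canonical strings are made-up stand-ins for some stable serialization of each sub-plan. It assumes such a serialization is available and differs between distinct sub-plans.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DeterministicPlanOrder {

    // Hypothetical stand-in for a candidate sub-plan; not Semagrow's own plan class.
    public static final class CandidatePlan {
        private final String canonicalForm; // assumed: a stable textual form of the sub-plan
        private final double cost;
        private final long cardinality;

        public CandidatePlan(String canonicalForm, double cost, long cardinality) {
            this.canonicalForm = canonicalForm;
            this.cost = cost;
            this.cardinality = cardinality;
        }

        public String getCanonicalForm() { return canonicalForm; }
        public double getCost() { return cost; }
        public long getCardinality() { return cardinality; }
    }

    // Order by cost first. When costs tie, fall back to cardinality and then to
    // the canonical form, so the result does not depend on the input order.
    public static final Comparator<CandidatePlan> STABLE_ORDER =
            Comparator.comparingDouble(CandidatePlan::getCost)
                      .thenComparingLong(CandidatePlan::getCardinality)
                      .thenComparing(CandidatePlan::getCanonicalForm);

    // Returns a sorted copy; the caller's list is left untouched.
    public static List<CandidatePlan> order(List<CandidatePlan> candidates) {
        List<CandidatePlan> sorted = new ArrayList<>(candidates);
        sorted.sort(STABLE_ORDER);
        return sorted;
    }

    public static void main(String[] args) {
        // Same cost and cardinality as in the plan output above; the patterns are invented.
        CandidatePlan a = new CandidatePlan("?s <p1> ?o", 5020.05257, 2005257L);
        CandidatePlan b = new CandidatePlan("?s <p2> ?o", 5020.05257, 2005257L);

        // Feed the two candidates in both orders and compare the resulting orderings.
        List<CandidatePlan> first = order(Arrays.asList(a, b));
        List<CandidatePlan> second = order(Arrays.asList(b, a));
        System.out.println(first.equals(second));
    }
}
```

With a tie-break like this, the order of the BindJoin sub-trees would be fixed by the sub-plans themselves rather than by whatever order the candidates happen to be enumerated in. Whether a canonical string is the right final key for Semagrow is an open design choice; any total order over sub-plans that is stable across runs would serve the same purpose.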