Buffered Transactional Editing

m-kuhn opened this issue 4 years ago · comments

Matthias Kuhn commented 4 years ago

Date 2020/12/03

Author Matthias Kuhn

Contact matthias@opengis.ch

maintainer @m-kuhn

Version QGIS 3.18 or 3.20

Summary

A new "buffered transactions" editing mode for QGIS is added.
With this edit mode, all editable layers are toggled synchronously and all edits are saved in a local edit buffer.
Saving changes is executed within a single transaction on all layers (per provider).

Reasoning / limitations of the status quo

QGIS currently supports 2 types of editing.
They both have their advantages and disadvantages.

Local edit buffer

Edits are buffered locally before being sent to the data provider.
The user saves each layer individually by toggling the edit mode.

For flat layers (e.g. simple shapefiles) this works very well.

When providers with foreign keys (parent/child relations, etc.) come into play,
things become more complex because layers can no longer be treated independently.

- A user has to know very well in which order layers have to be saved.
- For editing data on multiple layers (commonly parent/child relationships), many layers have to be put into edit mode (more clicking).
- Even on a single layer there is currently no transaction safety. E.g. a PostgreSQL table with a line and a constraint on a field: if a line is split and the newly created part does not fulfill the constraint, the existing line will be shortened but the new one not added (there was an issue I can't find right now).

Transaction groups

Edits are sent to the data provider immediately while editing.
Multiple layers on the same provider are put into a transaction group and can be committed / rolled back
synchronously.

This approach helps with foreign keys and transaction safety. There are a couple of caveats still.

The transaction is kept open for a long time. This introduces table locks and therefore prevents
users from working on the database in parallel, even on different areas of the data (PostgreSQL).
It's hard or even impossible to fix up data. Common use cases are
- adding a new row through the attribute table
- copy/pasting features, worked around through
This has performance impacts because
- while editing we have constant I/O going on
- while an r/w connection is open, we cannot use any other connection, hence no parallel rendering
This can completely freeze GeoPackages due to internal locks (e.g. when a "default value" is based on the sqlite_fetch_and_increment expression function).

Proposed Solution

A new buffered transaction mode is introduced as a project configuration.

As in the current transaction mode, multiple layers are put into edit mode in a grouped mode.
All editable layers are put into edit mode and committed in parallel (in contrast to the transactional editing, where they need to be on the same provider).
All editing is done locally, no writes to the provider occur during editing.

When the user commits the changes:

- resolve layer dependencies through project relations
- start a new transaction on each involved provider (provider in this context means they share the connection string as in QgsTransaction::connectionString())
- change fields (add new fields, delete fields)
- delete all features, in reverse dependency order (children first)
- add all features, in forward dependency order (parents first)
- change all attributes in reverse dependency order (children first)
- change all geometries in reverse dependency order (children first, ideally this and the step before are merged, out of scope for this discussion)
- if everything went well, commit
  - if commits went well discard edit buffer
- if there was a problem, rollback
  - the edit buffer is unchanged
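
The commit sequence above could be sketched roughly as follows. This is not the QGIS implementation: `FakeProvider`, `save_order` and `commit_buffered` are hypothetical names, layers are plain strings, only a single provider is modelled (the proposal opens one transaction per provider), and the order used for the field changes is an assumption since the proposal does not specify one. The dependency order is derived with the standard library's `graphlib`.

```python
from graphlib import TopologicalSorter


class FakeProvider:
    """Hypothetical stand-in for one provider connection; records the calls it gets."""

    def __init__(self, fail_on=None):
        self.log = []
        self.fail_on = fail_on  # optional (step, layer) pair that should fail

    def begin(self):
        self.log.append("begin")

    def apply(self, step, layer):
        if (step, layer) == self.fail_on:
            raise RuntimeError(f"{step} failed on {layer}")
        self.log.append(f"{step}:{layer}")

    def commit(self):
        self.log.append("commit")

    def rollback(self):
        self.log.append("rollback")


def save_order(relations):
    """Return layers parents first. `relations` maps child layer -> set of parent layers."""
    return list(TopologicalSorter(relations).static_order())


def commit_buffered(provider, relations):
    """Apply all buffered edits in one transaction; return True if committed."""
    parents_first = save_order(relations)
    children_first = list(reversed(parents_first))
    steps = [
        ("change_fields", parents_first),  # order assumed, not given in the proposal
        ("delete_features", children_first),
        ("add_features", parents_first),
        ("change_attributes", children_first),
        ("change_geometries", children_first),
    ]
    provider.begin()
    try:
        for step, layers in steps:
            for layer in layers:
                provider.apply(step, layer)
    except RuntimeError:
        provider.rollback()
        return False  # the edit buffer stays untouched
    provider.commit()
    return True  # the caller may now discard the edit buffer
```

With `relations = {"child": {"parent"}}`, deletes reach `child` before `parent` and inserts reach `parent` before `child`, so foreign keys are satisfied at every step of the transaction.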

API additions

`class QgsEditBufferGroup`

A new class that keeps a list of edit buffers that it manages and commits or rolls back together.

`QgsVectorLayerEditBuffer::setEditBufferGroup()` and `QgsVectorLayerEditBuffer::editBufferGroup()`

If an edit buffer is part of an edit buffer group, it forwards commit and rollback commands to the group, which invokes the individual addFeature, deleteFeature, ... calls in the correct order across all contained edit buffers.
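
As a rough illustration of this forwarding, here is a minimal sketch in plain Python. The classes `EditBufferGroup` and `EditBuffer` are hypothetical stand-ins, not the proposed C++ classes, and the ordering of individual operations across buffers is left out for brevity.

```python
class EditBufferGroup:
    """Hypothetical stand-in for the proposed group of edit buffers."""

    def __init__(self):
        self.buffers = []

    def add(self, buffer):
        self.buffers.append(buffer)
        buffer.group = self

    def commit_changes(self):
        # A real implementation would interleave add/delete/change calls
        # across buffers in dependency order; here each buffer is flushed in turn.
        for buffer in self.buffers:
            buffer.commit_local()
        return True

    def rollback_changes(self):
        for buffer in self.buffers:
            buffer.rollback_local()


class EditBuffer:
    """Hypothetical stand-in for a per-layer edit buffer."""

    def __init__(self):
        self.group = None
        self.pending = []
        self.saved = []

    def commit_changes(self):
        if self.group is not None:
            return self.group.commit_changes()  # forward to the group
        self.commit_local()
        return True

    def rollback_changes(self):
        if self.group is not None:
            self.group.rollback_changes()  # forward to the group
        else:
            self.rollback_local()

    def commit_local(self):
        self.saved.extend(self.pending)
        self.pending.clear()

    def rollback_local(self):
        self.pending.clear()
```

Committing any one buffer in a group then flushes every buffer in that group, while a buffer without a group behaves as it does today.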

`QgsMapLayer::setProject()` and `QgsMapLayer::project()`

QgsMapLayer will receive knowledge of the project it is in. When a layer is registered in a project, the parent project will be set on the layer.

Will be used in QgsVectorLayer::startEditing(), QgsVectorLayer::commitChanges(), QgsVectorLayer::rollbackChanges() to forward edit requests to their QgsProject equivalent. They will be recursion guarded (since they will be called by QgsProject::... as well).
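
The recursion guard mentioned here could look roughly like the sketch below. `Project` and `Layer` are hypothetical plain-Python stand-ins, and the `forwarding` flag is just one possible way to implement such a guard, not necessarily the one used in QGIS.

```python
class Project:
    """Hypothetical stand-in for a project that owns layers."""

    def __init__(self):
        self.layers = []
        self.forwarding = False  # recursion guard
        self.forwarded_calls = 0

    def add_layer(self, layer):
        self.layers.append(layer)
        layer.project = self  # the layer learns which project it belongs to

    def start_editing(self):
        if self.forwarding:
            return  # already in progress, do not re-enter
        self.forwarding = True
        try:
            self.forwarded_calls += 1
            for layer in self.layers:
                layer.start_editing()
        finally:
            self.forwarding = False


class Layer:
    """Hypothetical stand-in for a vector layer."""

    def __init__(self):
        self.project = None
        self.editable = False

    def start_editing(self):
        project = self.project
        if project is not None and not project.forwarding:
            # Called by the user: let the project toggle all layers.
            project.start_editing()
            return self.editable
        # Called by the project (or no project at all): do the actual work.
        self.editable = True
        return True
```

Calling `start_editing()` on a single layer puts every layer of its project into edit mode, and the project-level call runs exactly once.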

`QgsProject::startEditing( layer )`, `QgsProject::commitChanges( layer )`, `QgsProject::rollbackChanges( layer )`

Will start editing (or commit/rollback) either a single layer or all editable layers, depending on QgsProject::transactionMode().
Should be used in the future as the main entry point for editing layers in a project. The current way will keep working though.
Will create a new QgsEditBufferGroup if appropriate and add to it the edit buffers of any layers that have been put into edit mode.

Limitations

When an involved provider does not support transactions (shapefiles, Excel files, etc.), it is not possible to roll back if committing fails on another layer. In this case the layer might end up with stored data after an incomplete commit.
When a project has circular dependencies through foreign keys, we are not able to completely resolve the layer save order into a "correct" order. In this case the provider is required to be tolerant (e.g. deferred constraint checks).
It would be possible to track dependencies down to individual features, but even there could potentially be remaining circular dependencies. In the end we try our best to handle the trivial cases and have to rely on the user for the complex scenarios.
In contrast to the existing transaction mode, side effects introduced on the provider through triggers etc. are not immediately visible.
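
For the circular dependency limitation, a layer-level check could be sketched as below. The function name `resolve_save_order` and the string layer names are hypothetical. The sketch relies on `graphlib.CycleError` from the Python standard library, which carries the detected cycle as the second element of its `args`.

```python
from graphlib import CycleError, TopologicalSorter


def resolve_save_order(relations):
    """Return (order, cycle) for `relations`, a map of child layer -> set of parent layers.

    `order` lists layers parents first, or is None if a cycle prevents ordering.
    `cycle` is None, or the list of layers forming the detected cycle.
    """
    try:
        return list(TopologicalSorter(relations).static_order()), None
    except CycleError as err:
        # args[1] holds the cycle; its first and last entries are the same layer.
        return None, err.args[1]
```

When a cycle is reported, the caller could fall back to an arbitrary order and rely on the provider being tolerant, for example through deferred constraint checks, as described above.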

Performance Implications

During editing, performance is equal to the current edit buffer.
Performance is better than with transaction groups during editing since nothing needs to be stored on the data provider. Also, parallel rendering is still enabled.

Backwards Compatibility

This is a new mode which is opt-in.

Issue Tracking ID(s)

(optional)

Votes

(required)

Régis Haubourg · Answer 1 · Thu Dec 03 2020 16:24:18 GMT+0800 (China Standard Time)

Hi Matthias, great idea !

Alessandro Pasotti · Answer 2 · Fri Dec 11 2020 19:17:42 GMT+0800 (China Standard Time)

Can you elaborate why QgsMapLayer::setProject() is necessary?

Matthias Kuhn · Answer 3 · Fri Dec 11 2020 20:02:33 GMT+0800 (China Standard Time)

If mapLayer->startEditing() is called, it needs to forward this to the project so it can put all other editable layers into edit mode. Same for commitChanges and rollbackChanges. I think having not only a link from project to "child" layers but also from layers to "parent" project has its advantages for context.

There would be other approaches too which can be discussed if this one has unforeseen drawbacks. Do you see any specific drawbacks @elpaso ?

Sidenote: I also could imagine it will be handy to compile get_feature, aggregate and other join-based functionality in the future which could give a tremendous speed improvement in some cases.

Matthias Kuhn · Answer 4 · Tue Dec 22 2020 02:54:41 GMT+0800 (China Standard Time)

An alternative approach would be to use the existing QObject parent's (QgsMapLayerStore) parent (QgsProject). That could live without a new member and would work "reasonably well" if protected by a solid set of unit tests.

Alessandro Pasotti · Answer 5 · Thu Dec 24 2020 01:54:53 GMT+0800 (China Standard Time)

@m-kuhn I don't think it's a good idea to have a cyclic dependency between project and map layers.

I'm -1 to add a project member to map layers.

Can't you just handle this at the application level?

Matthias Kuhn · Answer 6 · Thu Dec 24 2020 02:05:56 GMT+0800 (China Standard Time)

@elpaso the pull request qgis/QGIS#40745 avoids the member while keeping the logic contained, does that work for you?

I am worried that keeping this on the app level will make it a fragile construct with connections between signals and slots that will add public api for internal reasons, it will be hard to debug and will be hard to maintain. If somehow possible, I'd like to avoid this unless there is a very good reason.

Alessandro Pasotti · Answer 7 · Thu Dec 24 2020 02:11:26 GMT+0800 (China Standard Time)

> @elpaso the pull request qgis/QGIS#40745 avoids the member while keeping the logic contained, does that work for you?

Yes! Sorry, I didn't look at the code and I thought it was a member.

Matthias Kuhn · Answer 8 · Thu Dec 24 2020 02:19:03 GMT+0800 (China Standard Time)

Thanks. Your support is appreciated.

Denis Rouzaud · Answer 9 · Tue Oct 26 2021 21:50:06 GMT+0800 (China Standard Time)

can we call for a vote on this?

Alessandro Pasotti · Answer 10 · Tue Oct 26 2021 21:57:22 GMT+0800 (China Standard Time)

LGTM in general.

I'd like to know how you plan to handle the UX in case some layers belong to providers that do not support rollback. I think the user should be warned about the potential data corruption in that case.

Julien Cabieces · Answer 11 · Wed Oct 27 2021 14:32:31 GMT+0800 (China Standard Time)

This is a good idea, +1 for me
