qgis / QGIS-Enhancement-Proposals

QEPs (QGIS Enhancement Proposals) are used in the process of creating and discussing new enhancements for QGIS.

Buffered Transactional Editing

m-kuhn opened this issue

Date 2020/12/03

Author Matthias Kuhn

Contact matthias@opengis.ch

Maintainer @m-kuhn

Version QGIS 3.18 or 3.20

Summary

A new "buffered transactions" editing mode for QGIS is added.
With this edit mode, all editable layers are toggled synchronously and all edits are saved in a local edit buffer.
Saving changes is executed within a single transaction on all layers (per provider).

Reasoning / limitations of the status quo

QGIS currently supports two types of editing.
They both have their advantages and disadvantages.

Local edit buffer

Edits are buffered locally before being sent to the data provider.
The user saves each layer individually by toggling the edit mode.

For flat layers (e.g. simple shapefiles) this works very well.

When providers with foreign keys (parent/child relations, etc.) come into play,
things become more complex because layers can no longer be treated independently.

  • A user has to know very well in which order layers have to be saved.
  • For editing data on multiple layers (commonly parent/child relationships)
    many layers have to be put into edit mode (more clicking).
  • Even on a single layer there is currently no transaction safety.
    For example, take a PostgreSQL table with lines and a constraint on a field: if a line is split and the
    newly created part does not fulfill the constraint, the existing line will be
    shortened but the new one will not be added (there was an issue I can't find right now).
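
The split-line case above can be reproduced in miniature. The following is a minimal sketch, assuming SQLite as a stand-in for the PostgreSQL table and a NOT NULL constraint as the violated constraint; the table layout and the helper names `fresh_db` and `split_line` are made up for illustration and are not QGIS API.

```python
import sqlite3


def fresh_db():
    """Create an in-memory table holding one line, with a constraint on 'kind'."""
    # isolation_level=None puts sqlite3 in autocommit mode, so a transaction
    # only exists where BEGIN/COMMIT/ROLLBACK is issued explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE lines (id INTEGER PRIMARY KEY, length REAL, kind TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO lines VALUES (1, 10.0, 'pipe')")
    return conn


def split_line(conn, use_transaction):
    """Shorten line 1, then try to add the split-off part, which breaks the constraint.

    Returns (stored length of line 1, number of stored rows).
    """
    try:
        if use_transaction:
            conn.execute("BEGIN")
        conn.execute("UPDATE lines SET length = 4.0 WHERE id = 1")
        # The new part has no 'kind', so the NOT NULL constraint rejects it.
        conn.execute("INSERT INTO lines (id, length, kind) VALUES (2, 6.0, NULL)")
        if use_transaction:
            conn.execute("COMMIT")
    except sqlite3.IntegrityError:
        if use_transaction:
            conn.execute("ROLLBACK")
    length = conn.execute("SELECT length FROM lines WHERE id = 1").fetchone()[0]
    count = conn.execute("SELECT COUNT(*) FROM lines").fetchone()[0]
    return length, count
```

Without a transaction, `split_line(fresh_db(), False)` leaves the original line shortened to 4.0 with no second part stored; with a transaction, `split_line(fresh_db(), True)` restores the original length of 10.0.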

Transaction groups

Edits are sent to the data provider immediately while editing.
Multiple layers on the same provider are put into a transaction group and can be committed / rolled back
synchronously.

This approach helps with foreign keys and transaction safety. There are a couple of caveats still.

  • The transaction is kept open for a long time. This introduces table locks and therefore prevents
    users from working on the database in parallel, even on different areas of the data. (PostgreSQL)
  • It's hard to impossible to fix up data. Common use cases are
  • This has performance impacts because
    • while editing we have constant I/O going on
    • while an r/w connection is open, we cannot use any other connection, hence no parallel rendering
  • This can completely freeze GeoPackages due to internal locks (e.g. when a "default value" is based on the sqlite_fetch_and_increment expression function).

Proposed Solution

A new buffered transaction mode is introduced as a project configuration.

As in the current transaction mode, multiple layers are put into edit mode as a group.
All editable layers are put into edit mode and committed together (in contrast to transactional editing, where they need to be on the same provider).
All editing is done locally, no writes to the provider occur during editing.

When the user commits the changes

  • resolve layer dependencies through project relations
  • start a new transaction on each involved provider (provider in this context means they share the connection string as in QgsTransaction::connectionString())
  • change fields (add new fields, delete fields)
  • delete all features, in reverse dependency order (children first)
  • add all features, in forward dependency order (parents first)
  • change all attributes in reverse dependency order (children first)
  • change all geometries in reverse dependency order (children first; ideally this step and the previous one would be merged, but that is out of scope for this discussion)
  • if everything went well, commit
    • if commits went well discard edit buffer
  • if there was a problem, rollback
    • the edit buffer is unchanged
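
The ordering in the steps above can be sketched outside of QGIS. This is a minimal illustration, assuming layers are identified by name and project relations are given as (child, parent) pairs; `group_by_connection` and `commit_plan` are hypothetical helpers, not part of the proposed API. The proposal does not state an order for field changes, so parents first is an assumption here.

```python
from graphlib import TopologicalSorter


def group_by_connection(connections):
    """Map each connection string to the sorted layers sharing it (one transaction each)."""
    groups = {}
    for layer, connection in connections.items():
        groups.setdefault(connection, []).append(layer)
    return {connection: sorted(layers) for connection, layers in groups.items()}


def commit_plan(layers, relations):
    """Return the ordered (operation, layer) steps for one commit.

    layers: names of all layers in edit mode.
    relations: (child, parent) pairs taken from the project relations;
    both names are assumed to appear in 'layers'.
    """
    graph = {layer: set() for layer in layers}
    for child, parent in relations:
        # A child can only be stored once its parent exists.
        graph.setdefault(child, set()).add(parent)
    parents_first = list(TopologicalSorter(graph).static_order())
    children_first = parents_first[::-1]

    steps = [
        # Assumption: the proposal gives no order for field changes.
        ("change_fields", parents_first),
        ("delete_features", children_first),
        ("add_features", parents_first),
        ("change_attributes", children_first),
        ("change_geometries", children_first),
    ]
    return [(operation, layer) for operation, order in steps for layer in order]
```

For a chain where `apartment` references `building` and `building` references `parcel`, the plan adds features in the order parcel, building, apartment and deletes them in the opposite order.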

API additions

class QgsEditBufferGroup

A new class that keeps a list of edit buffers that it manages and commits or rolls back together.

QgsVectorLayerEditBuffer::setEditBufferGroup() and QgsVectorLayerEditBuffer::editBufferGroup()

If an edit buffer is part of an edit buffer group, it forwards commit and rollback commands to the group, which then invokes the individual addFeature, deleteFeature, ... operations in the correct order across all contained edit buffers.
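
The all-or-nothing behaviour of such a group can be modelled in a few lines. This is only a sketch under stated assumptions: `FakeProvider` is an invented in-memory stand-in for a transactional provider, `EditBufferGroupSketch` is not the proposed QgsEditBufferGroup class, and only added features are buffered.

```python
class FakeProvider:
    """In-memory stand-in for a transactional data provider (hypothetical)."""

    def __init__(self):
        self.stored = {}      # layer name -> list of committed features
        self._pending = None  # working copy while a transaction is open

    def begin(self):
        self._pending = {layer: list(rows) for layer, rows in self.stored.items()}

    def add_feature(self, layer, feature):
        # Invented constraint: features on the "child" layer need a parent_id.
        if layer == "child" and feature.get("parent_id") is None:
            raise ValueError("child feature needs a parent_id")
        self._pending.setdefault(layer, []).append(feature)

    def commit(self):
        self.stored, self._pending = self._pending, None

    def rollback(self):
        self._pending = None


class EditBufferGroupSketch:
    """Collects buffered edits for several layers and commits them together."""

    def __init__(self, provider):
        self.provider = provider
        self.buffers = {}  # layer name -> list of buffered new features

    def add_feature(self, layer, feature):
        """Buffer an edit locally; nothing reaches the provider yet."""
        self.buffers.setdefault(layer, []).append(feature)

    def commit_changes(self):
        """Write all buffers in one transaction; keep them if anything fails."""
        self.provider.begin()
        try:
            for layer, features in self.buffers.items():
                for feature in features:
                    self.provider.add_feature(layer, feature)
        except ValueError:
            self.provider.rollback()
            return False  # the edit buffers are left unchanged
        self.provider.commit()
        self.buffers.clear()  # only discard buffers after a successful commit
        return True
```

A failed commit leaves both the provider and the buffers as they were, so the user can correct the offending feature and commit again.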

QgsMapLayer::setProject() and QgsMapLayer::project()

QgsMapLayer will receive knowledge of the project it is in. When a layer is registered in a project, the parent project will be set on the layer.

The project reference will be used in QgsVectorLayer::startEditing(), QgsVectorLayer::commitChanges() and QgsVectorLayer::rollbackChanges() to forward edit requests to their QgsProject equivalents. These methods will be guarded against recursion (since they will be called by QgsProject::... as well).
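
One way to picture the recursion guard is a flag that is set while the project is dispatching to its layers. The sketch below is an assumption about how this could look, not the actual implementation: `ProjectSketch` and `LayerSketch` are invented names, the guard is kept on the project for brevity, and only startEditing is modelled.

```python
class ProjectSketch:
    """Hypothetical project that toggles all of its layers together."""

    def __init__(self):
        self.layers = []
        self.dispatching = False  # recursion guard

    def add_layer(self, layer):
        layer.project = self  # the layer learns which project it is in
        self.layers.append(layer)

    def start_editing(self):
        """Put every registered layer into edit mode."""
        self.dispatching = True
        try:
            for layer in self.layers:
                layer.start_editing()
        finally:
            self.dispatching = False


class LayerSketch:
    """Hypothetical layer that forwards edit requests to its project."""

    def __init__(self, name):
        self.name = name
        self.project = None
        self.editable = False
        self.start_count = 0  # how often the local work actually ran

    def start_editing(self):
        project = self.project
        if project is not None and not project.dispatching:
            # Called by the user: let the project toggle all layers.
            project.start_editing()
            return
        # Called by the project (or no project set): do the local work once.
        self.editable = True
        self.start_count += 1
```

Calling `start_editing()` on any one layer of a project puts all of its layers into edit mode exactly once, while a layer without a project behaves as before.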

QgsProject::startEditing( layer ), QgsProject::commitChanges( layer ), QgsProject::rollbackChanges( layer )

These methods will start editing (or commit/roll back) either a single layer or all editable layers, depending on QgsProject::transactionMode().
They should be used in the future as the main entry point for editing layers in a project. The current way will keep working though.
They will create a new QgsEditBufferGroup if appropriate and add to it the edit buffers of any layers that have been put into edit mode.
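
How the set of affected layers could depend on the transaction mode is sketched below. The enum values and the `layers_to_edit` helper are illustrative assumptions; the proposal only names QgsProject::transactionMode() and does not list its values.

```python
from enum import Enum


class TransactionMode(Enum):
    """Illustrative mode names, not taken from the proposal."""

    DISABLED = "local edit buffer per layer"
    AUTOMATIC_GROUPS = "transaction groups"
    BUFFERED_GROUPS = "buffered transactions"


def layers_to_edit(mode, requested, connections):
    """Return the sorted names of the layers to put into edit mode.

    requested: the layer the user toggled.
    connections: layer name -> connection string, for every editable layer.
    """
    if mode is TransactionMode.BUFFERED_GROUPS:
        # All editable layers, regardless of provider.
        return sorted(connections)
    if mode is TransactionMode.AUTOMATIC_GROUPS:
        # Only the layers sharing the requested layer's connection.
        target = connections[requested]
        return sorted(name for name, conn in connections.items() if conn == target)
    return [requested]
```

With two PostgreSQL layers on one connection and one GeoPackage layer, toggling a PostgreSQL layer affects one, two, or all three layers depending on the mode.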

Limitations

  • When an involved provider does not support transactions (shapefiles, Excel files, etc.), it is not possible to roll back if committing fails on another layer. In this case the layer might end up with stored data after an incomplete commit.
  • When a project has circular dependencies through foreign keys, we are not able to completely resolve the layer save order into a "correct" order. In this case the provider is required to be tolerant (e.g. deferred constraint checks).
    It would be possible to track dependencies down to individual features, but even then circular dependencies could remain. In the end we try our best to handle the trivial cases and have to rely on the user for the complex scenarios.
  • In contrast to the existing transaction mode, side effects introduced on the provider through triggers etc. are not immediately visible.
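
The circular-dependency limitation can be made concrete with a small check. This is a sketch, assuming relations are given as (child, parent) layer-name pairs; `save_order` is a hypothetical helper. It uses the standard library's topological sorter, which reports a cycle instead of producing an order.

```python
from graphlib import CycleError, TopologicalSorter


def save_order(relations):
    """Return (parents-first order, None), or (None, cycle) if relations are circular."""
    sorter = TopologicalSorter()
    for child, parent in relations:
        sorter.add(child, parent)  # the child depends on the parent
    try:
        return list(sorter.static_order()), None
    except CycleError as error:
        # args[1] lists the layers forming the cycle; first and last are the same.
        return None, error.args[1]
```

When a cycle is reported, no save order satisfies every foreign key, which is the case where the provider has to be tolerant (e.g. deferred constraint checks).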

Performance Implications

During editing, performance is equal to that of the current edit buffer.
Performance is better than with transaction groups during editing since nothing needs to be stored on the data provider. Parallel rendering also remains enabled.

Backwards Compatibility

This is a new mode which is opt-in.

Issue Tracking ID(s)

(optional)

Votes

(required)

Hi Matthias, great idea !

Can you elaborate why QgsMapLayer::setProject() is necessary?

If mapLayer->startEditing() is called, it needs to forward this to the project so it can put all other editable layers into edit mode. Same for commitChanges and rollbackChanges. I think having not only a link from project to "child" layers but also from layers to "parent" project has its advantages for context.

There would be other approaches too which can be discussed if this one has unforeseen drawbacks. Do you see any specific drawbacks @elpaso ?

Sidenote: I also could imagine it will be handy to compile get_feature, aggregate and other join-based functionality in the future which could give a tremendous speed improvement in some cases.

An alternative approach would be to use the existing QObject parent's (QgsMapLayerStore) parent (QgsProject). That could live without a new member and would work "reasonably well" if protected by a solid set of unit tests.

@m-kuhn I don't think it's a good idea to have a cyclic dependency between project and map layers.

I'm -1 to add a project member to map layers.

Can't you just handle this at the application level?

@elpaso the pull request qgis/QGIS#40745 avoids the member while keeping the logic contained, does that work for you?

I am worried that keeping this on the app level will make it a fragile construct with connections between signals and slots that will add public api for internal reasons, it will be hard to debug and will be hard to maintain. If somehow possible, I'd like to avoid this unless there is a very good reason.

@elpaso the pull request qgis/QGIS#40745 avoids the member while keeping the logic contained, does that work for you?

Yes! Sorry, I didn't look at the code and I thought it was a member.

Thanks. Your support is appreciated.

Can we call for a vote on this?

LGTM in general.

I'd like to know how you plan to handle the UX in case some layers belong to providers that do not support rollback. I think the user should be warned about the potential data corruption in that case.

This is a good idea, +1 for me