qgis / QGIS-Enhancement-Proposals

QEPs (QGIS Enhancement Proposals) are used in the process of creating and discussing new enhancements for QGIS.

Report of evaluation of migration to Qt-for-Python and proposal for a decision

3nids opened this issue

Context

This is the report of the investigation of switching to Qt-for-Python for QGIS Python bindings.
This is based on the QEP #163

Approach

We decided to trial setting up Qt-for-Python to generate the PyQGIS bindings.
All this work has been done and is available at https://github.com/opengisch/QGIS/tree/qt-for-python-qt5/qt-for-python.

A first try using PySide6 (Qt6) was started by making the code base more compliant with Qt6. This work has been merged to the main branch of the official repo, and thanks to other devs, a set of QGIS core tests is now running with Qt6 as well on CI.
Due to the incomplete state of Qt6 compatibility and amount of work required at that time, we have opted to use PySide2 (Qt5).

A global switch to enable PySide2 is available in the CMake configuration (https://github.com/opengisch/QGIS/blob/qt-for-python-qt5/CMakeLists.txt#L949). PySide2 and PyQt can live at the same time in the code base, so that Qt-for-Python can be evaluated while SIP/PyQt is still maintained.

The goal was to get the code to compile, to be able to call some PyQGIS objects, and to run a test using the new bindings. This would allow us to estimate the effort required to migrate the bindings and spot places where the migration would be problematic.

Quick technical summary of Qt-for-Python

Qt-for-Python is maintained by Qt Group and is available under the same licence as Qt.

Qt-for-Python includes both

  • PySide being the Python bindings of the Qt libraries. Currently QGIS uses PyQt.
  • shiboken being the generator of the bindings for QGIS. Currently QGIS uses SIP. The generator uses both the QGIS core headers and auxiliary files (XML typesystem files)

Technical outcomes

Typesystem files generation

Shiboken, Qt-for-Python's generator, relies on XML auxiliary files to produce the bindings (similarly, sip relies on sip files).

With shiboken, members are automatically detected. This means that the typesystem file only contains:

  • enums
  • parts of the code which require special handling (ownership, argument manipulation, code injection)

The question that arises is how to write or produce these typesystem files.
One possibility is to write them manually, as used to be the case for the sip files in QGIS before sipify was created. But this has proven to be problematic: the files were often outdated and were not written from the start when developing new API endpoints.
The other approach is to produce them automatically based on the content of the header. We can either:

  1. write the whole content of the XML file in the header within #ifdefs
  2. Use a combination of simple macros (similarly to what is done currently, see https://github.com/qgis/QGIS/blob/master/src/core/qgis_sip.h) and #ifdefs for things like code injection. In such case, we then need a tool to produce the XML either using a non-parsing tool like sipify or a tool based on a code parser.

While the simplest approaches sound nice, they would probably prove to be too limiting, mainly for applying global changes, for instance to the API documentation (automatically generating the Python docs from the C++ docs).
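To make option 2 more concrete, here is a minimal sketch of a non-parsing generator in the spirit of sipify. It scans a header for the existing SIP_TRANSFER macro and emits a typesystem fragment. The header excerpt, the class QgsExampleLayer, the function names and the regular expressions are all assumptions made for illustration. The element names (object-type, enum-type, modify-function, modify-argument, define-ownership) follow the Shiboken typesystem documentation, but the output has not been run through shiboken.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical header excerpt; SIP_TRANSFER is the ownership macro from qgis_sip.h.
HEADER = """
class CORE_EXPORT QgsExampleLayer : public QObject
{
  public:
    enum Mode { Fast, Precise };
    void addItem( QgsExampleItem *item SIP_TRANSFER );
    int count() const;
};
"""

CLASS_RE = re.compile(r"class\s+(?:\w+_EXPORT\s+)?(\w+)")
ENUM_RE = re.compile(r"enum\s+(?:class\s+)?(\w+)")
METHOD_RE = re.compile(r"(\w+)\s*\(([^)]*)\)\s*(?:const)?\s*;")


def arg_type(arg: str) -> str:
    """Reduce 'Type *name SIP_MACRO' to 'Type*' (assumes every parameter is named)."""
    arg = re.sub(r"\bSIP_\w+\b", "", arg).strip()  # drop annotation macros
    arg = re.sub(r"\s*\b\w+$", "", arg)            # drop the parameter name
    return re.sub(r"\s*([*&])\s*", r"\1", arg).strip()


def typesystem_from_header(header: str, package: str) -> str:
    """Emit a typesystem fragment for the first class found in the header."""
    root = ET.Element("typesystem", package=package)
    class_match = CLASS_RE.search(header)
    if class_match is None:
        return ET.tostring(root, encoding="unicode")
    obj = ET.SubElement(root, "object-type", name=class_match.group(1))
    for enum_name in ENUM_RE.findall(header):
        ET.SubElement(obj, "enum-type", name=enum_name)
    for method, raw_args in METHOD_RE.findall(header):
        args = [a.strip() for a in raw_args.split(",") if a.strip()]
        transferred = [
            i for i, a in enumerate(args, start=1)
            if re.search(r"\bSIP_TRANSFER\b", a)
        ]
        if not transferred:
            continue  # unannotated members are detected by shiboken itself
        signature = f"{method}({','.join(arg_type(a) for a in args)})"
        func = ET.SubElement(obj, "modify-function", signature=signature)
        for index in transferred:
            arg = ET.SubElement(func, "modify-argument", index=str(index))
            # Ownership of the argument moves to C++, as SIP_TRANSFER states.
            ET.SubElement(arg, "define-ownership", {"class": "target", "owner": "c++"})
    return ET.tostring(root, encoding="unicode")


print(typesystem_from_header(HEADER, "qgis.core"))
```

A regex scan like this breaks on templates, default arguments and multi-line declarations, which is one reason a parser-based tool may be the more robust choice.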

Qt has apparently developed a commercial GUI tool for this, based on an open-source example (https://code.qt.io/cgit/pyside/pyside-setup.git/tree/sources/shiboken6/tests/dumpcodemodel). We would recommend testing this tool to see if it can be extended to our needs.

Ownership

https://doc.qt.io/qtforpython/shiboken6/typesystem_ownership.html

The ownership system of qt-for-python is well-documented including "common usage" examples.

It covers scenarios like transferring ownership in both directions between Python and C++. It also
covers concepts like the Qt parent/child relationship.
It even offers some "heuristics" to discover parent/child relationships. How these play out in the
QGIS scenario needs to be checked; very likely some parameters will have to be renamed to parent
and/or be made explicit.
The documentation is encouraging, as is the fact that it has been used to create the Python bindings
for the Qt framework itself.
QGIS already contains very good in-code annotations of memory management thanks to the work
for SIP compatibility.

Both facts, the existing annotations as well as the apparent good support for ownership management
of qt-for-python are encouraging for a smooth transition.
However, due to the fact that the different memory management models of C++ and Python offer a
challenge for any bindings, the risk persists that additional work on the mapping of SIP to the
qt-for-python models needs to be done. It's unlikely that this will be a blocker.

QVariant

QVariant was removed and is transparently converted to the corresponding native type.
Any function expecting it can receive any Python object (None is an invalid QVariant).
The same rule is valid when returning something: the returned QVariant will be converted to its original Python object type.

When a method expects a QVariant::Type the programmer can use a string (the type name) or the type itself.
https://pyside.github.io/docs/pyside/pysideapi2.html#qvariant

NULL

Since QVariant was removed, it might be an opportunity to drop QGIS' NULL in favor of Python's None.
Inside QGIS, invalid and null QVariants are treated equally, so this would be more aligned with
QGIS internal handling and more Pythonic at the same time.

Compatibility layer

One of the goals of PySide6 is to be API compatible with PyQt, with certain exceptions (see https://doc.qt.io/qtforpython/considerations.html).

Therefore it would be easy and useful to propose a compatibility layer, similarly to what we did for PyQt5/6: i.e. qgis.PyQt.QtCore = PySide.QtCore.
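As a minimal sketch of how such a layer could pick its backend, the helper below imports the first available module from an ordered list of candidates. The function name load_first_available and the idea of a qgis/PyQt/QtCore.py shim calling it are assumptions for illustration, not the existing qgis.PyQt implementation.

```python
import importlib
from types import ModuleType
from typing import Sequence


def load_first_available(candidates: Sequence[str]) -> ModuleType:
    """Return the first importable module among the candidates, in order."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append(f"{name}: {exc}")
    raise ImportError("no Qt binding found (" + "; ".join(errors) + ")")


# In a hypothetical qgis/PyQt/QtCore.py shim, the candidate order would decide
# which binding wins, for example:
#   QtCore = load_first_available(["PySide6.QtCore", "PyQt6.QtCore", "PyQt5.QtCore"])
```

Plugins importing from qgis.PyQt would then not need to know which binding is installed, although API differences such as QVariant handling would still have to be addressed separately.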

Missing PySide bindings

Some objects are not (yet?) part of PySide bindings, especially in PySide2.
https://wiki.qt.io/Qt_for_Python_Missing_Bindings

New types get added regularly and in the list of missing bindings, no blocking type could be identified.
The risk of this is low.

Raising exceptions instead of returning tuples

In several places of QGIS' API, we tend to return a tuple with the result and with a boolean for the success (e.g. see QgsUnitTypes.stringToDistanceUnit method).
It would be more Pythonic to raise a Python exception instead, and only return the result.
This would be an API break, but the migration could be the opportunity to make this change.
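The difference between the two styles can be sketched in plain Python. Everything here is illustrative: QgsConversionError, raise_on_failure and string_to_unit_legacy are hypothetical names, and the stand-in function only mimics the (result, ok) shape of methods such as stringToDistanceUnit.

```python
from functools import wraps
from typing import Any, Callable, Tuple


class QgsConversionError(ValueError):
    """Hypothetical exception type; not part of the QGIS API."""


def raise_on_failure(func: Callable[..., Tuple[Any, bool]]) -> Callable[..., Any]:
    """Turn a function returning (result, ok) into one returning result or raising."""

    @wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        result, ok = func(*args, **kwargs)
        if not ok:
            raise QgsConversionError(f"{func.__name__} failed for {args!r}")
        return result

    return wrapper


def string_to_unit_legacy(text: str) -> Tuple[str, bool]:
    """Stand-in for a tuple-returning method such as stringToDistanceUnit."""
    known = {"meters", "kilometers", "feet"}
    unit = text.strip().lower()
    return (unit, True) if unit in known else ("unknown", False)


# Exception-raising variant: callers get the result directly.
string_to_unit = raise_on_failure(string_to_unit_legacy)
```

With the wrapped version, a caller writes unit = string_to_unit("meters") and handles QgsConversionError where needed, instead of unpacking and checking a boolean at every call site.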

pyuic and pyrcc

The tools pyuic and pyrcc are utilities to compile .ui and .qrc files to Python files.

It is possible to load .ui and helper files at runtime and skip the compiling
step altogether.
There are also equivalent tools available for Qt-for-Python (pyside6-uic and pyside6-rcc).

https://doc.qt.io/qt-6/uic.html
https://doc.qt.io/qtforpython/tutorials/basictutorial/qrcfiles.html

QScintilla

QScintilla needs to be ported to PySide. The tool has been developed by Riverbank.
We would need to also write the bindings for it, meaning probably integrating its source code within QGIS to be sure that the bindings and the source are at the same version.

Code injection

Handwritten code is similar in both systems, but not identical.
https://doc.qt.io/qtforpython/shiboken6/typesystem_codeinjection.html
E.g. parameter with type const QString & (CPP) is available as QString * (SIP) and const QString & (PySide2).
This means that handwritten code cannot just be reused and must be revised.

Translations

There is no corresponding tool for pylupdate with Qt for Python (see #163 (comment) or https://bugreports.qt.io/browse/PYSIDE-1552).
The solution would be to either

  • rely on PyQt6
  • add support for Python in the lupdate Qt tool
  • build an ad-hoc tool using gettext which can mimic what lupdate is supposed to do

We think that building an ad-hoc tool is the best approach as we are not mixing solutions and avoid a risky development.
Some tools with similar functionality already exist (https://github.com/danhper/python-i18n).
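To give an idea of the size of such an ad-hoc tool, here is a minimal sketch of the extraction step only, using Python's standard ast module to collect string literals passed to tr(). The function name extract_tr_strings and the sample source are assumptions. Writing the result to a .ts or gettext catalogue, and handling translation contexts, is left out.

```python
import ast
from typing import List, Tuple


def extract_tr_strings(source: str) -> List[Tuple[int, str]]:
    """Return (line number, text) for every tr("literal") call in the source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or not node.args:
            continue
        func = node.func
        # Accept both self.tr("...") and a bare tr("...").
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        first = node.args[0]
        if name == "tr" and isinstance(first, ast.Constant) and isinstance(first.value, str):
            found.append((node.lineno, first.value))
    return sorted(found)


SAMPLE = '''
class Dialog:
    def setup(self):
        self.setWindowTitle(self.tr("Layer properties"))
        label = self.tr("Opacity")
        count = len("not translated")
'''

for lineno, text in extract_tr_strings(SAMPLE):
    print(lineno, text)
```

Because the source is parsed rather than pattern-matched, strings in comments or in unrelated calls are not picked up.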

API Documentation

The generated bindings do not contain API documentation from the C++ API. Using the typesystem generation tool, we need to translate them as we are currently doing for SIP bindings.

Further considerations

Community support

The Gitter channel is quite active and helpful (https://gitter.im/PySide/pyside2).
We have had close support from an R&D Manager at The Qt Company who is responsible for Qt-for-Python development.

Documentation

Documentation is easily accessible and concepts are well explained.

Integration into QGIS code base

As demonstrated during our tests, the code base can live with the two systems in parallel, allowing a continuous integration of Qt-for-Python.

A complete switch to Qt-for-Python needs to be combined with the switch to Qt6 / QGIS 4.
While we could certainly offer a compatibility layer, asking plugin developers to switch at the same time as the move to Qt6 sounds much more reasonable.
Also, the bindings are less complete under Qt5 (for instance QSignalSpy comes with Qt 6.1).

Opportunities and risks to switch or stick to current solution

While writing the QEP, we identified the following reasons to evaluate moving away from PyQt:

  • it's a solution developed by a single person (hitting bus factor)
  • it offers very limited contribution opportunities, poor community experience (no issue tracker, no road-map)
  • we have a history of issues which either took a lot of time/energy to get solved or were never solved
  • Riverbank did not answer positively to 2 requests to work on issues for QGIS.org
  • Qt3D and QtChart are distinct products leading to more solutions to maintain and distribute (mainly for Windows, Android, MacOS and iOS).

Qt-for-Python might offer a better community solution, better integration with Qt and a more future proof solution.

Recent discussions about the future of Qt with regard to open source should also be taken into consideration.
But switching to Qt-for-Python is closely tied to migrating to Qt6. If the path of Qt6 remains, switching to Qt-for-Python should be safe.

The impact on plugin authors is obviously a matter of consideration.
We would recommend using the compatibility layer instead of directly importing PyQt or PySide: qgis.PyQt (or qgis.Qt) would import the proper bindings depending on the environment. This means replacing the imports in the plugin code.
Additional changes will be required in plugins; they depend on the case but are not complicated.
Common examples would be QVariant usages or exceptions instead of tuple return types.
Other changes, such as enums being fully qualified, will be required in any case (with PyQt6 too).
And QGIS 4 will probably bring some small API changes at the same time.

Rough estimation of the work load for a switch

  • Writing the generation tool ~20 days
  • Merge existing core part to generate the bindings ~2 days
  • Make existing headers compliant with both sipify and the new generation tool ~5 days
  • Migrate handwritten code blocks ~8 days
  • Handle QScintilla bindings ~5 days
  • Implement the compatibility layer (NULL/None, QVariant, etc.) ~3 days
  • Tests fixing ~5 days

Total: ~48 days

What's next?

To move on, we recommend the following approach:

  1. The present report is published and feedback is collected (probably on the mailing list or in a Github issue)
  2. PSC calls/nominates a technical committee of 3-6 relevant and interested developers to make a formal technical recommendation and confirm the risk and cost estimates.
  3. PSC validates or rejects the technical recommendation.
  4. If the switch to Qt-for-Python is validated, development should start as soon as possible and shared among several developers.

N.B.: Chances are high that people involved in the committee would also be developers participating in the migration, which is obviously a risk to neutrality. Integrating several developers from different companies should mitigate this risk.

Thank you for the research, I'm just brainstorming here but I was wondering if there is a chance to use libclang [1] to parse the C++ headers and create the XML for the bindings, of course if we don't find anything to get started.

[1] https://shaharmike.com/cpp/libclang/

About QVariant::Type (lack of) I was wondering if we can live without them or not: we have some places in the API where telling (for example) a QVariant::Type::Int from a QVariant::Type::ULongLong is important (e.g. QgsField but there are more), AFAIK there is no way to get these fine-grained types apart in Python.

Fantastic work @3nids, thanks for the in-depth review!

There's a lot of net positives here, but also a lot that greatly concerns me. We'll certainly need to proceed with extreme caution given how central Python is to the whole QGIS ecosystem!

One issue/question I have is this:

If the switch to Qt-for-Python is validated, development should start as soon as possible and shared among several developers.

I don't think we should proceed ASAP on this, regardless of the long-term decision. In my view we are:

  • technically ~1 year from Qt6 builds being a viable option (given Qt upstream is still catching up and adding missing functionality, the lack of Debian and other distro packages, the work involved in updating the Windows/Mac build environments, etc),
  • still needing ~2 weeks of porting effort just to get QGIS built on Qt6 (including app/gui/server and providers, not including 3d).
  • Then I'd estimate another 3-4 weeks of effort to get all the unit tests passing again on Qt6 builds.

And politically, I think we are at least 2 years away from being able to realistically consider a QGIS 4.0 major API breaking release. We are just too close still to the 3.0 work and I don't think it's a good move to force another breaking update of plugins on our users just yet. (Keeping in mind that I am supportive of a "very soft break" and allowing later 3.x releases to be built using Qt6 - see #198 for further details).

Suffice to say I don't think we should move forward on this yet, but we should plan to include it in the eventual 4.x transition (in >= 2 years time)

there is a chance to use libclang [1] to parse the C++ headers and create the XML for the bindings

The tool listed in the report is indeed based on clang, see https://code.qt.io/cgit/pyside/pyside-setup.git/tree/sources/shiboken6/tests/dumpcodemodel/main.cpp
This is definitely the best approach and should be...cleaner...than sipify.

Another question: will shiboken be able to handle smart pointers nicely?

About QVariant::Type (lack of) I was wondering if we can live without them or not: we have some places in the API where telling (for example) a QVariant::Type::Int from a QVariant::Type::ULongLong is important (e.g. QgsField but there are more), AFAIK there is no way to get these fine-grained types apart in Python.

There is no distinction in Python 3 between int and long int (nor for the QgsField type). Can you point to an example where you think this might be problematic?

@nyalldawson

Timeline sounds reasonable. I guess the decision will already take a few months, and I believe it's always the sooner the better. We can live with the 2 systems in parallel and we can run tests on the CI. That would leave us time and avoid too much pressure (if you consider branching of QGIS 4 would happen in 1 - 1.5 years from now).

But anyway, it would be up to the technical committee to propose a roadmap and timeline.

Yeah, bad example probably: I was trying to draw attention to signed vs unsigned.

There are a few places where these two types behave differently in QGIS. One that comes to mind is the input widgets (for instance QgsDoubleSpinBox when used for integers, but the problem would be the same if we used a specialized integer widget), where the acceptable range is set according to the data type (for instance, negative values are not accepted for unsigned types).

See: https://github.com/qgis/QGIS/pull/45587/files#diff-1cc6eaa68caf98bbc6d05d25535d5109e939f3a87cbcbe4f15708027dcde7413R543

Anyway, the general issue is that some data providers support different integer data types (signed, unsigned, long, short etc.) and we need to change the UI accordingly when accepting and validating input and/or when converting data between different fields.

I don't know at this point if this would be a real issue (QVariant can do the conversions in C++ code if it needs to) but I believe this is something to keep in mind.
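To illustrate what gets lost when everything becomes a plain Python int, here is a small sketch of the ranges a widget would have to enforce. The keys reuse the QVariant::Type names, the ranges assume a 32-bit C++ int, and the helper fits is hypothetical.

```python
# Value ranges of the integer types discussed above, assuming a 32-bit C++ int.
INTEGER_RANGES = {
    "Int": (-2**31, 2**31 - 1),
    "UInt": (0, 2**32 - 1),
    "LongLong": (-2**63, 2**63 - 1),
    "ULongLong": (0, 2**64 - 1),
}


def fits(type_name: str, value: int) -> bool:
    """Tell whether a Python int is representable in the given field type."""
    low, high = INTEGER_RANGES[type_name]
    return low <= value <= high


# A Python int carries no signedness, so the field type must come from elsewhere.
print(fits("Int", -1), fits("ULongLong", -1))
```

The value -1 is valid for Int and invalid for ULongLong, yet both arrive in Python as the same int object, so the type information has to be carried by the field definition rather than by the value.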

@3nids

(if you consider branching of QGIS 4 would happen in 1 - 1.5 years from now).

Well, just to clarify, I said "we are at least 2 years away from being able to realistically consider a QGIS 4.0"... so that was supposed to be a minimum amount of time, not my anticipated timeline of when to start.

Another question: will shiboken be able to handle smart pointers nicely?

I just did some quick research. Shared pointers (QSharedPointer) are there, see https://bugreports.qt.io/browse/PYSIDE-454
This is untested on our side.

Most references to std::shared_ptr found via google lead to experiments, I wouldn't count on that being usable.

Nothing could be found for std::unique_ptr, this topic would require some more research regarding expected behavior (pointer invalidation, move semantics etc.) in combination with python's memory management if we think it's worth it.

@elpaso I believe we won't have much trouble tackling this. I have raised some questions in the PySide matrix room regarding this: https://matrix.to/#/!yJmsruvrWPogyYtiLF:matrix.org/$163535383620096JrROk:matrix.org?via=matrix.org

some further ideas:

  • for QScintilla, the only current usage in QGIS is the console. We could migrate the console to C++ to drop the requirement to have bindings for QScintilla.
  • for NULL/invalid, we propose to use None for invalid and create a QgsUnsetValue for NULLs (maybe a separate work from Nyall)
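The None/QgsUnsetValue split proposed above could look roughly like this in pure Python. This is only a sketch of the idea: the class body, the module-level NULL name and the describe helper are assumptions, and a real implementation would live in the bindings.

```python
class QgsUnsetValue:
    """Hypothetical singleton standing for a NULL (unset) attribute value."""

    _instance = None

    def __new__(cls):
        # Always hand out the same object so that "is NULL" checks work.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __bool__(self) -> bool:
        return False

    def __repr__(self) -> str:
        return "NULL"


NULL = QgsUnsetValue()


def describe(value: object) -> str:
    """Distinguish an invalid value (None) from a NULL one and from real data."""
    if value is None:
        return "invalid"
    if value is NULL:
        return "null"
    return "set"
```

Keeping NULL falsy means existing checks of the form "if not value" would treat None and NULL alike, while code that needs the distinction can test identity.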

to check:

  • is date time variant converted to QDateTime or native Python's date time?
  • ownership issue: what happens when calling QgsGeometry( otherGeom.get() ) (without clone())

I wanted to test PySide and so continued this proposal investigation. I tried to generate the typesystem files using the example provided in the PySide repository.

Everything is here, it's a draft work, full of dirty hacks.

I focused only on qgis core, trying to build a Python wrapper based only on what's in QGIS headers and sip files.

So far, I have built the bindings for a subset of classes, but the tool can actually build the bindings for approximately 90% of the qgis core classes. Regarding the remaining 10%, there are about 20 different types of errors that would require more investigation (some are probably connected or are the same issue).

The typesystem generator is built as a clang parser plugin and uses the shiboken extractor API to retrieve information about classes, methods, enums [...] in order to generate the typesystem file correctly. Some information cannot be parsed with clang (SIP_NO_FILE, SIP_RUN, %MethodCode...) and needs to be extracted before clang parsing (or from the sip files for MethodCode).

I also tried to convert SIP-specific injected code (%MethodCode) to its Shiboken equivalent with a bunch of regular expressions. It's ugly, it builds and it looks like it "works" for the few I have tested. The generator manages to automatically convert about 150 methods out of 220. I think it would be possible to migrate most of them, but some would probably need to be rewritten manually.
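As an illustration of the kind of substitution involved (not the actual generator), the sketch below rewrites two SIP conventions into their Shiboken counterparts: the instance pointer sipCpp becomes %CPPSELF, and the zero-based argument names a0, a1 become the one-based %1, %2. The function name is hypothetical, and result handling (sipRes), error handling and the pointer versus reference differences are not covered.

```python
import re


def sip_to_shiboken(method_code: str) -> str:
    """Apply two illustrative SIP to Shiboken substitutions to injected code."""
    # sipCpp-> is SIP's pointer to the wrapped instance; Shiboken uses %CPPSELF.
    code = re.sub(r"\bsipCpp\s*->\s*", "%CPPSELF.", method_code)
    # SIP names arguments a0, a1, ...; Shiboken uses %1, %2, ...
    code = re.sub(r"\ba(\d+)\b", lambda m: f"%{int(m.group(1)) + 1}", code)
    return code


print(sip_to_shiboken("sipCpp->setValue( a0, a1 );"))
```

Anything beyond such mechanical renames, for example code that builds Python objects by hand, is where manual rewriting is likely to be needed.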

However, I have barely tested the generated bindings; the best would be to run the CI tests to see if they really work.

TL;DR If at some point we choose to switch from PyQt to PySide, I think it could be done. I think it represents around 30 days of work (without mentioning packaging). I don't know if PySide is a better solution than PyQt, but during my experimentation PySide has proven to be well documented and easy to debug and understand, and it has been a better developer experience than SIP.

wow, that's an impressive work!

I continued working on this thanks to a special budget from QGIS.org. I managed to set up a functioning CI and run the tests.

I'm still focusing on qgis core, and (almost) everything is automatically generated from header and SIP files.

Here are few numbers:

  • 1091 classes have been successfully wrapped (86%), 172 still have issues
  • But there are actually 955 shiboken warning messages. They are not all problematic, but some are (related to methods not being wrapped because of missing types, for instance)
  • 161 out of 276 (58%) SIP MethodCode blocks are automatically converted to Shiboken code. It builds and seems to work for the very few I tested.
  • 43 tests out of 351 (12%) are passing

So far, I'm still confident that the migration could be done and didn't find any limitations that would completely discard the use of Qt-for-python for QGIS bindings.

I still have one concern regarding packaging. As I'm writing this message, there are still no Shiboken/PySide packages for Qt6 in either Debian or Fedora, even in bleeding edge distributions (unstable, rawhide), and I don't know if there is a plan to have some soon.

The only way to install it is using pip, but that can lead to issues when mixing the Qt shipped by pip and the Qt from the system (even if the versions match). It works on Debian, but not on Fedora. For now, it's better to manually build qt-for-python. I have no idea what the situation is on Windows.

Thank you @troopa81 great work.

Do you have an idea of the order of magnitude that will be required from plugin authors?

I still have one concern regarding packaging. As I'm writing this message, there are still no Shiboken/PySide packages for Qt6 in either Debian or Fedora, even in bleeding edge distributions (unstable, rawhide), and I don't know if there is a plan to have some soon.

The only way to install it is using pip, but that can lead to issues when mixing the Qt shipped by pip and the Qt from the system (even if the versions match). It works on Debian, but not on Fedora. For now, it's better to manually build qt-for-python. I have no idea what the situation is on Windows.

I am currently doing some tests building with vcpkg (pyqt based), that could help to mitigate things on this front in the near future -- (especially if you don't want to vendor yet another package).

Do you have an idea of the order of magnitude that will be required from plugin authors?

It would depend on what we choose to mock. For instance, in PySide6 QVariant has been completely removed (while it still exists in PyQt6 but requires some adjustment), but we could mock it completely so that it becomes transparent for plugin authors. The same goes for sip.isdeleted.

On the other hand, there are more complicated things to fix. In PyQt6, IIRC, there is no longer any automatic conversion from int to enum when you pass it to a function. This one is maybe not that easy to mock... I don't know yet how it behaves in PySide.

And in any case, you would have to migrate the Qt5 deprecated code, like QRegExp for instance.

I am currently doing some tests building with vcpkg (pyqt based), that could help to mitigate things on this front in the near future -- (especially if you don't want to vendor yet another package).

Vcpkg seems promising and I plan to test it some day. For now, it seems that there is no defined package for either PyQt or PySide, so I imagine that would require us to push one, like in Debian/Fedora...

Do you have an idea of the order of magnitude that will be required from plugin authors?

It would depend on what we choose to mock. For instance, in PySide6 QVariant has been completely removed (while it still exists in PyQt6 but requires some adjustment), but we could mock it completely so that it becomes transparent for plugin authors. The same goes for sip.isdeleted.

On the other hand, there are more complicated things to fix. In PyQt6, IIRC, there is no longer any automatic conversion from int to enum when you pass it to a function. This one is maybe not that easy to mock... I don't know yet how it behaves in PySide.

Interesting, thanks. So it's even clearer -- if done -- we best bundle this migration with the move to Qt6.

And in any case, you would have to migrate the Qt5 deprecated code, like QRegExp for instance.

I am currently doing some tests building with vcpkg (pyqt based), that could help to mitigate things on this front in the near future -- (especially if you don't want to vendor yet another package).

Vcpkg seems promising and I plan to test it some day. For now, it seems that there is no defined package for either PyQt or PySide, so I imagine that would require us to push one, like in Debian/Fedora...

Fyi, there's a draft here: https://github.com/m-kuhn/vcpkg/tree/pyqt5/ports/pyqt5 which can be added to a local overlay

Interesting, thanks. So it's even clearer -- if done -- we best bundle this migration with the move to Qt6.

Yes. Last Qt6 meeting, we talked about having a "QGIS on Qt6 preview non-official-3.XX-version" that could first rely on PyQt6 and then, if we choose to move, to Qt-for-python BEFORE actually releasing QGIS4 on Qt6.