qgis / QGIS-Enhancement-Proposals

QEPs (QGIS Enhancement Proposals) are used in the process of creating and discussing new enhancements for QGIS.

Use QgsExpression to set a compiled subset string

QGIS Enhancement: Use QgsExpression to set a compiled subset string

Date 2022/03/30

Author Alessandro Pasotti (@elpaso)

Contact elpaso at itopen dot it

Maintainer @elpaso

Version QGIS 3.26

Summary

Currently, a provider filter (aka subset string) can be set directly using the SQL editor.

It would be desirable to store an expression (a QgsExpression) in a QgsVectorLayer and pre-process it through the provider's QgsSqlExpressionCompiler, so that a subset string with variable substitutions can be created before the layer is created.

The use case is a "template" project with many layers that need to be filtered using variables, for example a project with a @region variable that automatically constructs all layer filters from the value of @region.

One possible approach would be to add full support for QgsExpression directly in the providers. This would be problematic, because only a limited set of QgsExpression functions and tokens can actually be compiled into provider filters, and performance could be very poor whenever compilation returns Partial or Fail and the filtering has to happen on the client.

We propose a simpler approach instead: the provider code is not changed, and the filter expression is pre-processed in client code (QgsMapLayer), which sets the subset string only if the expression can be fully compiled into a provider-native filter.
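The all-or-nothing rule can be illustrated with a small pure-Python model. This is only a sketch and does not use the QGIS API: the tuple-based expression nodes, `compile_node`, `ToyLayer` and `EXACT_OPERATORS` are hypothetical names invented for the illustration, and string quoting is deliberately naive.

```python
from enum import Enum


class Result(Enum):
    """Toy stand-in for the compilation outcomes named in this proposal."""
    COMPLETE = "Complete"
    PARTIAL = "Partial"  # never produced by this toy compiler
    FAIL = "Fail"


# Operators this hypothetical provider can translate exactly.
EXACT_OPERATORS = {"=", "<", ">", "<=", ">="}


def compile_node(node, variables):
    """Compile a tuple-based expression node into (Result, sql)."""
    kind = node[0]
    if kind == "field":
        return Result.COMPLETE, '"{}"'.format(node[1])
    if kind == "literal":
        return Result.COMPLETE, repr(node[1])
    if kind == "var":
        # A variable is only compilable when the context can resolve it.
        if node[1] not in variables:
            return Result.FAIL, ""
        return Result.COMPLETE, repr(variables[node[1]])
    if kind == "op" and node[1] in EXACT_OPERATORS:
        left_result, left_sql = compile_node(node[2], variables)
        right_result, right_sql = compile_node(node[3], variables)
        if left_result is Result.COMPLETE and right_result is Result.COMPLETE:
            return Result.COMPLETE, "({} {} {})".format(left_sql, node[1], right_sql)
    # Functions, unknown operators and failed children are not compiled.
    return Result.FAIL, ""


class ToyLayer:
    """Hypothetical layer that only accepts fully compiled filters."""

    def __init__(self):
        self.subset_string = ""

    def set_subset_expression(self, node, variables):
        result, sql = compile_node(node, variables)
        if result is not Result.COMPLETE:
            return False  # the existing subset string is left untouched
        self.subset_string = sql
        return True
```

In this model an expression using an unsupported function, or an unresolved variable, leaves the layer's current filter in place and reports failure to the caller.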

When a referenced variable changes, the application will take care of re-compiling the filter expression and, after a successful compilation, setting the subset string again.

The functionality will be exposed in the GUI by adding an expression editor to the subset string widget, which optionally lets the user set the expression used to generate the subset string.

Proposed Solution

The solution requires QgsSqlExpressionCompiler to be able to evaluate variables (the var() function). To do that, we will pass a QgsExpressionContext to the compile methods. Note that this would be desirable in any case, even if the rest of this proposal is not implemented.

    /**
     * Compiles an expression node and returns the result of the compilation.
     * \param node expression node to compile
     * \param str string representing compiled node should be stored in this parameter
     * \returns result of node compilation
     * \deprecated since QGIS 3.26, use the version with QgsExpressionContext instead.
     */
    Q_DECL_DEPRECATED virtual Result compileNode( const QgsExpressionNode *node, QString &str );

    /**
     * Compiles an expression node and returns the result of the compilation.
     * \param node expression node to compile
     * \param context expression context
     * \param result string representing compiled node should be stored in this parameter
     * \returns result of node compilation
     * \since QGIS 3.26
     */
    virtual Result compileNode( const QgsExpressionNode *node, const QgsExpressionContext *context, QString &result );

In QgsDataProvider we will add a method to compile an expression:

    /**
     * Compile an expression using provider's SQL compiler capabilities.
     * Must be implemented in the data provider; the default implementation returns Fail.
     * \param expression The expression text.
     * \param context The expression context.
     * \param result It will contain a (possibly empty) compiled subset string.
     * \returns the QgsSqlExpressionCompiler::Result of the compilation.
     * \note not available in Python bindings
     * \since QGIS 3.26
     */
    virtual QgsSqlExpressionCompiler::Result compileExpression( const QString &expression, const QgsExpressionContext *context, QString &result ) const SIP_SKIP;

In QgsVectorLayer we will add a method to set the subset string from an expression:

    /**
     * Sets the expression used to define a subset of the layer. If the expression can be compiled
     * successfully, setSubsetString() is called with the result of the compiled expression.
     * \param expression The expression that will be compiled and set as a subset string.
     * \param context The expression context.
     * \returns TRUE, when compiling the expression and setting the subset string was successful, FALSE otherwise
     * \since QGIS 3.26
     */
    virtual bool setSubsetExpression( const QString &expression, const QgsExpressionContext &context );

The rest of the logic will live in app or gui and will reset the subset string of any layer whose expression-based subset string references a changed variable.
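The dependency check behind that refresh logic could look roughly like the sketch below. It is a pure-Python illustration, not QGIS code: `referenced_variables`, `layers_to_refresh` and the layer names are hypothetical, and the regular expression is a simplification that would also match an `@name` appearing inside a string literal.

```python
import re

# Matches @variable references; a simplification of the real expression grammar.
VARIABLE_PATTERN = re.compile(r"@([A-Za-z_][A-Za-z0-9_]*)")


def referenced_variables(expression):
    """Return the set of variable names used in an expression string."""
    return set(VARIABLE_PATTERN.findall(expression))


def layers_to_refresh(layer_expressions, changed_variables):
    """Return the names of layers whose stored expression uses a changed variable."""
    changed = set(changed_variables)
    return sorted(
        name
        for name, expression in layer_expressions.items()
        if referenced_variables(expression) & changed
    )


# Hypothetical template project: layer name -> stored filter expression.
layer_expressions = {
    "roads": '"region" = @region',
    "rivers": '"length" > @min_length',
    "towns": '"pop" > 1000',
}
```

Only the layers returned by `layers_to_refresh` would need their expression re-compiled and their subset string set again; layers without variable references are left alone.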

Example(s)

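        # Assumes a test case where self.vl is a vector layer with a "cnt" field.
        context = QgsExpressionContext()
        context.appendScope(QgsExpressionContextScope())
        context.lastScope().setVariable('myvariable', 100)
        self.assertTrue(self.vl.setSubsetExpression("cnt > @myvariable", context))
        self.assertEqual(self.vl.subsetString(), '("cnt" > 100)')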

Affected Files

Data providers and QgsMapLayer.

Performance Implications

None

Backwards Compatibility

None

Votes

(required)

I have strong reservations about this proposed approach, to the extent that I'm -1 to implementing in this way.

Here's a summary of my concerns:

First, keep in mind these constraints:

  • The expression compilers are designed entirely around the principle of building a filter that returns the smallest possible set of features from the provider while still being guaranteed to include every feature that matches the filter. In other words, the converted SQL will always err on the side of including features, even at the expense of much better filtering that carries some uncertainty in corner cases. For example, we don't compile a function like transform(...) to ST_Transform on Postgres, because we can't be 100% sure that ST_Transform will use exactly the same coordinate operation as the QGIS transform function. Likewise, $area isn't compiled to ST_Area(...) (despite this likely being a huge optimisation in many situations), because the QGIS-calculated $area (ellipsoidal area using the project-defined ellipsoid settings) won't exactly match the result of ST_Area (planar area calculation). So by design the expression compilers are NOT giving us the closest match to the QGIS expression, but rather a conservative, worst-case conversion.
  • The SQL compilation is often very basic -- e.g. in the case of a base OGR layer, we are restricted to converting only very simple =/</>/<=/>= type comparisons.
  • Even for some supposedly simple comparison expressions like "field" = 'value', we don't (and can't) compile this directly for some providers (those that use case-insensitive comparisons on the backend, like SQL Server). If we compile "field" = 'some string' for SQL Server we only get back a Partial compilation result, and we HAVE to filter all the results on the QGIS side too, to ensure that the comparisons are case sensitive.
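To make the Partial case concrete, here is a minimal pure-Python sketch of a backend that compares strings case-insensitively. The functions and sample rows are hypothetical and only model the behaviour described in the last constraint; they are not provider code.

```python
def provider_filter(rows, value):
    """Toy backend that compares strings case-insensitively."""
    return [row for row in rows if row["field"].lower() == value.lower()]


def client_filter(rows, value):
    """Exact, case-sensitive comparison as evaluated on the client side."""
    return [row for row in rows if row["field"] == value]


rows = [
    {"id": 1, "field": "Value"},
    {"id": 2, "field": "value"},
    {"id": 3, "field": "other"},
]

# The backend returns a superset of the wanted rows ...
candidates = provider_filter(rows, "value")
# ... so a Partial result still needs the client-side pass.
matches = client_filter(candidates, "value")
```

Here the backend returns rows 1 and 2, and only the second pass narrows the result to row 2. A plain subset string has no such second pass, which is why a Partial result cannot be used as one.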

So, with these constraints in mind....

I really can't see how we can get a good user experience with the proposed approach. As you've (correctly) stated, we can apply the QGIS expression as a provider filter "only if the expression can be fully compiled into a provider native filter". This will only be true for a limited, and very unpredictable, set of filter strings. Taking the SQL Server example, a user will try to set a supposedly very simple filter of "field" = @some_variable, and this WON'T be accepted. The user will have no real way of knowing why this (simple) filter isn't accepted, and will likely waste a bunch of time trying to modify their expression before ultimately giving up in frustration.

Even if the user knew enough about QGIS internals and the backend provider to realise that the filter can't be used because of the difference between QGIS and SQL Server in handling case sensitivity (and that's a HUGE ask), that same expert user will likely be completely stumped on why a $area > @some_variable filter fails on Postgres, thinking that "Postgres has ST_Area, so QGIS will be converting $area to ST_Area and it should work".

So let's say we try to address this issue by adding some detailed "explanation of compilation failure" handling to the expression compilation. Then we could show the user an error like "This QGIS expression can't be used as a filter string because SQL Server doesn't support case-sensitive string comparisons". That's better, but still ultimately a frustration/dead end for the user... they'll leave thinking "well if it can't even support "field"='value', what's the point of this feature?".

In summary -- given that this approach will be usable in a very limited set of circumstances, I don't think it's appropriate for inclusion in QGIS core. The same approach could theoretically be exposed through a plugin which manages the expression compilation and updates when project/global variables change, so I'd say this is a better fit for a custom plugin instead.

@nyalldawson thanks for the feedback.

Are you also -1 on implementing var function compilation in the SQL compilers? I don't see a problem there.

> Are you also -1 on implementing var function compilation in the SQL compilers?

You possibly don't need to do that -- if you prepare the expression first using the context, it will automatically replace any var nodes it can resolve with static values, and those values are then used during the compilation. See qgis/QGIS#41494
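The idea of resolving variables in a preparation pass, before any compiler sees the expression, can be sketched in pure Python as follows. This is a toy model of the behaviour described above, not the QGIS implementation: the tuple-based nodes and `fold_variables` are hypothetical names chosen for the illustration.

```python
def fold_variables(node, variables):
    """Replace resolvable ("var", name) nodes with ("literal", value) nodes."""
    kind = node[0]
    if kind == "var" and node[1] in variables:
        return ("literal", variables[node[1]])
    if kind == "op":
        return (
            "op",
            node[1],
            fold_variables(node[2], variables),
            fold_variables(node[3], variables),
        )
    return node  # fields, literals and unresolved variables are left as-is


# "cnt" > @myvariable, written as a tiny expression tree.
expression = ("op", ">", ("field", "cnt"), ("var", "myvariable"))
prepared = fold_variables(expression, {"myvariable": 100})
```

After this pass a compiler only has to handle fields, literals and operators, so it needs no knowledge of variables or of the context that supplied them.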