False positive on docstring

matthiasschaub opened this issue 2 years ago · comments

Matthias (~talfus-laddus) commented 2 years ago

Running flake8 on a file containing the following code raises the errors listed below:

def test():
    """
    Request osmcha analysis for changeset ids and update edits table with flags.

    All cs ids are passed as a parameter to the api request (in case of a large
    number of changesets,they will be split into multiple lists, i.e. this will
    result in multiple requests).
    The flags of each changeset are aggregated and uploaded into the db.

    To see possible values of reasons look at osmcha_reasons.md.
    :param settings:
    :return:
    """
    pass

test.py:3:80: E501 line too long (80 > 79 characters)
test.py:13:5: Q441 name Request is not valid, must be snake_case, and cannot end with `_`
test.py:13:5: Q440 keyword for is not uppercase
test.py:13:5: Q440 keyword and is not uppercase
test.py:13:5: Q440 keyword update is not uppercase
test.py:13:5: Q440 keyword table is not uppercase
test.py:13:5: Q440 keyword with is not uppercase
test.py:13:5: Q440 keyword All is not uppercase
test.py:13:5: Q440 keyword are is not uppercase
test.py:13:5: Q440 keyword as is not uppercase
test.py:13:5: Q440 keyword parameter is not uppercase
test.py:13:5: Q440 keyword to is not uppercase
test.py:13:5: Q440 keyword case is not uppercase
test.py:13:5: Q440 keyword of is not uppercase
test.py:13:5: Q440 keyword large is not uppercase
test.py:13:5: Q440 keyword of is not uppercase
test.py:13:5: Q440 keyword split is not uppercase
test.py:13:5: Q440 keyword into is not uppercase
test.py:13:5: Q440 keyword lists is not uppercase
test.py:13:5: Q440 keyword result is not uppercase
test.py:13:5: Q441 name The is not valid, must be snake_case, and cannot end with `_`
test.py:13:5: Q440 keyword of is not uppercase
test.py:13:5: Q440 keyword each is not uppercase
test.py:13:5: Q440 keyword are is not uppercase
test.py:13:5: Q440 keyword and is not uppercase
test.py:13:5: Q440 keyword into is not uppercase
test.py:13:5: Q440 keyword To is not uppercase
test.py:13:5: Q440 keyword values is not uppercase
test.py:13:5: Q440 keyword of is not uppercase
test.py:13:5: Q440 keyword at is not uppercase
test.py:13:5: Q444 incorrect whitespace around equals
test.py:13:5: Q443 incorrect whitespace around comma
test.py:13:5: Q445 missing linespace between root_keywords and and update
test.py:13:5: Q449 token All should be aligned to the right of the river
test.py:13:5: Q449 token cs should be aligned to the right of the river
test.py:13:5: Q449 token ids should be aligned to the right of the river
test.py:13:5: Q449 token are should be aligned to the right of the river
test.py:13:5: Q449 token passed should be aligned to the right of the river
test.py:13:5: Q449 token as should be aligned to the right of the river
test.py:13:5: Q449 token a should be aligned to the right of the river
test.py:13:5: Q449 token parameter should be aligned to the right of the river
test.py:13:5: Q449 token to should be aligned to the right of the river
test.py:13:5: Q449 token the should be aligned to the right of the river
test.py:13:5: Q449 token api should be aligned to the right of the river
test.py:13:5: Q449 token number should be aligned to the right of the river
test.py:13:5: Q449 token of should be aligned to the right of the river
test.py:13:5: Q449 token changesets should be aligned to the right of the river
test.py:13:5: Q449 token , should be aligned to the right of the river
test.py:13:5: Q449 token they should be aligned to the right of the river
test.py:13:5: Q449 token will should be aligned to the right of the river
test.py:13:5: Q449 token be should be aligned to the right of the river
test.py:13:5: Q449 token split should be aligned to the right of the river
test.py:13:5: Q447 root_keywords and and into are not right aligned
test.py:13:5: Q449 token multiple should be aligned to the right of the river
test.py:13:5: Q449 token result should be aligned to the right of the river
test.py:13:5: Q449 token in should be aligned to the right of the river
test.py:13:5: Q449 token multiple should be aligned to the right of the river
test.py:13:5: Q449 token requests should be aligned to the right of the river
test.py:13:5: Q449 token ) should be aligned to the right of the river
test.py:13:5: Q449 token . should be aligned to the right of the river
test.py:13:5: Q449 token The should be aligned to the right of the river
test.py:13:5: Q449 token flags should be aligned to the right of the river
test.py:13:5: Q449 token of should be aligned to the right of the river
test.py:13:5: Q449 token each should be aligned to the right of the river
test.py:13:5: Q449 token changeset should be aligned to the right of the river
test.py:13:5: Q449 token are should be aligned to the right of the river
test.py:13:5: Q449 token aggregated should be aligned to the right of the river
test.py:13:5: Q447 root_keywords and and and are not right aligned
test.py:13:5: Q447 root_keywords and and into are not right aligned
test.py:13:5: Q449 token To should be aligned to the right of the river
test.py:13:5: Q449 token see should be aligned to the right of the river
test.py:13:5: Q449 token possible should be aligned to the right of the river
test.py:13:5: Q447 root_keywords and and values are not right aligned
test.py:13:5: Q449 token of should be aligned to the right of the river
test.py:13:5: Q449 token reasons should be aligned to the right of the river
test.py:13:5: Q449 token look should be aligned to the right of the river
test.py:13:5: Q449 token at should be aligned to the right of the river
test.py:13:5: Q449 token osmcha_reasons should be aligned to the right of the river
test.py:13:5: Q449 token :param should be aligned to the right of the river
test.py:13:5: Q449 token settings should be aligned to the right of the river
test.py:13:5: Q449 token : should be aligned to the right of the river
test.py:13:5: Q449 token :return should be aligned to the right of the river
test.py:13:5: Q449 token : should be aligned to the right of the river

Python version and installed dependencies (venv):

python = "3.10"
flake8 = "4.0.1"
flake8-SQL = "0.4.1"

Moritz Schott · Answer 1 · Thu Mar 17 2022 22:50:28 GMT+0800 (China Standard Time)

For context, the following works:

def test():
    """
    Request osmcha analysis for changeset ids and update edits table with flags.

    All cs ids are passed as a parameter to the api request (in case of a large
    number of changesets,they will be split into multiple lists, i.e. this will
    result in multiple requests).
    The flags of each cs are aggregated and uploaded into the db.

    To see possible values of reasons look at osmcha_reasons.md.
    :param settings:
    :return:
    """
    pass

The only difference is in the line "The flags of each changeset are aggregated...", where "changeset" has been changed to "cs".

Christian Riedel · Answer 2 · Sat May 21 2022 06:48:24 GMT+0800 (China Standard Time)

This is also a false positive:

def update_file_list(self) -> None:
    """Update file path list with paths specified on initialization.
     Clear the current file list. Then get the file and directory paths specified with
    :py:attr:`self.check_paths` attribute set on initialization and search them for rst files
    to check. Add those files to the file list.
    """

src/rstcheck/runner.py:66:9: Q440 keyword Update is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword file is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword path is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword with is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword on is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword current is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword file is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword Then is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword get is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword file is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword and is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword directory is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword with is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword set is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword on is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword and is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword search is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword for is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword to is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword Add is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword to is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q440 keyword file is not uppercase [flake8-sql]
src/rstcheck/runner.py:66:9: Q441 name Clear is not valid, must be snake_case, and cannot end with `_` [flake8-sql]
src/rstcheck/runner.py:66:9: Q447 root_keywords Update and and are not right aligned [flake8-sql]
src/rstcheck/runner.py:66:9: Q447 root_keywords Update and set are not right aligned [flake8-sql]
src/rstcheck/runner.py:66:9: Q447 root_keywords Update and and are not right aligned [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token Clear should be aligned to the right of the river [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token the should be aligned to the right of the river [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token :py should be aligned to the right of the river [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token : should be aligned to the right of the river [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token attr should be aligned to the right of the river [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token : should be aligned to the right of the river [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token `self.check_paths` should be aligned to the right of the river [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token to should be aligned to the right of the river [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token check should be aligned to the right of the river [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token . should be aligned to the right of the river [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token Add should be aligned to the right of the river [flake8-sql]
src/rstcheck/runner.py:66:9: Q449 token those should be aligned to the right of the river [flake8-sql]

John Pocock · Answer 3 · Fri Jul 08 2022 22:01:31 GMT+0800 (China Standard Time)

I have also encountered this issue. It is due to the regex used to detect query strings. Here are examples of false positive matches with the above docstrings: https://regex101.com/r/gYdthu/2 (see unit tests on the left)

John Pocock · Answer 4 · Fri Jul 08 2022 22:47:47 GMT+0800 (China Standard Time)

Some ways I can think of to avoid false positives like these for docstrings are:

- Use a proper parser (e.g. with pyparsing) to check if a string is a valid SQL statement.
- Use a hideously complex regular expression to check for comma separated values between SQL keywords etc. with unit tests to check cases.
- Check that a string AST node is an argument to a function or assigned to a variable.
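
The third option could look roughly like the sketch below. It uses only the standard-library ast module and is not taken from flake8-SQL; the function name candidate_query_strings and the sample source are hypothetical. It collects string literals only when they are passed to a call or assigned to a name, so a docstring (a bare expression statement) is never considered.

```python
import ast


def candidate_query_strings(source: str) -> list[str]:
    """Return string literals that are call arguments or assigned values.

    Hypothetical helper: docstrings are bare expression statements, so they
    are neither call arguments nor assignment values and are skipped.
    """
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            values = list(node.args) + [kw.value for kw in node.keywords]
        elif isinstance(node, ast.Assign):
            values = [node.value]
        elif isinstance(node, ast.AnnAssign) and node.value is not None:
            values = [node.value]
        else:
            continue
        for value in values:
            # Only plain string constants; f-strings and concatenations
            # would need extra handling.
            if isinstance(value, ast.Constant) and isinstance(value.value, str):
                found.append(value.value)
    return found


# Sample input (only parsed, never executed).
SOURCE = '''
def test():
    """Update edits table with flags and set x = 1."""
    query = "SELECT id FROM edits"
    cursor.execute("DELETE FROM edits WHERE id = 1")
'''

if __name__ == "__main__":
    for text in candidate_query_strings(SOURCE):
        print(text)
```

Only the strings this returns would then be passed to the SQL checks. The trade-off is that queries built in other ways (f-strings, concatenation, return values) are missed unless those node types are handled too.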

One simple regex change that could mitigate the problem in the meantime is to require that the string start with optional whitespace followed by an SQL keyword, e.g. ^\s*insert.... Docstrings would still have to avoid starting with 'insert', 'update', etc.

A minimally less complex regex which avoids some common false positives:

^\s*(select\s.*from\s|delete\s+from\s|insert\s+into\s.*values\s|update\s.*set\s[^=]+=)
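
As a quick check, the regex above can be run against the docstrings from this thread and a few real queries. This is a minimal sketch: the flags (re.IGNORECASE and re.DOTALL) and the helper name looks_like_sql are assumptions, not the plugin's actual implementation, and the docstrings are abbreviated.

```python
import re

# Pattern copied from the comment above; the flags are an assumption.
QUERY_RE = re.compile(
    r"^\s*(select\s.*from\s|delete\s+from\s|insert\s+into\s.*values\s"
    r"|update\s.*set\s[^=]+=)",
    re.IGNORECASE | re.DOTALL,
)


def looks_like_sql(text):
    """Return True if the text starts like one of the four statement types."""
    return QUERY_RE.match(text) is not None


# Abbreviated versions of the docstrings reported in this thread.
DOCSTRINGS = [
    "\n    Request osmcha analysis for changeset ids and update edits table "
    "with flags.\n\n    All cs ids are passed as a parameter to the api request\n",
    "Update file path list with paths specified on initialization.\n"
    "     Clear the current file list. Then get the file and directory paths\n",
]

QUERIES = [
    "SELECT id FROM edits WHERE flag = 1",
    "DELETE FROM edits WHERE id = 1",
    "INSERT INTO edits (id) VALUES (1)",
    "UPDATE edits SET flag = 1 WHERE id = 2",
]

if __name__ == "__main__":
    for text in DOCSTRINGS + QUERIES:
        print(looks_like_sql(text), repr(text[:40]))
```

Neither docstring matches, while all four queries do. A docstring such as "Update the settings and set debug = True." would still match, so this reduces false positives rather than eliminating them.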