astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.

Home Page:https://docs.astral.sh/ruff

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[FALSE NEGATIVE] "S608 Possible SQL injection"

mpolyakov-plutoflume opened this issue · comments

Hello.
I'm very exited to migrate our codebase to ruff. However, while doing so I've noticed, that rule S608 works different from the corresponding B608.
It only triggers if SELECT is on the same line.
So SELECT * FROM {foo}.table is an error and

SELECT *
FROM {foo}.table

Is not.

Kind regards, Mikhail

  • B608
  • S608
  • sql injection

I just tried and the rule works as expected for triple-quoted f-strings. Any chance that you're using a line continuation token?

https://play.ruff.rs/1c1c9715-2f69-48e1-9e31-f2edcef405ac

Than you for a quick reply.
I've added a couple of examples to https://play.ruff.rs/d3adeb3f-9657-4314-8d99-a45d426d6674

Basically, it does not work, when SELECT is on a new line or SELECT and FROM are on different lines.

The rule seems to be sensitive to certain indentation. Here are a few examples where the rule is not triggered but it should be:

def sql():
    foo = "foo"

    f"""
SELECT {foo}
    FROM bar
    """
def sql():
    foo = "foo"

    f"""
    SELECT
        {foo}
    FROM bar
    """
def sql():
    foo = "foo"

    f"""
    SELECT {foo}
    FROM
        bar
    """

The rule seems to be sensitive to certain indentation.

Yeah, I think the regex is not considering the case where there could be "one or more whitespace characters":

static SQL_REGEX: Lazy<Regex> = Lazy::new(|| {
Regex::new(r"(?i)\b(select\s.+\sfrom\s|delete\s+from\s|(insert|replace)\s.+\svalues\s|update\s.+\sset\s)")
.unwrap()
});

So, the \s should be \s+ instead.

https://regex101.com/r/UPYDXC/2

@dhruvmanila do you plan to PR or should we mark this as a good first issue?

@dhruvmanila Shall I pick this issue?

@DataEnggNerd For sure! Thanks

Need help here.
I have added \s+ before from but it is failing with examples given at #12044 (comment) during compilation, but being detected at regex101.

@MichaReiser I need a clarity here. As per my understanding based on the documentation here, every SQL statement which have string formatting should be reported with S608. Did I get it right?

@DataEnggNerd Yes, the rule ideally flags all SQL statements with as few false positives as possible.

The change mentioned in this issue is that we should re-visit Ruff's regex to allow optional whitespace between the different keywords. It's a non-goal to change how the rule detects SQL expressions significantly.

As reference, the Regex used by bandit

https://github.com/PyCQA/bandit/blob/4ac55dfaaacf2083497831294b2dcd7e679f8428/bandit/plugins/injection_sql.py#L74-L80

@MichaReiser Yes, I have started with modifying the regex first and faced some problem with the regex. @dhruvmanila mentioned there is something missing beyond the regex(you can refer closer PR linked to the issue), so i started looking into this. Apologies it is making more time to implement at code.

@DataEnggNerd

Playing with it in Regex101, the following should work: (?i)\b(select\s+.*\s+from) https://regex101.com/r/T8ojD0/1

@MichaReiser With the changes you suggested, along with the regex from bandit, unfortunately both are not helping.
I have added details in the draft PR - #13445?
Help me out if I am missing something.