Alphanumeric parameter only works with numbers
kriswuollett opened this issue · comments
What happened?
I tried using the playground to test an alphanumeric parameter such as $employee_id, but it does not keep the string unchanged in the generated SQL. Snippet from the docs saying that it should work:
prql/web/book/src/reference/syntax/parameters.md
Lines 3 to 9 in c50f5a7
The use case is generating a query for rusqlite, so I believe there is also a database-library dimension when it comes to parameters: it is not necessarily relevant whether the database dialect itself supports a given parameter type.
PRQL input
prql target:sql.sqlite
from employees
filter id == $employee_id
SQL output
SELECT
  *
FROM
  employees
WHERE
  id = $ employee_id
-- Generated by PRQL compiler version:0.9.1 (https://prql-lang.org)
Expected SQL output
SELECT
  *
FROM
  employees
WHERE
  id = $employee_id
-- Generated by PRQL compiler version:0.9.1 (https://prql-lang.org)
MVCE confirmation
- Minimal example
- New issue
Anything else?
No response
Caused by the SQL formatter. I don't think there is a non-hacky way to fix it, other than opening a PR to the upstream library.
@aljazerzen, I don't know if you would consider this hacky, but I think it may be possible to take advantage of the library's substitution parameters:
/// Formats whitespace in a SQL string to make it easier to read.
/// Optionally replaces parameter placeholders with `params`.
pub fn format(query: &str, params: &QueryParams, options: FormatOptions) -> String {
    let tokens = tokenizer::tokenize(query);
    formatter::format(&tokens, params, options)
}
I forked their repo to see if I could fix it, but I think fixing it actually relates to the recognition of those substitution parameters and non-substitution parameters like :names. Basically, shouldn't it be an error if different variable types are mixed in the same query, in both your library and theirs?
I think it may be possible to output a SQL template to be formatted after substitution, rather than assuming the input SQL is already final. To try it out, insert this test here:
#[test]
fn it_recognizes_question_numbered_placeholders_with_param_values_demo() {
    let input = "SELECT * FROM things WHERE id = $1 LIMIT $2;";
    let params = vec![":id".to_string(), ":count".to_string()];
    let options = FormatOptions::default();
    let expected = indoc!(
        "
        SELECT
          *
        FROM
          things
        WHERE
          id = :id
        LIMIT
          :count;"
    );
    assert_eq!(
        format(input, &QueryParams::Indexed(params), options),
        expected
    );
}

#[test]
fn it_recognizes_question_numbered_placeholders_with_param_values_demo_2() {
    let input = "SELECT * FROM things WHERE id = $1 LIMIT $2;";
    let params = vec!["$id".to_string(), "$count".to_string()];
    let options = FormatOptions::default();
    let expected = indoc!(
        "
        SELECT
          *
        FROM
          things
        WHERE
          id = $id
        LIMIT
          $count;"
    );
    assert_eq!(
        format(input, &QueryParams::Indexed(params), options),
        expected
    );
}
Perhaps this could even open up the opportunity for the selected SQL dialect to map to an appropriate placeholder type if things like alphanumeric parameters aren't supported, and to add code comments documenting which names the numbered parameters map to.
This is great, I hadn't seen this.
So maybe this could be as simple as something which searches the pre-format string for \$\w+, replaces it with a $8001 (or whatever), and adds the variable to the query param map?
(ref #1284, which is a much harder problem, because we need to keep the s-string contents separately before replacing it in)
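For illustration, here is a minimal sketch of that text replacement, assuming a hand-rolled scanner in place of a regex crate. The helper name `extract_named_params` is hypothetical and not part of prql or sqlformat-rs. The returned vector is meant to be the kind of list that gets passed as `QueryParams::Indexed`. Note that it deliberately ignores SQL string literals, so it would also rewrite a `$name` that appears inside quotes.

```rust
/// Naively replaces every `$name` token in `sql` with a positional `$n`
/// placeholder. Returns the rewritten SQL plus the original tokens in
/// order of first appearance.
///
/// Hypothetical helper for illustration: it does not skip string literals.
fn extract_named_params(sql: &str) -> (String, Vec<String>) {
    let mut out = String::with_capacity(sql.len());
    let mut params: Vec<String> = Vec::new();
    let mut chars = sql.chars().peekable();

    while let Some(c) = chars.next() {
        if c != '$' {
            out.push(c);
            continue;
        }
        // Collect the `\w+` part that follows the dollar sign.
        let mut name = String::new();
        while let Some(&next) = chars.peek() {
            if next.is_alphanumeric() || next == '_' {
                name.push(next);
                chars.next();
            } else {
                break;
            }
        }
        if name.is_empty() {
            // A lone `$` is not a parameter; keep it as-is.
            out.push('$');
            continue;
        }
        let token = format!("${name}");
        // Reuse the index if the same parameter appears more than once.
        let index = match params.iter().position(|p| *p == token) {
            Some(i) => i + 1,
            None => {
                params.push(token);
                params.len()
            }
        };
        out.push('$');
        out.push_str(&index.to_string());
    }
    (out, params)
}
```

Calling it on `SELECT * FROM employees WHERE id = $employee_id` gives `SELECT * FROM employees WHERE id = $1` together with a one-element vector holding `$employee_id`.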
Yes, that is what I was guessing... a \$\w+ token is / could be a logical variable in prql, and the database/API-dependent variable type gets passed through to sqlformat-rs as the value of a $n indexed substitution parameter, so it appears with the expected physical rendering in the output SQL text. I see the entrypoint here:
prql/crates/prql-compiler/src/sql/mod.rs
Lines 28 to 39 in 57946ce
for sqlformat-rs, with default() meaning no params are used yet:
#[derive(Debug, Clone)]
pub enum QueryParams {
    Named(Vec<(String, String)>),
    Indexed(Vec<String>),
    None,
}

impl Default for QueryParams {
    fn default() -> Self {
        QueryParams::None
    }
}
Yes. I think we could start with a text replacement.
A fuller approach (one that could lead to #1284) would be to take it from the AST (if that's what you meant by "logical variable"...) at
prql/crates/prql-compiler/src/sql/mod.rs
Line 24 in 57946ce
Ohh, this is interesting and could work!
Text replacement was the hacky thing that I wanted to avoid, because it would fail when formatting this:
SELECT '$1' as normal_string, $1 as paraml;
But if we use the params, they should be parsed correctly. If I understand correctly, this is what we could do:
- find all params in RQ AST and replace them with positional params,
- compile the AST to SQL,
- format and pass the original params to be substituted back in.
The nice part is that we can also extract s-strings, guaranteeing that they are not formatted!
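A rough sketch of why substituting at the token level avoids the '$1' problem. It assumes a hypothetical `substitute_indexed` helper that only understands single-quoted literals. This is not how sqlformat-rs is implemented, just a stand-in to show the idea: a placeholder inside a string literal is left alone, while the same text outside a literal is replaced by the original param.

```rust
/// Substitutes positional `$n` placeholders with `params[n - 1]`,
/// leaving anything inside single-quoted SQL string literals untouched.
/// Hypothetical stand-in for what a token-aware formatter would do.
fn substitute_indexed(sql: &str, params: &[String]) -> String {
    let mut out = String::with_capacity(sql.len());
    let mut in_string = false;
    let mut chars = sql.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '\'' {
            // Toggle on every quote; a doubled quote ('') toggles twice,
            // which leaves the state unchanged.
            in_string = !in_string;
            out.push(c);
            continue;
        }
        if in_string || c != '$' {
            out.push(c);
            continue;
        }
        // Outside a string literal: read the digits after `$`.
        let mut digits = String::new();
        while let Some(&d) = chars.peek() {
            if d.is_ascii_digit() {
                digits.push(d);
                chars.next();
            } else {
                break;
            }
        }
        // Placeholders are 1-based; look up the matching param, if any.
        let replacement = digits
            .parse::<usize>()
            .ok()
            .and_then(|n| n.checked_sub(1))
            .and_then(|i| params.get(i));
        match replacement {
            Some(value) => out.push_str(value),
            // Unknown or malformed placeholder: emit it unchanged.
            None => {
                out.push('$');
                out.push_str(&digits);
            }
        }
    }
    out
}
```

With a single param `$employee_id`, the quoted `'$1'` in a query stays as it is and only the bare `$1` is replaced.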
Ah, of course. I hadn't realized s-strings were still there in that function, but they are!
@aljazerzen do you think this is possible for @kriswuollett to work on / there's some initial work that they could do? Or too hard with the s-string issue? They scoped this out, and I know we're trying to encourage folks to do an initial PR.
Yes, the hard part will be extracting the params from RQ IR tree before that is compiled to SQL.
Something like this will be needed:
/// Extracts params (and potentially s-strings)
/// so they can be substituted back in after formatting.
struct ParamExtractor {
    param_contents: Vec<String>,
}

impl ParamExtractor {
    /// Takes a param content - an arbitrary string that we want to prevent from being formatted.
    /// Returns a positional SQL param (i.e. $3), which will later be substituted for the content.
    fn push_param(&mut self, param_content: String) -> String {
        self.param_contents.push(param_content);
        self.param_contents.len().to_string()
    }
}

impl RqFold for ParamExtractor {
    fn fold_expr_kind() {
        // ... here we can match params and call `push_param()` with the param name
    }
}
This can then be called from here: https://github.com/PRQL/prql/blob/ed9977d530ad6b0b649f9ac4e53d4678ede2f0a4/crates/prql-compiler/src/sql/mod.rs#L24C1-L24C1
RqFold will provide ParamExtractor::fold_query(), which must be applied to query before it is compiled.
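To make the extraction step more concrete, here is a self-contained toy version. It assumes a made-up `Expr` enum and a hand-written `fold_expr` in place of prql's real RQ types and the `RqFold` trait, so every name except `ParamExtractor`, `param_contents`, and `push_param` is hypothetical. The fold swaps each named param for its position and records the original text.

```rust
/// Toy expression tree; a hypothetical stand-in for the RQ IR.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    /// Param name without the leading `$`.
    Param(String),
    Eq(Box<Expr>, Box<Expr>),
}

#[derive(Default)]
struct ParamExtractor {
    param_contents: Vec<String>,
}

impl ParamExtractor {
    /// Stores the content and returns its 1-based position as a string.
    fn push_param(&mut self, param_content: String) -> String {
        self.param_contents.push(param_content);
        self.param_contents.len().to_string()
    }

    /// Recursively rewrites the tree, replacing each param name with its position.
    fn fold_expr(&mut self, expr: Expr) -> Expr {
        match expr {
            Expr::Param(name) => {
                // Remember the text we want to see in the final SQL.
                let index = self.push_param(format!("${name}"));
                Expr::Param(index)
            }
            Expr::Eq(left, right) => Expr::Eq(
                Box::new(self.fold_expr(*left)),
                Box::new(self.fold_expr(*right)),
            ),
            other => other,
        }
    }
}

/// Minimal "compiler": renders the toy tree as SQL text.
fn to_sql(expr: &Expr) -> String {
    match expr {
        Expr::Column(name) => name.clone(),
        Expr::Param(id) => format!("${id}"),
        Expr::Eq(left, right) => format!("{} = {}", to_sql(left), to_sql(right)),
    }
}
```

Folding `id == $employee_id` and rendering it gives `id = $1`, with `param_contents` holding `$employee_id`. That vector is what would then be handed to the formatter as `QueryParams::Indexed`, in the same way the demo tests earlier in the thread pass their params.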
@kriswuollett I'm not sure if you'd be up for starting this (no problem if not — the issue is appreciated regardless). If you would be, we'd be happy to help with any questions or guidance.
Hi @max-sixty, I'd love to but can't get to it at the moment as I'm just starting up on a new project and don't have the bandwidth quite yet.
Testing the OP example code on the playground:
prql target:sql.sqlite
from employees
filter id == $employee_id
results in
SELECT
  *
FROM
  employees
WHERE
  id = $employee_id
-- Generated by PRQL compiler version:0.11.3 (https://prql-lang.org)
so it appears that this has been fixed
Ah nice, I guess sqlformat-rs fixed it and we inherited the fix!