Failing to parse valid regex
fbs opened this issue
The following regex fails to parse in Rust, but works with the official Go parser:
uri=~"/v[1-9]/.*/{gid}/{uid}"
Reproduce
use promql_parser::parser;
fn main() {
let promql = r#"rate(http_server_requests_seconds_bucket{method="GET",status=~"2..",uri=~"/v[1-9]/.*/{gid}/{uid}",le="0.25"}[1h])"#;
match parser::parse(promql) {
Ok(expr) => {
println!("Prettify:\n\n{}", expr.prettify());
println!("AST:\n{expr:?}");
}
Err(info) => println!("Err: {info:?}"),
}
}
# `Err: "illegal regex for /v[1-9]/.*/{gid}/{uid}"`
The same query works in Go:
package main
import (
"github.com/prometheus/prometheus/promql/parser"
)
func main() {
_, err := parser.ParseExpr(`rate(http_server_requests_seconds_bucket{method="GET",status=~"2..",uri=~"/v[1-9]/.*/{gid}/{uid}",le="0.25"}[1h])`)
if err != nil {
panic(err)
}
}
Maybe it's because of a difference between the regex implementations of Rust and Go. I will try to fix it.
Thanks for your report, and any PR is welcome.
hi @yuanbohan, wow faster answer than I expected <3. I was taking a quick look as well:
For reproducing:
use regex::Regex;
fn main() {
let re = r#"/v[1-9]/.*/{gid}/{uid}"#;
Regex::new(&re).unwrap();
}
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Syntax(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
regex parse error:
/v[1-9]/.*/{gid}/{uid}
^
error: repetition quantifier expects a valid decimal
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
)', src/main.rs:20:21
Which makes sense, not sure what a good fix is here
Maybe you can try other symbols in your uri, like `<` or `_`? Or just escape the curly brackets using `\` in your regex?
let re = r#"/v[1-9]/.*/\{gid\}/\{uid\}"#;
let cap = Regex::new(&re).unwrap().captures("/v1/foo/{gid}/{uid}");
println!("{cap:?}");
==== output
Some(Captures({0: 0..19/"/v1/foo/{gid}/{uid}"}))
Is this what you want? @fbs
For a bit of background (should've added that earlier):
We run a shared Prometheus platform used by many teams. I want to do some analysis of their rules to help teams with some company specifics (architecture). As it's an all-Python team, I'm using this library, since pyo3 makes that really easy.
So changing the regex isn't an option, they come from end users and are valid for prometheus itself.
I guess it's hard to fix, as it's a regex implementation difference between Go and Rust. If there was a 'less strict' regex flag I would be OK with that. What do you think?
I'm open to implementing something for it
I guess it's hard to fix, as it's a regex implementation difference between Go and Rust. If there was a 'less strict' regex flag I would be OK with that. What do you think?
From the regex lib's doc I can't find any configuration to control such behavior 🙁
Not a regex expert, but what do you think of adding a "preprocess" step that modifies the input regex, as @yuanbohan said:
Maybe you can try other symbols in your uri, like `<` or `_`? Or just escape the curly brackets using `\` in your regex?
I'm not sure if this is viable...
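For what it's worth, here is a rough sketch of what such a preprocessing step could look like. It assumes Go's behavior is to treat a `{` that does not start a valid repetition (`{n}`, `{n,}`, `{n,m}`) as a literal character, and it mimics that by escaping such braces before the pattern reaches the Rust regex parser. The function names `escape_literal_braces` and `quantifier_len` are made up for illustration and are not part of promql_parser or the regex crate. Character-class handling is deliberately simplified, so treat this as a starting point rather than a complete fix.

```rust
/// Returns the length (in chars) of a repetition quantifier such as `{2}`,
/// `{2,}` or `{2,5}` if `s` starts with one, otherwise `None`.
/// Hypothetical helper, used only by the sketch below.
fn quantifier_len(s: &[char]) -> Option<usize> {
    let mut i = 1; // skip the leading `{`
    while i < s.len() && s[i].is_ascii_digit() {
        i += 1;
    }
    if i == 1 {
        return None; // no digits after `{`, so this is not a quantifier
    }
    if s.get(i) == Some(&',') {
        i += 1;
        while i < s.len() && s[i].is_ascii_digit() {
            i += 1;
        }
    }
    if s.get(i) == Some(&'}') {
        Some(i + 1)
    } else {
        None
    }
}

/// Escapes every `{` / `}` that is not part of a valid repetition quantifier.
/// Simplifications (assumptions of this sketch): a `]` always closes a
/// character class, and braces inside a class are left untouched.
fn escape_literal_braces(pattern: &str) -> String {
    let chars: Vec<char> = pattern.chars().collect();
    let mut out = String::with_capacity(pattern.len() + 8);
    let mut in_class = false; // true while inside `[...]`
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        match c {
            // Copy an existing escape sequence through unchanged.
            '\\' => {
                out.push(c);
                if let Some(&next) = chars.get(i + 1) {
                    out.push(next);
                    i += 1;
                }
            }
            '[' if !in_class => {
                in_class = true;
                out.push(c);
            }
            ']' if in_class => {
                in_class = false;
                out.push(c);
            }
            '{' if !in_class => match quantifier_len(&chars[i..]) {
                // A real quantifier: copy it verbatim, closing brace included.
                Some(len) => {
                    out.extend(&chars[i..i + len]);
                    i += len - 1;
                }
                // Not a quantifier: make the brace a literal.
                None => out.push_str("\\{"),
            },
            '}' if !in_class => out.push_str("\\}"),
            _ => out.push(c),
        }
        i += 1;
    }
    out
}
```

With this, `escape_literal_braces(r"/v[1-9]/.*/{gid}/{uid}")` yields the same escaped pattern suggested earlier in the thread, and the result could then be handed to `Regex::new`. Real quantifiers such as `a{2,3}` and braces that are already escaped pass through unchanged.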