rust-lang / regex

An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.

Home Page:https://docs.rs/regex

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

DoS risk: small regex gives: panicked at 'tried to unwrap expr from HirFrame, got: Concat'

PaulGrandperrin opened this issue · comments

Hi @BurntSushi ,
Sorry to bother you :-) another one!

regex::Regex::new("(?i)(?i)*?");
thread 'main' panicked at 'tried to unwrap expr from HirFrame, got: Concat', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/regex-syntax-0.5.3/src/hir/translate.rs:203:18

playground

@robertswiecki : honggfuzz is really good! libFuzzer has been unable to find those even though I'm sure many people ran it for many hours/days on this code.

Interesting! Seems like we (libfuzzer, afl, honggfuzz) might be using complementary strategies after all.

I guess so. I'll try to help on the rust front to make it easier to fuzz with all three.

Thanks! This is another good one, and I thought I had a test case for this. I'll look into it, but won't be able to fix it until tomorrow.

Is it possible to batch these bugs? I wouldn't be surprised if you found more, and it'd make more sense to just throw as many as you can over the wall so I can fix them in one go.

libFuzzer has been unable to find those even though I'm sure many people ran it for many hours/days on this code.

To be clear, the regex crate got a rewrite of its regex-syntax crate a few months ago. So if you're the first to run a fuzzer against that rewrite (I haven't), then you'll be the first to pick at the low hanging fruit. :-)

Is it possible to batch these bugs?

Yes, of course, it's just that I didn't know there was a recent rewrite so I didn't expect to find many bugs if any.

I ran the fuzzer for a couple of hours and it just found another one:

regex::Regex::new("(?i)?");
// thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', libcore/option.rs:335:21

I think that's it for the most obvious ones. But I'll improve a bit the process and continue to fuzz for a small while.

Do you have any recommendation of APIs that might need more scrutiny than others?

Do you have any recommendation of APIs that might need more scrutiny than others?

Traditionally, fuzzers have found bugs by comparing the outputs of the different execution engines, but it requires dealing with internal APIs. There are a few open issues detailing bugs in that space.

With respect to the parser rewrite though, no, I think Regex::new is exactly the thing you want to test. The other piece is printing the error message returned by Regex::new when given an invalid pattern, but it looks like your setup already accounts for this. If you test those two things, then that should cover most or all of the rewrite.

I will let this issue sit for a few days before diving into bug fixing mode. :-)

Ok, I'll leave it to fuzz for a few days then.
Could you confirm that this code is enough to cover the code rewrite?

pub fn fuzz_regex(data: &[u8]) {
    if let Ok(data) = std::str::from_utf8(data) {
        let _ = regex::Regex::new(data);
    }
}

It seems to me that it's not mandatory to display the error (if any) because I didn't see any real logic in the Debug and Display implementation of regex::Error. I guess the new code printing the message is executed while constructing the regex::Error so if it panics, we'll see it.

I've also configured the fuzzer to generate input with a maximum size of 20bytes, does it looks like a good size to you?

Yup, that code looks good to me. And yes, the formatter is indeed run when the regex::Error is built.

20 bytes is probably good. At least, I can't think of any specific reason to go bigger.

You might also consider adding (?x) to the beginning of patterns, if that's easy to do. Alternatively, enable ignore_whitespace, which should accomplish the same thing. I bet that will uncover more bugs. :-)

Great, I'll do that.

Ok, I think you can go ahead and go into bug fixing mode whenever you want :-)
After almost 1 billion iterations, honggfuzz stopped being able to find new coverage.
I then tried libFuzzer, but it was unusable because it cannot continue fuzzing once it finds one crash.
And AFL was just too slow and wasn't able to deduplicate similar crashes so its output would have been unusable.
So it didn't find anything new and I'm stopping fuzzing.