rust-secure-code / safety-dance

Auditing crates for unsafe code which can be safely replaced

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Audit url

evanjs opened this issue · comments

url: GitHub, crates.io

cargo-geiger reports numerous usages of unsafe, though only 6 seem to be local to the url crate itself.

Example unsafe usage in decode_utf8_lossy

A majority of the remaining unsafe usages consist of, e.g. unicode-normalization and smallvec (#9)
Absolutely no idea if any of these unsafe cases are appropriate or not.

The url crate seems to be widely used enough to check, at least.

Just looked at the example unsafe block you linked. I think it is safe, and potentially appropriate, but I believe there is a way it can be re-written in safe code. I'll review the rest tomorrow and submit a PR if needed.

Submitted the PR - servo/rust-url#560

The comments look very nice and clear to me. Thanks for doing this!

I think one more unsafe block could be trivially rewritten as safe code, I've left a comment on the PR.

This leaves only two iterator implementations that return Some(&str) and I think those can be converted to returning Some(char) instead to avoid the unsafe conversions, with something like this:

mystring.chars().flat_map(|c| if should_percent_encode(c) {percent_encode(c).chars()} else {c})

@Shnatsel thanks for the feedback. I've removed the unsafe block as you suggested.

This leaves only two iterator implementations that return Some(&str) and I think those can be converted to returning Some(char) instead to avoid the unsafe conversions, with something like this..

If I'm understanding correctly, these would be changes which change the public interface?

Those structs seem to be private, so I think that wouldn't be a breaking change.

https://github.com/servo/rust-url/blob/master/src/form_urlencoded.rs#L125

This is one example of an iterator with an unsafe block. But ByteSerialize is marked as public in the form_urlencoded module, then src/lib.rs has pub mod form_urlencoded. I believe the other case of this is similar. Are you referring to some other structs? Or am I misinterpreting the Rust public/private rules?

Nah, it's probably me who misinterpreted the rules. Those structs indeed seem to be public, so changing the return type would indeed be a breaking change. Apologies for the confusion.

In addition, I figured we might want to check idna and percent_encoding as well, but wasn't sure how those should be handled.

i.e., I'm assuming we're doing a PR per project (e.g. rust-url/url, rust-url/idna), vs a PR per repo (e.g. rust-url/*)?

PRs are easier per project. Tracking issues... dunno, per repo probably, because that's what I'd grep for unsafe

For what its worth, my PR covers the full rust-url repo (but not dependencies).

For what its worth, my PR covers the full rust-url repo (but not dependencies).

Yup. Just looked through it a minute ago and realized it was comprehensive.

Regardless, I'll keep this in mind moving forward, and include checklists for bigger projects in case we need to break things up.

Thanks @Shnatsel and @JoshMcguigan!