utkarshkukreti / select.rs

A Rust library to extract useful data from HTML documents, suitable for web scraping.

CSS Selectors

utkarshkukreti opened this issue

Although the current Predicate trait is more powerful (at least in theory) than CSS3 Selectors, it is also more verbose in cases where an equivalent CSS3 selector exists, and an extra hurdle for the users of this library who are already familiar with CSS selectors.

https://github.com/servo/rust-selectors looks like a nice library for this.

I can see 2 ways to implement this:

  1. Implement Predicate for the Selector struct in the selectors crate. This will require users to parse a string into a selector themselves before passing it to document.find().

  2. Implement Predicate for str. This means we either have to panic on invalid selectors or silently ignore them.
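The two options above can be sketched with toy stand-ins. Everything here is an assumption made for illustration: `Node`, `Selector`, `Selector::parse`, and `find` are hypothetical simplified types, not the real select.rs or selectors APIs, and the "parser" only understands a bare tag name or a `.class`. The str implementation shows the "silently ignore" variant of option 2.

```rust
// Toy stand-ins for illustration only; NOT the real select.rs / selectors types.
#[derive(Debug)]
pub struct Node {
    pub name: &'static str,
    pub class: Option<&'static str>,
}

pub trait Predicate {
    fn matches(&self, node: &Node) -> bool;
}

/// Hypothetical parsed selector: a tag name (`div`) or a class (`.item`).
#[derive(Debug, PartialEq)]
pub enum Selector {
    Name(String),
    Class(String),
}

impl Selector {
    /// Toy parser: accepts `ident` or `.ident`, rejects everything else.
    pub fn parse(s: &str) -> Result<Selector, String> {
        let (is_class, ident) = match s.strip_prefix('.') {
            Some(rest) => (true, rest),
            None => (false, s),
        };
        let valid = !ident.is_empty()
            && ident
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
        if !valid {
            return Err(format!("invalid selector: {:?}", s));
        }
        Ok(if is_class {
            Selector::Class(ident.to_string())
        } else {
            Selector::Name(ident.to_string())
        })
    }
}

// Option 1: the parsed selector is the predicate; the caller handles parse errors.
impl Predicate for Selector {
    fn matches(&self, node: &Node) -> bool {
        match self {
            Selector::Name(n) => node.name == n.as_str(),
            Selector::Class(c) => node.class == Some(c.as_str()),
        }
    }
}

// Option 2 (silent variant): a string is the predicate; an invalid selector
// matches nothing. Note that this re-parses the string for every node.
impl<'a> Predicate for &'a str {
    fn matches(&self, node: &Node) -> bool {
        match Selector::parse(self) {
            Ok(sel) => sel.matches(node),
            Err(_) => false,
        }
    }
}

/// Hypothetical stand-in for document.find(): keeps the nodes the predicate matches.
pub fn find<'n, P: Predicate>(nodes: &'n [Node], p: P) -> Vec<&'n Node> {
    nodes.iter().filter(|n| p.matches(n)).collect()
}
```

With option 1 a call would look like `find(&nodes, Selector::parse("div")?)`, so a typo surfaces as an `Err` at the call site. With option 2 it is just `find(&nodes, "div")`, but in this silent variant a typo such as `"div >"` quietly yields an empty result.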

Another question is whether CSS Selectors should be an optional feature, and if so, whether it should be enabled by default or not.

Oh, I should have seen this issue before I opened #29. I also think the Predicate trait is more powerful, but CSS selectors are needed for a universal selection grammar and shorter code.

From my point of view, the second way you suggested may be the better solution, since the first way seems to lose the simplicity of CSS selectors. If we use str to represent a CSS selector, the problem is similar to the one with regex: we can either unwrap and let it panic, or use a compiler plugin to compile the selector string at compile time (which needs a nightly compiler and is slow). I think it is quite proper to panic on invalid selectors, just like a compile error, and using unwrap is very common in rapid development.
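A minimal sketch of the "unwrap and let it panic" approach, assuming a fallible parser exists underneath. The names `Selector`, `ParseError`, and `css` are hypothetical and the validation is a toy. The idea is the same as calling unwrap on the result of compiling a regex: keep a `Result`-returning parser for careful callers, and add a thin wrapper that panics with a readable message for quick scripts.

```rust
// Hypothetical types for illustration; not part of select.rs or selectors.
#[derive(Debug, PartialEq)]
pub struct Selector(pub String);

#[derive(Debug, PartialEq)]
pub struct ParseError(pub String);

impl Selector {
    /// Toy validation: only a non-empty run of [A-Za-z0-9_-] is accepted.
    pub fn parse(s: &str) -> Result<Selector, ParseError> {
        let valid = !s.is_empty()
            && s.chars()
                .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
        if valid {
            Ok(Selector(s.to_string()))
        } else {
            Err(ParseError(format!("unexpected input in {:?}", s)))
        }
    }
}

/// Convenience wrapper: panics on an invalid selector, much like
/// unwrapping the result of compiling a regex from a string literal.
pub fn css(s: &str) -> Selector {
    Selector::parse(s).unwrap_or_else(|e| panic!("invalid CSS selector: {}", e.0))
}
```

Callers who take selectors from untrusted input can still use `Selector::parse` and handle the error, while selectors written as string literals fail loudly on the first run.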

I think it should be enabled by default (just my personal preference...).