Comparison to thirtyfour
bcpeinhardt opened this issue
Hello!
I was just wondering what the general differences in approach and functionality were between this crate and the thirtyfour crate? I use thirtyfour for RPA and QA testing at work, and I saw that perseus had documentation for using fantoccini for web testing but not for swapping in thirtyfour, so I just thought it'd be good to ask if a contributor would be willing to sum up the general differences between the two.
Hi there! I actually had never heard about thirtyfour until you just mentioned it. They certainly seem very similar, so arguably they don't both need to exist. I'm curious @stevepryde if you have a take on the differentiating factors, and maybe even if we could collaborate and try to consolidate into one crate instead?
Hi Jon, we actually did briefly discuss some differences a couple of years ago: https://www.reddit.com/r/rust/comments/evlc49/thirtyfour_a_new_selenium_library_for_rust_for/ but we decided it wasn't really clear how to merge them. I wanted to keep building out the API just to get a feel for what worked and what didn't.
However a lot has changed since then. I have occasionally checked back to see how things are going with fantoccini and recently I've been trying to think about how the projects could be merged. Yesterday I worked on a major refactor which might actually open a path forward. The projects are slightly more similar in terms of how they're wired up. They certainly "feel" similar to use so the differences are mainly deep in the implementation and design.
Where fantoccini implements a low level Future around hyper, thirtyfour uses off-the-shelf http clients and allows plugging in different http clients via a trait. This was so that it could easily support both tokio and async-std. I'm not sure how much value that provides in practice, if any.
For the http client specifically, it should be easy to implement hyper as a http client for thirtyfour, but not sure how much you "lose" from fantoccini in doing that. I'd be open to moving the other way as well but would need to assess whether multiple http clients could still be supported or alternatively what is the cost of dropping async-std support.
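To make the pluggable-client idea concrete, here is a minimal sketch. It is synchronous and std-only for brevity, and the names `HttpClient`, `Session`, and `MockClient` are hypothetical; this is not the actual thirtyfour trait, which is async.

```rust
use std::cell::RefCell;

/// Hypothetical trait: anything that can send a WebDriver request.
/// (Not the real thirtyfour trait; synchronous to keep the sketch std-only.)
pub trait HttpClient {
    fn send(&self, method: &str, path: &str, body: Option<&str>) -> Result<String, String>;
}

/// A session that is generic over the transport it uses.
pub struct Session<C: HttpClient> {
    pub client: C,
    pub session_id: String,
}

impl<C: HttpClient> Session<C> {
    pub fn new(client: C, session_id: &str) -> Self {
        Session { client, session_id: session_id.to_string() }
    }

    /// Builds the W3C "Get Title" path and delegates to whichever client was plugged in.
    pub fn title(&self) -> Result<String, String> {
        let path = format!("/session/{}/title", self.session_id);
        self.client.send("GET", &path, None)
    }
}

/// A stand-in client that records requests instead of doing network I/O.
pub struct MockClient {
    pub log: RefCell<Vec<String>>,
}

impl HttpClient for MockClient {
    fn send(&self, method: &str, path: &str, _body: Option<&str>) -> Result<String, String> {
        self.log.borrow_mut().push(format!("{} {}", method, path));
        // Canned response; a real client would return the server's JSON body.
        Ok("{\"value\":\"Example\"}".to_string())
    }
}
```

The cost of this flexibility is the extra type parameter, which spreads to every type that holds a session.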
The interface for thirtyfour was patterned after the python selenium client. I actually prefer the shorter method names used in fantoccini so happy to align with those if that's easier. It'll be a breaking change either way on both sides so we can discuss each one as we go.
I chose to put all of the high level functionality into a SessionHandle struct and then share that everywhere, so that for example if you have a WebElement you can still "run a JS script" from its methods in order to provide high level features like scroll_into_view(). Having access to the full functionality of the underlying webdriver from anywhere that has a reference to the session is pretty powerful.
The actual WebDriver is now just a thin wrapper that provides access to the session methods via Deref. (Previously this functionality was exposed using async_trait but that made the docs horrendous to read).
Another reason for passing around a reference to the session is that it then guarantees at compile time that you don't try to "use" the session after the browser is closed. I don't know if there are issues with this in practice and it does mean dealing with references and thus lifetimes, but so far it's ok.
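A minimal sketch of that compile-time guarantee, using simplified stand-in types (these reuse the names `SessionHandle`, `WebDriver`, and `WebElement` from above, but are not the real thirtyfour definitions, and everything is synchronous):

```rust
/// Hypothetical stand-in for the shared session state.
pub struct SessionHandle {
    pub id: String,
}

/// Owns the session; closing consumes the driver.
pub struct WebDriver {
    session: SessionHandle,
}

/// An element borrows the session, so it cannot outlive the driver.
pub struct WebElement<'a> {
    session: &'a SessionHandle,
    pub element_id: String,
}

impl WebDriver {
    pub fn new(id: &str) -> Self {
        WebDriver { session: SessionHandle { id: id.to_string() } }
    }

    pub fn find(&self, element_id: &str) -> WebElement<'_> {
        WebElement { session: &self.session, element_id: element_id.to_string() }
    }

    /// Takes `self` by value, so no `WebElement` borrow may still be in use afterwards.
    pub fn quit(self) -> String {
        format!("closed {}", self.session.id)
    }
}

impl<'a> WebElement<'a> {
    /// Session-level functionality is reachable from the element itself.
    pub fn describe(&self) -> String {
        format!("{} in session {}", self.element_id, self.session.id)
    }
}

pub fn lifetime_demo() -> (String, String) {
    let driver = WebDriver::new("s1");
    let elem = driver.find("e1");
    let described = elem.describe();
    let closed = driver.quit();
    // elem.describe(); // would not compile: `driver` was moved while still borrowed
    (described, closed)
}
```

Uncommenting the last call to `elem.describe()` turns a would-be runtime WebDriver error into a borrow-checker error.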
I see that fantoccini also now has functionality to wait for various conditions and also to customize the waiting behaviour. There is probably room to combine the best of both feature sets there.
The problem is that the differences are subtle and run deep, so it really does require some experimentation. I'd be keen to open an experimental branch on either project and start playing around with ideas. The road might be a long one but I think it's worthwhile in the end. I'm a pragmatist at heart so as long as we end up with something that provides a great user experience then it's a win.
I'm sure neither of us have unlimited time to work on this as well, but we have to start somewhere. And with enough community support from both sides hopefully this could move to a more official project with more than 1 maintainer. That would be cool.
@stevepryde You mentioned to me before that the interface was a port of the python bindings, but that doesn't extend to the query and wait interfaces does it? I have to say I've grown quite fond of the query builder, as it keeps locators simpler/reusable and feels natural to write, although I understand they currently require extra requests to the WebDriver server. I know there's the W3C WebDriver spec to stay in line with, but it's nice to see different APIs built on top of it, and I basically never type driver.find_element anymore. I bring this up because in my mind a shared interface which maps to either the thirtyfour or the fantoccini "base" APIs might be an easier place to start collaboration, and as a user it stands out to me as a difference as much as the choice of http client, although I don't work on exceptionally large testing suites.
Correct. I started out using the python bindings as a guide because I had some good experience with those and knew it was a good base to start from. The query interface was similar to something I had built previously on top of the python bindings although I had the chance to refine it a little more this time. I really like this extra layer of abstraction and find it really useful as well. Whether it needs to be in the same crate or a separate crate is up for discussion.
As for performance yes it does do extra requests but it's common to run tests with a webdriver bundled locally or at least on the same network so the requests should be fast.
That said, something that's on my wishlist is either an xpath "ORM"/builder or better yet a macro that can check xpath queries at compile time to ensure they are syntactically valid.
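As a rough idea of what the builder half of that wishlist could look like, here is a small sketch that composes an XPath string under the hood. The `XPath` type and its methods are hypothetical and exist in neither crate; attribute values are not escaped, and nothing is validated at compile time.

```rust
/// Hypothetical XPath builder; it only concatenates path steps and predicates.
pub struct XPath {
    expr: String,
}

impl XPath {
    /// Start with a descendant search for the given tag, e.g. `//form`.
    pub fn tag(name: &str) -> Self {
        XPath { expr: format!("//{}", name) }
    }

    /// Add an attribute-equality predicate to the current step.
    /// Note: `value` is inserted verbatim, so it must not contain a single quote.
    pub fn attr(mut self, name: &str, value: &str) -> Self {
        self.expr.push_str(&format!("[@{}='{}']", name, value));
        self
    }

    /// Descend to a direct child with the given tag.
    pub fn child(mut self, name: &str) -> Self {
        self.expr.push_str(&format!("/{}", name));
        self
    }

    /// Finish and return the composed expression.
    pub fn build(self) -> String {
        self.expr
    }
}
```

For example, `XPath::tag("form").attr("id", "login").child("input").attr("type", "submit").build()` yields `//form[@id='login']/input[@type='submit']`, which could then be handed to either crate's XPath locator.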
I've had a deeper look into the code. Here's a summary:
Fantoccini pros:
- Error handling seems much nicer in fantoccini. I like the way everything is grouped into sections.
- Lots of links to the official W3C spec in the docs. Very nice.
- Uses the webdriver crate (maintained by mozilla) types for commands etc.
- Integration with the cookie-rs crate
- Form handling
- Access to the inner HTTP client (may be useful for testing download links etc)
- Clean / clear API (actually the two crates are quite similar to use from an end-user perspective)
Features of thirtyfour that are not (yet) in fantoccini:
- Support for Chrome Devtools Protocol (this also includes selenium 4.x support)
- Action Chains
- Element query / waiter interfaces (the wait() method provides limited support)
- Browser extensions
- Slightly more features for <Select> elements
- Better support for sending key combos to elements (e.g. elem.send_keys(Keys::Control + "a").await?;)
- Specific helpers for setting up browser capabilities, e.g. enabling headless, adding Chrome options, setting debugger address etc
- Code examples in docs
- Tests! thirtyfour has doctests for most methods and these also serve as example code. Granted this is difficult and the approach thirtyfour uses for tests definitely has its drawbacks, but it does work pretty well.
- CI/CD? The link to Travis for fantoccini is currently broken. What about Github Actions?
WebDriver commands not yet supported by fantoccini:
(apologies if any of these are actually supported, it's meant to read as a TODO list not a criticism)
- Status
- GetTimeouts
- SetTimeouts
- Forward
- GetTitle
- MaximizeWindow
- MinimizeWindow
- FullscreenWindow
- IsElementSelected
- IsElementDisplayed
- GetElementCssValue
- GetElementTagName
- GetElementRect
- IsElementEnabled
- AddCookie
- PerformActions
- ReleaseActions
- DismissAlert
- AcceptAlert
- GetAlertText
- SendAlertText
All of the above (both commands and features) should be relatively straightforward to add. There are discussions to be had regarding interface decisions and where things should live (even whether they belong in fantoccini or in another crate), but I don't see any technical blockers here. I'm happy to do it but also fine if anyone else wants to port stuff across. The two projects use the same licenses anyway.
Features of thirtyfour that might be difficult to add to fantoccini:
- Support for other HTTP clients. Need to assess whether this is actually an issue anyone cares about. It's worth noting this also implies support for other async runtimes. Fantoccini currently requires the tokio runtime.
- All elements and other structs that use the session (other than the main client itself) contain a reference to the session, so that Rust will verify at compile time that there are no uses of the session after the browser is closed
- Configuration stored in the client struct. The ElementQuery interface was originally added via a separate crate and I was able to easily extend WebDriver through the use of a trait. However it needed a nice way to be able to set configuration defaults on the session so that every call to WebDriver::query() could remember those defaults. It should be possible in fantoccini too but probably out-of-scope here.
Also note many of the missing features etc can be added piece by piece, which is much easier. No need for any big changes.
I notice @jonhoo's comment in the reddit thread,
As for merging, I think what we would end up doing is going with one design or the other, rather than trying to combine them, since as you say it's not clear how we'd really do that. One option short of that is to try to find a way to share a test suite. Abstract the tests over some common API that both crates can "provide". Might help uncover bugs that are in one but not the other, I'm not sure?
and @stevepryde's comment
That said, something that's on my wishlist is either an xpath "ORM"/builder or better yet a macro that can check xpath queries at compile time to ensure they are syntactically valid.
and can't help but think these tasks are suited to one another, as a builder that simply composed an xpath under the hood would be fairly straightforward to use in either crate. Certainly I think a macro for validating xpaths and/or css selectors at compile time would be useful to embed in the locator constructors for either crate, in fact I think I'll explore that a little and follow up if I get somewhere. But @jonhoo I'd be interested to hear your thoughts on what a common api would look like?
Thanks for driving this forward @bcpeinhardt, and thanks for the detailed thoughts @stevepryde! I had completely forgotten that Reddit thread.
I'll leave some thoughts in semi-random order.
HTTP client support: I don't think it's important for a WebDriver library to be generic over the mechanism it uses to interact with the WebDriver host; that should be an implementation detail the caller should not care about. Async Rust is in this slightly awkward spot at the moment where the choice of HTTP client also dictates the async runtime, which is definitely unfortunate, but I also haven't seen good solutions to this problem. So, my instinct for the time being is to stick with by far the most widely used HTTP client (hyper, which reqwest also uses under the hood), and not add the unnecessary extra complexity to the implementation and interface that being generic adds. My hope is that down the line it'll be easier to be runtime-agnostic, but right now I don't think we have a way to be that's worth the cost for a library like this one.
Handles by reference: While it's true that fantoccini doesn't have its handles (like element handles) take a reference to the original session, it does ensure that handles don't outlive the session simply by passing along the Client itself into the handle. Essentially, the session isn't closed until all handles go away (reference counting essentially). You can see this in the existence of Element::client for example. In fantoccini, Client is Clone, which I think is an ergonomic upside compared to having handles require a living reference. But I could go either way on that tbh, and am not married to that approach.
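The reference-counting approach can be sketched roughly as follows. The types below are simplified stand-ins rather than fantoccini's real definitions, and the `Drop` impl only illustrates when the shared state goes away; it does not represent closing a real browser.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical shared session state; `Drop` records when it is torn down.
struct Inner {
    events: Arc<Mutex<Vec<String>>>,
}

impl Drop for Inner {
    fn drop(&mut self) {
        self.events.lock().unwrap().push("session dropped".to_string());
    }
}

/// Cloning the client only bumps a reference count.
#[derive(Clone)]
pub struct Client {
    inner: Arc<Inner>,
}

/// Each element owns a clone of the client rather than borrowing it.
pub struct Element {
    client: Client,
    pub id: String,
}

impl Client {
    pub fn new(events: Arc<Mutex<Vec<String>>>) -> Self {
        Client { inner: Arc::new(Inner { events }) }
    }

    pub fn find(&self, id: &str) -> Element {
        Element { client: self.clone(), id: id.to_string() }
    }
}

impl Element {
    /// Hands back a client, so session-level calls are reachable from any element.
    pub fn client(&self) -> Client {
        self.client.clone()
    }
}

pub fn refcount_demo() -> Vec<String> {
    let events = Arc::new(Mutex::new(Vec::new()));
    let client = Client::new(events.clone());
    let elem = client.find("e1");
    drop(client); // the element still keeps the shared state alive
    events.lock().unwrap().push("client handle dropped".to_string());
    drop(elem); // last owner gone: the shared state is dropped now
    let out = events.lock().unwrap().clone();
    out
}
```

Because `Element` has no lifetime parameter, it can be stored or moved freely; the trade-off is that nothing stops a caller from using it after the session has been closed explicitly.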
Ergonomic accessors: fantoccini is intended to be a low-level binding to WebDriver. My thinking (which I think remains the same) is that something that provides, say, scroll_into_view or fancy query mechanisms beyond those of the standard, belongs in a library that wraps fantoccini, rather than in fantoccini itself. Now, those two libraries could be co-owned and co-released, but I think having that separation is worthwhile. I almost wonder if thirtyfour could switch to fantoccini "under the hood" or something like it?
Maintenance burden: I'm finding myself with precious little spare time these days, which has definitely impacted my ability to responsibly maintain this project. I try to do what I can to stay on top of things, but I definitely would love help in maintaining fantoccini/handing it over/merging it with another project. And I realize that with that I'd also be handing over some of the authority to decide what's "right" for the library; I'm fine with that :) I'll do what I can to help along any effort to move to one project (possibly with two crates in it as outlined above), but may not be able to invest much actual implementation time.
Ok, so @jonhoo, just a couple of questions:
- Would you be ok with limiting fantoccini strictly to the W3C spec then?
- What about support for CDP (Chrome Devtools Protocol, which adds a lot of stuff now officially part of selenium 4)?
Repurposing thirtyfour as a batteries-included layer on top of fantoccini seems like a good path forward.
I'm ok with dropping support for other HTTP clients. I agree that in practice it really shouldn't matter.
Regarding the handles by reference thing, from what I can tell in fantoccini if you get a reference to an element, then close the browser, then try to access that element, Rust would be fine but the webdriver (geckodriver or selenium etc) would throw an error at runtime. The difference with thirtyfour is that Rust would catch this at compile time as an error.
What you might be referring to is the case where you look up an element, and then your Client goes out of scope, and then you try to access that element. In fantoccini this works like you said because the "session" effectively lives until the last access of it. However this only works if you don't close the browser explicitly, and then the browser will remain open after your program exits.
In thirtyfour it is assumed that the browser session lifetime is the same as the WebDriver struct lifetime, and thus any access of the session after WebDriver goes out of scope is considered an error. This is intentional, but I can see how this might not be always desirable in cases where the browser is never closed (you can still opt not to close the browser, but the WebDriver struct must still stay alive until the end of the program). It would be just as easy for thirtyfour to just clone the session struct everywhere (it's just a channel sender), and then you'd have the same behaviour as fantoccini. I won't actually do this though, since thirtyfour will eventually just use fantoccini directly, as you suggested.
Btw both libraries cannot have the browser automatically close on Drop, due to there being no async destructors. This is unlikely to change anytime soon (if ever).
So my takeaway from here is that the path forward looks something like this:
- Implement remaining W3C functionality in fantoccini (can probably port a lot of it from thirtyfour)
- Refactor thirtyfour to use fantoccini under the hood, and provide higher-level methods on top
Sound like a plan?
- Yes
- That's tricky. On the one hand, I feel like that should be a different crate altogether. But on the other, if it shares a lot of interface with WebDriver, maybe it makes sense to let fantoccini be an abstraction layer across both? I'm not familiar enough with CDP to say just how much would be shared between the two. If CDP is strictly more powerful than WebDriver, I think it might make most sense to have fantoccini support both, with CDP features only being provided for clients connected to a CDP session (enforced by the type system).
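One way "enforced by the type system" could look is a typestate parameter on the client. This is only a sketch under assumed names (`WebDriverOnly`, `WithCdp`, `into_cdp`, `cdp_command` are all hypothetical); fantoccini does not define these.

```rust
use std::marker::PhantomData;

/// Marker types for the kind of session the client is connected to.
pub struct WebDriverOnly;
pub struct WithCdp;

pub struct Client<Mode> {
    session_id: String,
    _mode: PhantomData<Mode>,
}

impl<Mode> Client<Mode> {
    /// Plain WebDriver functionality is available on every client.
    pub fn title_path(&self) -> String {
        format!("/session/{}/title", self.session_id)
    }
}

impl Client<WebDriverOnly> {
    pub fn connect(session_id: &str) -> Self {
        Client { session_id: session_id.to_string(), _mode: PhantomData }
    }

    /// Upgrading is the only way to obtain a CDP-capable client in this sketch.
    pub fn into_cdp(self) -> Client<WithCdp> {
        Client { session_id: self.session_id, _mode: PhantomData }
    }
}

impl Client<WithCdp> {
    /// Exists only for CDP sessions; calling it on `Client<WebDriverOnly>` does not compile.
    pub fn cdp_command(&self, method: &str) -> String {
        format!("CDP {} on {}", method, self.session_id)
    }
}
```

A real `into_cdp` would presumably need to check that the remote end actually speaks CDP before handing out the upgraded client.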
Ah, sorry, yes, for the explicit close case, you're right. Whether that's a feature or a shortcoming I guess depends on the kind of foot-gun you prefer to avoid. I can definitely see the argument for enforcing correct operation through lifetimes though, and would be okay with moving fantoccini in that direction. The original version of fantoccini predates async/await, where it was basically impossible to work with async interfaces that had lifetimes in them, so it's partially a leftover from those days. One consideration is that it's possible to build the lifetime-tracking interface on top of a reference-counted session, but not the other way around (I don't think). But maybe that's okay; maybe it's rare that anyone actually needs to be able to have a 'static element handle.
I like that plan!
I'm only vaguely familiar with CDP at this point but have used a few bits of it in the past. For example, CDP gives you access to the network requests and I believe you can even use it to proxy requests. It does many things though and is arguably bigger than the WebDriver spec. The good news is that it follows the same kind of protocol so you'd basically have a separate Command enum for CDP and use all of the same infrastructure for it. Some of it will require specialized interfaces for better ergonomics, but the low-level stuff is all the same as for WebDriver.
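A rough sketch of "a separate Command enum on the same infrastructure": both enums implement one trait, and the dispatch code does not care which one it receives. The trait and enum names are hypothetical, and the `/goog/cdp/execute` path is an assumption about chromedriver's vendor endpoint rather than something stated in this thread.

```rust
/// Shared abstraction: a command knows its HTTP method and path.
pub trait Command {
    fn endpoint(&self, session_id: &str) -> (&'static str, String);
}

/// A couple of W3C WebDriver commands.
pub enum WebDriverCommand {
    GetTitle,
    Forward,
}

impl Command for WebDriverCommand {
    fn endpoint(&self, session_id: &str) -> (&'static str, String) {
        match self {
            WebDriverCommand::GetTitle => ("GET", format!("/session/{}/title", session_id)),
            WebDriverCommand::Forward => ("POST", format!("/session/{}/forward", session_id)),
        }
    }
}

/// CDP commands live in their own enum but use the same trait.
pub enum CdpCommand {
    Execute,
}

impl Command for CdpCommand {
    fn endpoint(&self, session_id: &str) -> (&'static str, String) {
        match self {
            // Assumed vendor-specific path; check the driver's documentation.
            CdpCommand::Execute => ("POST", format!("/session/{}/goog/cdp/execute", session_id)),
        }
    }
}

/// The shared "infrastructure": works for any command type.
pub fn describe_request(cmd: &dyn Command, session_id: &str) -> String {
    let (method, path) = cmd.endpoint(session_id);
    format!("{} {}", method, path)
}
```

Only the endpoint tables differ between the two enums; request sending and response handling could be written once against the trait.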
I too wondered if it could go in a separate crate. However moving forward I'd guess that users are either going to be targeting selenium (which now includes CDP, at least in part) or chromedriver/geckodriver (both of which support CDP - I believe it's still experimental in geckodriver but officially supported).
As such users won't really care too much about whether they're using the original W3C commands or the CDP commands - they'll want both supported out-of-the-box. So I think it makes sense to add CDP support to fantoccini directly, and expand the scope to cover WebDriver + CDP.
Unfortunately CDP is not a W3C standard. It originated as a Chrome-specific thing (chromedriver was actually implemented on top of it). Puppeteer and Cypress use CDP directly. And now Selenium 4 uses it too, which means Firefox has had to implement it (I believe several Selenium devs are also Firefox devs).
It looks like the path forward after CDP will be called WebDriver BiDi: https://developer.chrome.com/blog/webdriver-bidi/
Just something to keep an eye on down the track. For now WebDriver + CDP is all there is.
Let's move discussion of xpath to a separate issue on the thirtyfour side
Just a quick update. Fyi only. No action needed.
After a fair bit of hackery I've got (a branch of) thirtyfour in a place where it uses fantoccini underneath, but with lots of functionality missing. It's not as bad as it sounds. Mainly it's just missing wiring for a bunch of the WebDriverCommands.
I've also got a branch on my fantoccini fork that adds all of the missing commands in the enum, but I haven't yet added public methods to use them. It's probably in a good spot for a PR though.
Everything currently compiles, but with lots of unimplemented!() scattered around. However, it's enough to run some tests and it can at least start a session, find elements, type into them, and click, etc. A good first step I think.
Next step will be to add the public methods to fantoccini in order to allow all of the commands to be called.
This transition will be completed in thirtyfour v0.29.0, which will be released shortly after fantoccini v0.19.0 is released.
We're at 0.x anyway, so I just released 0.19.0!
thirtyfour v0.29.0 has been released. This issue can be closed.
FYI thirtyfour v0.30.x will introduce major API changes (mainly method renames) to bring everything more in line with fantoccini, such that using either one should feel almost identical. The existing method names are still there but marked as deprecated. This is in line with my goal to make thirtyfour feel like "fantoccini plus extensions". This also goes some way to supporting a v1.0 for both projects in future, once these apis are deemed "stable". That gets complicated now that the WebDriver spec is being updated but we'll see how we go.
Amazing, thanks for the update!
Actually, @stevepryde, maybe it'd be a good idea to add a mention of thirtyfour to the fantoccini README?
That would be cool but it's up to you. I'd also like to add more of a description on the thirtyfour README for how the two crates interact. At some point I'd like to try implementing all the thirtyfour features as traits to extend fantoccini types. That would be pretty cool I think. I like the idea of using fantoccini then importing traits from thirtyfour for any extra bits someone wants to use.
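The extension-trait idea could look roughly like this. `Element` here is a simplified stand-in for a low-level type from the base crate, and `ElementExt` is a hypothetical trait name; in practice the two would live in separate crates, and the real methods would be async.

```rust
/// Stand-in for a low-level element type owned by the base crate.
pub struct Element {
    pub id: String,
}

impl Element {
    /// Low-level primitive: run a script against this element.
    /// Here it just returns a description of what would be sent.
    pub fn execute(&self, script: &str) -> String {
        format!("execute on {}: {}", self.id, script)
    }
}

/// Extension trait that would live in the batteries-included crate.
/// Users opt in by importing the trait.
pub trait ElementExt {
    fn scroll_into_view(&self) -> String;
}

impl ElementExt for Element {
    /// High-level helper built purely on the low-level primitive.
    fn scroll_into_view(&self) -> String {
        self.execute("arguments[0].scrollIntoView();")
    }
}
```

With this shape, code that only needs the W3C commands depends on the base crate alone, while anyone who imports the trait gets the extra methods on the same type.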