jonhoo / fantoccini

A high-level API for programmatically interacting with web pages through WebDriver.

Comparison to thirtyfour

bcpeinhardt opened this issue

Hello!
I was just wondering what the general differences in approach and functionality were between this crate and the thirtyfour crate? I use thirtyfour for RPA and QA testing at work, and I saw that perseus had documentation for using fantoccini for web testing but not for swapping in thirtyfour, so I just thought it'd be good to ask if a contributor would be willing to sum up the general differences between the two.

Hi there! I actually had never heard about thirtyfour until you just mentioned it. They certainly seem very similar, so arguably they don't both need to exist. I'm curious @stevepryde if you have a take on the differentiating factors, and maybe even if we could collaborate and try to consolidate into one crate instead?

Hi Jon, we actually did briefly discuss some differences a couple of years ago: https://www.reddit.com/r/rust/comments/evlc49/thirtyfour_a_new_selenium_library_for_rust_for/ but we decided it wasn't really clear how to merge them. I wanted to keep building out the API just to get a feel for what worked and what didn't.

However a lot has changed since then. I have occasionally checked back to see how things are going with fantoccini and recently I've been trying to think about how the projects could be merged. Yesterday I worked on a major refactor which might actually open a path forward. The projects are slightly more similar in terms of how they're wired up. They certainly "feel" similar to use so the differences are mainly deep in the implementation and design.

Where fantoccini implements a low level Future around hyper, thirtyfour uses off-the-shelf http clients and allows plugging in different http clients via a trait. This was so that it could easily support both tokio and async-std. I'm not sure how much value that provides in practice, if any.
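As a rough illustration of the "pluggable HTTP client via a trait" idea, here is a minimal sketch. The names `HttpClient`, `RecordingClient` and `Driver` are hypothetical and are not taken from either crate, and the trait is synchronous only to keep the example std-only (real clients would be async).

```rust
use std::cell::RefCell;

/// Hypothetical trait standing in for a pluggable HTTP client.
pub trait HttpClient {
    fn post(&self, url: &str, body: &str) -> Result<String, String>;
}

/// A fake client that records requests instead of sending them.
#[derive(Default)]
pub struct RecordingClient {
    pub sent: RefCell<Vec<(String, String)>>,
}

impl HttpClient for RecordingClient {
    fn post(&self, url: &str, body: &str) -> Result<String, String> {
        self.sent
            .borrow_mut()
            .push((url.to_string(), body.to_string()));
        Ok(r#"{"value":null}"#.to_string())
    }
}

/// The driver is generic over the client, so swapping the transport
/// does not touch any of the command-building code.
pub struct Driver<C: HttpClient> {
    base_url: String,
    client: C,
}

impl<C: HttpClient> Driver<C> {
    pub fn new(base_url: &str, client: C) -> Self {
        Driver {
            base_url: base_url.trim_end_matches('/').to_string(),
            client,
        }
    }

    /// Builds the W3C "Navigate To" request (POST /session/{id}/url).
    /// Note: the URL is not JSON-escaped in this sketch.
    pub fn goto(&self, session_id: &str, url: &str) -> Result<String, String> {
        let endpoint = format!("{}/session/{}/url", self.base_url, session_id);
        let body = format!(r#"{{"url":"{}"}}"#, url);
        self.client.post(&endpoint, &body)
    }

    pub fn client(&self) -> &C {
        &self.client
    }
}
```

The cost of this design is the extra type parameter on every type that touches the driver, which is part of what is being weighed in this thread.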

For the HTTP client specifically, it should be easy to implement hyper as an HTTP client for thirtyfour, but I'm not sure how much you "lose" from fantoccini in doing that. I'd be open to moving the other way as well, but would need to assess whether multiple HTTP clients could still be supported, or alternatively what the cost of dropping async-std support would be.

The interface for thirtyfour was patterned after the python selenium client. I actually prefer the shorter method names used in fantoccini so happy to align with those if that's easier. It'll be a breaking change either way on both sides so we can discuss each one as we go.

I chose to put all of the high level functionality into a SessionHandle struct and then share that everywhere, so that for example if you have a WebElement you can still "run a JS script" from its methods in order to provide high level features like scroll_into_view(). Having access to the full functionality of the underlying webdriver from anywhere that has a reference to the session is pretty powerful.

The actual WebDriver is now just a thin wrapper that provides access to the session methods via Deref. (Previously this functionality was exposed using async_trait but that made the docs horrendous to read).
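A minimal sketch of that layout, assuming simplified synchronous methods that return strings instead of sending requests. The struct names mirror the ones mentioned above, but the bodies are illustrative only and are not thirtyfour's actual code.

```rust
use std::ops::Deref;

/// Stand-in for the struct that owns all high-level session methods.
pub struct SessionHandle {
    session_id: String,
}

impl SessionHandle {
    pub fn session_id(&self) -> &str {
        &self.session_id
    }

    /// Pretends to run a script; a real implementation would send a request.
    pub fn execute(&self, script: &str) -> String {
        format!("[{}] {}", self.session_id, script)
    }
}

/// Thin wrapper: it adds no session methods itself and forwards via Deref.
pub struct WebDriver {
    handle: SessionHandle,
}

impl WebDriver {
    pub fn new(session_id: &str) -> Self {
        WebDriver {
            handle: SessionHandle {
                session_id: session_id.to_string(),
            },
        }
    }
}

impl Deref for WebDriver {
    type Target = SessionHandle;

    fn deref(&self) -> &SessionHandle {
        &self.handle
    }
}

/// An element keeps a reference to the session, so it can offer
/// script-based helpers such as scrolling itself into view.
pub struct WebElement<'a> {
    handle: &'a SessionHandle,
    element_id: String,
}

impl<'a> WebElement<'a> {
    pub fn new(handle: &'a SessionHandle, element_id: &str) -> Self {
        WebElement {
            handle,
            element_id: element_id.to_string(),
        }
    }

    pub fn scroll_into_view(&self) -> String {
        self.handle
            .execute(&format!("scrollIntoView({})", self.element_id))
    }
}
```

Passing `&driver` where a `&SessionHandle` is expected works through deref coercion, which is what keeps the wrapper thin.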

Another reason for passing around a reference to the session is that it then guarantees at compile time that you don't try to "use" the session after the browser is closed. I don't know if there are issues with this in practice and it does mean dealing with references and thus lifetimes, but so far it's ok.
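To make the compile-time guarantee concrete, here is a small sketch with hypothetical `Session` and `Element` types (not the real API of either crate). Closing the session consumes it, so the borrow checker rejects any element that is still in use afterwards.

```rust
pub struct Session {
    name: String,
}

impl Session {
    pub fn start(name: &str) -> Self {
        Session {
            name: name.to_string(),
        }
    }

    /// The returned element borrows the session.
    pub fn find(&self, selector: &str) -> Element<'_> {
        Element {
            session: self,
            selector: selector.to_string(),
        }
    }

    /// Takes `self` by value: no borrow of the session may outlive this call.
    pub fn quit(self) -> String {
        format!("{} closed", self.name)
    }
}

pub struct Element<'s> {
    session: &'s Session,
    selector: String,
}

impl Element<'_> {
    pub fn describe(&self) -> String {
        format!("{} in {}", self.selector, self.session.name)
    }
}

// This would NOT compile, because `elem` still borrows `session`
// when `quit` tries to move it (error E0505):
//
//     let session = Session::start("s1");
//     let elem = session.find("#login");
//     session.quit();
//     elem.describe();
```

The trade-off is that every handle now carries a lifetime parameter, which is the "dealing with references and thus lifetimes" cost mentioned above.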

I see that fantoccini also now has functionality to wait for various conditions and also to customize the waiting behaviour. There is probably room to combine the best of both feature sets there.

The problem is that the differences are subtle and run deep, so it really does require some experimentation. I'd be keen to open an experimental branch on either project and start playing around with ideas. The road might be a long one but I think it's worthwhile in the end. I'm a pragmatist at heart so as long as we end up with something that provides a great user experience then it's a win.

I'm sure neither of us has unlimited time to work on this either, but we have to start somewhere. And with enough community support from both sides hopefully this could move to a more official project with more than 1 maintainer. That would be cool.

@stevepryde You mentioned to me before that the interface was a port of the python bindings, but that doesn't extend to the query and wait interfaces does it? I have to say I've grown quite fond of the query builder, as it keeps locators simpler/reusable and feels natural to write, although I understand they currently require extra requests to the WebDriver server. I know there's the W3C WebDriver spec to stay in line with, but it's nice to see different APIs built on top of it, and I basically never type driver.find_element anymore. I bring this up because in my mind a shared interface which maps to either the thirtyfour or the fantoccini "base" APIs might be an easier place to start collaboration, and as a user it stands out to me as a difference as much as the choice of http client, although I don't work on exceptionally large testing suites.

Correct. I started out using the python bindings as a guide because I had some good experience with those and knew it was a good base to start from. The query interface was similar to something I had built previously on top of the python bindings although I had the chance to refine it a little more this time. I really like this extra layer of abstraction and find it really useful as well. Whether it needs to be in the same crate or a separate crate is up for discussion.

As for performance yes it does do extra requests but it's common to run tests with a webdriver bundled locally or at least on the same network so the requests should be fast.

That said, something that's on my wishlist is either an xpath "ORM"/builder or better yet a macro that can check xpath queries at compile time to ensure they are syntactically valid.
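For the builder half of that wishlist, a sketch might look like the following. `XPath` is a hypothetical type invented for illustration, it covers only a tiny subset of XPath 1.0, and it does no compile-time checking (that would need a proc macro).

```rust
/// Hypothetical builder that composes an XPath expression string.
pub struct XPath {
    expr: String,
}

impl XPath {
    /// Matches `tag` anywhere in the document: `//tag`.
    pub fn any(tag: &str) -> Self {
        XPath {
            expr: format!("//{}", tag),
        }
    }

    /// Adds an attribute predicate: `[@name='value']`.
    pub fn attr(mut self, name: &str, value: &str) -> Self {
        self.expr
            .push_str(&format!("[@{}={}]", name, quote(value)));
        self
    }

    /// Descends to a direct child: `/tag`.
    pub fn child(mut self, tag: &str) -> Self {
        self.expr.push('/');
        self.expr.push_str(tag);
        self
    }

    pub fn build(self) -> String {
        self.expr
    }
}

/// XPath 1.0 string literals have no escape sequences, so pick a quote
/// style that fits and fall back to concat() when both kinds appear.
fn quote(value: &str) -> String {
    if !value.contains('\'') {
        format!("'{}'", value)
    } else if !value.contains('"') {
        format!("\"{}\"", value)
    } else {
        let parts: Vec<String> = value
            .split('\'')
            .map(|part| format!("'{}'", part))
            .collect();
        format!("concat({})", parts.join(", \"'\", "))
    }
}
```

Because the builder only emits a string, its output could be handed to the XPath locator of either crate.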

I've had a deeper look into the code. Here's a summary:

Fantoccini pros:

  • Error handling seems much nicer in fantoccini. I like the way everything is grouped into sections.
  • Lots of links to the official W3C spec in the docs. Very nice.
  • Uses the webdriver crate (maintained by mozilla) types for commands etc.
  • Integration with the cookie-rs crate
  • Form handling
  • Access to the inner HTTP client (may be useful for testing download links etc)
  • Clean / clear API (actually the two crates are quite similar to use from an end-user perspective)

Features of thirtyfour that are not (yet) in fantoccini:

  • Support for Chrome Devtools Protocol (this also includes selenium 4.x support)
  • Action Chains
  • Element query / waiter interfaces (the wait() method provides limited support)
  • Browser extensions
  • Slightly more features for <Select> elements
  • Better support for sending key combos to elements (e.g. elem.send_keys(Keys::Control + "a").await?;)
  • Specific helpers for setting up browser capabilities, e.g. enabling headless, adding Chrome options, setting debugger address etc
  • Code examples in docs
  • Tests! thirtyfour has doctests for most methods and these also serve as example code. Granted this is difficult and the approach thirtyfour uses for tests definitely has its drawbacks, but it does work pretty well.
  • CI/CD? The link to Travis for fantoccini is currently broken. What about Github Actions?
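On the key-combo item in the list above: an expression like `Keys::Control + "a"` can be supported by implementing `Add` for the key type. The sketch below is a simplified, hypothetical version that produces a plain `String` (the real thirtyfour types may differ); the code points are the ones the W3C WebDriver spec assigns to these keys.

```rust
use std::ops::Add;

/// A few special keys, mapped to the W3C WebDriver key code points.
#[derive(Clone, Copy, Debug)]
pub enum Keys {
    Enter,
    Shift,
    Control,
}

impl Keys {
    pub fn code(self) -> char {
        match self {
            Keys::Enter => '\u{E007}',
            Keys::Shift => '\u{E008}',
            Keys::Control => '\u{E009}',
        }
    }
}

/// `Keys::Control + "a"` yields the modifier code point followed by the text.
impl Add<&str> for Keys {
    type Output = String;

    fn add(self, rhs: &str) -> String {
        let mut out = String::new();
        out.push(self.code());
        out.push_str(rhs);
        out
    }
}
```

The resulting string is what would be sent as the text of an "Element Send Keys" request.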

WebDriver commands not yet supported by fantoccini:

(apologies if any of these are actually supported, it's meant to read as a TODO list not a criticism)

  • Status
  • GetTimeouts
  • SetTimeouts
  • Forward
  • GetTitle
  • MaximizeWindow
  • MinimizeWindow
  • FullscreenWindow
  • IsElementSelected
  • IsElementDisplayed
  • GetElementCssValue
  • GetElementTagName
  • GetElementRect
  • IsElementEnabled
  • AddCookie
  • PerformActions
  • ReleaseActions
  • DismissAlert
  • AcceptAlert
  • GetAlertText
  • SendAlertText

All of the above (both commands and features) should be relatively straightforward to add. There are discussions to be had regarding interface decisions and where things should live (even whether they belong in fantoccini or in another crate), but I don't see any technical blockers here. I'm happy to do it but also fine if anyone else wants to port stuff across. The two projects use the same licenses anyway.

Features of thirtyfour that might be difficult to add to fantoccini:

  • Support for other HTTP clients. Need to assess whether this is actually an issue anyone cares about. It's worth noting this also implies support for other async runtimes. Fantoccini currently requires the tokio runtime.
  • All elements and other structs that use the session (other than the main client itself) contain a reference to the session, so that Rust will verify at compile time that there are no uses of the session after the browser is closed
  • Configuration stored in the client struct. The ElementQuery interface was originally added via a separate crate and I was able to easily extend WebDriver through the use of a trait. However it needed a nice way to be able to set configuration defaults on the session so that every call to WebDriver::query() could remember those defaults. It should be possible in fantoccini too but probably out-of-scope here.

Also note many of the missing features etc can be added piece by piece, which is much easier. No need for any big changes.

I notice @jonhoo's comment in the reddit thread,

As for merging, I think what we would end up doing is going with one design or the other, rather than trying to combine them, since as you say it's not clear how we'd really do that. One option short of that is to try to find a way to share a test suite. Abstract the tests over some common API that both crates can "provide". Might help uncover bugs that are in one but not the other, I'm not sure?

and @stevepryde's comment

That said, something that's on my wishlist is either an xpath "ORM"/builder or better yet a macro that can check xpath queries at compile time to ensure they are syntactically valid.

and can't help but think these tasks are suited to one another, as a builder that simply composed an xpath under the hood would be fairly straightforward to use in either crate. Certainly I think a macro for validating xpaths and/or css selectors at compile time would be useful to embed in the locator constructors for either crate, in fact I think I'll explore that a little and follow up if I get somewhere. But @jonhoo I'd be interested to hear your thoughts on what a common api would look like?

Thanks for driving this forward @bcpeinhardt, and thanks for the detailed thoughts @stevepryde! I had completely forgotten that Reddit thread 😅

I'll leave some thoughts in semi-random order.

HTTP client support: I don't think it's important for a WebDriver library to be generic over the mechanism it uses to interact with the WebDriver host; that should be an implementation detail the caller should not care about. Async Rust is in this slightly awkward spot at the moment where the choice of HTTP client also dictates the async runtime, which is definitely unfortunate, but I also haven't seen good solutions to this problem. So, my instinct for the time being is to stick with by far the most widely used HTTP client (hyper, which reqwest also uses under the hood), and not add the unnecessary extra complexity to the implementation and interface that being generic adds. My hope is that down the line it'll be easier to be runtime-agnostic, but right now I don't think we have a way to be that's worth the cost for a library like this one.

Handles by reference: While it's true that fantoccini doesn't have its handles (like element handles) take a reference to the original session, it does ensure that handles don't outlive the session simply by passing along the Client itself into the handle. Essentially, the session isn't closed until all handles go away (reference counting essentially). You can see this in the existence of Element::client for example. In fantoccini, Client is Clone, which I think is an ergonomic upside compared to having handles require a living reference. But I could go either way on that tbh, and am not married to that approach.
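The reference-counting behaviour described here can be sketched with `Rc` and `Drop`. This shows the ownership pattern only, with hypothetical internals; it is not how fantoccini is actually implemented, and the `closed` flag exists purely so the effect can be observed.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Shared session state; dropped only when the last handle goes away.
struct SessionInner {
    closed: Rc<Cell<bool>>,
}

impl Drop for SessionInner {
    fn drop(&mut self) {
        // Stand-in for "the session ends here".
        self.closed.set(true);
    }
}

/// Cheap to clone: every clone points at the same session.
#[derive(Clone)]
pub struct Client {
    inner: Rc<SessionInner>,
}

impl Client {
    pub fn new(closed: Rc<Cell<bool>>) -> Self {
        Client {
            inner: Rc::new(SessionInner { closed }),
        }
    }

    /// The element owns its own clone of the client, not a reference.
    pub fn find(&self, id: &str) -> Element {
        Element {
            client: self.clone(),
            id: id.to_string(),
        }
    }
}

pub struct Element {
    client: Client,
    id: String,
}

impl Element {
    /// Hands back a client for the session this element belongs to.
    pub fn client(&self) -> Client {
        self.client.clone()
    }

    pub fn id(&self) -> &str {
        &self.id
    }
}
```

With this shape an element has no lifetime parameter and can be stored freely, at the price of the session living as long as the longest-lived handle.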

Ergonomic accessors: fantoccini is intended to be a low-level binding to WebDriver. My thinking (which I think remains the same) is that something that provides, say, scroll_into_view or fancy query mechanisms beyond those of the standard, belongs in a library that wraps fantoccini, rather than in fantoccini itself. Now, those two libraries could be co-owned and co-released, but I think having that separation is worthwhile. I almost wonder if thirtyfour could switch to fantoccini "under the hood" or something like it?

Maintenance burden: I'm finding myself with precious little spare time these days, which has definitely impacted my ability to responsibly maintain this project. I try to do what I can to stay on top of things, but I definitely would love help in maintaining fantoccini/handing it over/merging it with another project. And I realize that with that I'd also be handing over some of the authority to decide what's "right" for the library, and I'm fine with that :) I'll do what I can to help along any effort to move to one project (possibly with two crates in it as outlined above), but may not be able to invest much actual implementation time.

Ok, so @jonhoo, just a couple of questions:

  1. Would you be ok with limiting fantoccini strictly to the W3C spec then?
  2. What about support for CDP (Chrome Devtools Protocol, which adds a lot of stuff now officially part of selenium 4)?

Repurposing thirtyfour as a batteries-included layer on top of fantoccini seems like a good path forward.
I'm ok with dropping support for other HTTP clients. I agree that in practice it really shouldn't matter.

Regarding the handles by reference thing, from what I can tell in fantoccini if you get a reference to an element, then close the browser, then try to access that element, Rust would be fine but the webdriver (geckodriver or selenium etc) would throw an error at runtime. The difference with thirtyfour is that Rust would catch this at compile time as an error.

What you might be referring to is the case where you look up an element, then your Client goes out of scope, and then you try to access that element. In fantoccini this works like you said because the "session" effectively lives until the last access of it. However this only works if you don't close the browser explicitly, and then the browser will remain open after your program exits.

In thirtyfour it is assumed that the browser session lifetime is the same as the WebDriver struct lifetime, and thus any access of the session after WebDriver goes out of scope is considered an error. This is intentional, but I can see how this might not be always desirable in cases where the browser is never closed (you can still opt not to close the browser, but the WebDriver struct must still stay alive until the end of the program). It would be just as easy for thirtyfour to just clone the session struct everywhere (it's just a channel sender), and then you'd have the same behaviour as fantoccini. I won't actually do this though, since thirtyfour will eventually just use fantoccini directly, as you suggested.

Btw neither library can have the browser close automatically on Drop, because there are no async destructors. This is unlikely to change anytime soon (if ever).

So my takeaway from here is that the path forward looks something like this:

  • Implement remaining W3C functionality in fantoccini (can probably port a lot of it from thirtyfour)
  • Refactor thirtyfour to use fantoccini under the hood, and provide higher-level methods on top

Sound like a plan?

  1. Yes 👍
  2. That's tricky. On the one hand, I feel like that should be a different crate altogether. But on the other, if it shares a lot of interface with WebDriver, maybe it makes sense to let fantoccini be an abstraction layer across both? I'm not familiar enough with CDP to say just how much would be shared between the two. If CDP is strictly more powerful than WebDriver, I think it might make most sense to have fantoccini support both, with CDP features only being provided for clients connected to a CDP session (enforced by the type system).
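One way "enforced by the type system" could look is a marker type parameter on the client. This is a sketch under that assumption; `WebDriverOnly`, `WithCdp` and the method names are made up for illustration and exist in neither crate.

```rust
use std::marker::PhantomData;

/// Marker types describing what the session supports.
pub struct WebDriverOnly;
pub struct WithCdp;

pub struct Client<P> {
    session_id: String,
    _protocol: PhantomData<P>,
}

impl Client<WebDriverOnly> {
    pub fn connect(session_id: &str) -> Self {
        Client {
            session_id: session_id.to_string(),
            _protocol: PhantomData,
        }
    }

    /// Hypothetical upgrade step, e.g. after checking capabilities.
    pub fn with_cdp(self) -> Client<WithCdp> {
        Client {
            session_id: self.session_id,
            _protocol: PhantomData,
        }
    }
}

/// Plain WebDriver commands are available on every client.
impl<P> Client<P> {
    pub fn title_command(&self) -> String {
        format!("GET /session/{}/title", self.session_id)
    }
}

/// CDP commands exist only on clients tagged `WithCdp`;
/// calling this on `Client<WebDriverOnly>` is a compile error.
impl Client<WithCdp> {
    pub fn cdp_command(&self, method: &str) -> String {
        format!("CDP {} on {}", method, self.session_id)
    }
}
```

The marker costs nothing at runtime, since `PhantomData` is zero-sized.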

Ah, sorry, yes, for the explicit close case, you're right. Whether that's a feature or a shortcoming I guess depends on the kind of foot-gun you prefer to avoid. I can definitely see the argument for enforcing correct operation through lifetimes though, and would be okay with moving fantoccini in that direction 👍 The original version of fantoccini predates async/await, where it was basically impossible to work with async interfaces that had lifetimes in them, so it's partially a leftover from those days. One consideration is that it's possible to build the lifetime-tracking interface on top of a reference-counted session, but not the other way around (I don't think). But maybe that's okay; maybe it's rare that anyone actually needs to be able to have a 'static element handle.

I like that plan!

I'm only vaguely familiar with CDP at this point but have used a few bits of it in the past. For example, CDP gives you access to the network requests and I believe you can even use it to proxy requests. It does many things though and is arguably bigger than the WebDriver spec. The good news is that it follows the same kind of protocol so you'd basically have a separate Command enum for CDP and use all of the same infrastructure for it. Some of it will require specialized interfaces for better ergonomics, but the low-level stuff is all the same as for WebDriver.
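The "separate Command enum, same infrastructure" idea could be sketched roughly as below. The trait and enum names are hypothetical, and the `goog/cdp/execute` path is my assumption about the vendor-specific endpoint chromedriver uses for CDP, so treat it as a placeholder rather than a confirmed detail.

```rust
/// What the shared request layer needs to know about any command.
pub trait Command {
    fn http_method(&self) -> &'static str;
    fn path(&self) -> String;
    fn body(&self) -> Option<String>;
}

/// A couple of W3C WebDriver commands.
pub enum WebDriverCommand {
    GetTitle,
    Forward,
}

impl Command for WebDriverCommand {
    fn http_method(&self) -> &'static str {
        match self {
            WebDriverCommand::GetTitle => "GET",
            WebDriverCommand::Forward => "POST",
        }
    }

    fn path(&self) -> String {
        match self {
            WebDriverCommand::GetTitle => "title".to_string(),
            WebDriverCommand::Forward => "forward".to_string(),
        }
    }

    fn body(&self) -> Option<String> {
        match self {
            WebDriverCommand::GetTitle => None,
            WebDriverCommand::Forward => Some("{}".to_string()),
        }
    }
}

/// A separate enum for CDP that reuses the same trait.
pub enum CdpCommand {
    Execute { method: String },
}

impl Command for CdpCommand {
    fn http_method(&self) -> &'static str {
        "POST"
    }

    fn path(&self) -> String {
        // Assumed vendor-specific endpoint; not part of the W3C spec.
        "goog/cdp/execute".to_string()
    }

    fn body(&self) -> Option<String> {
        match self {
            CdpCommand::Execute { method } => {
                Some(format!(r#"{{"cmd":"{}","params":{{}}}}"#, method))
            }
        }
    }
}

/// Shared infrastructure: renders any command the same way.
pub fn render(session_id: &str, cmd: &dyn Command) -> String {
    let line = format!(
        "{} /session/{}/{}",
        cmd.http_method(),
        session_id,
        cmd.path()
    );
    match cmd.body() {
        Some(body) => format!("{} {}", line, body),
        None => line,
    }
}
```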

I too wondered if it could go in a separate crate. However moving forward I'd guess that users are either going to be targeting selenium (which now includes CDP, at least in part) or chromedriver/geckodriver (both of which support CDP - I believe it's still experimental in geckodriver but officially supported).

As such users won't really care too much about whether they're using the original W3C commands or the CDP commands - they'll want both supported out-of-the-box. So I think it makes sense to add CDP support to fantoccini directly, and expand the scope to cover WebDriver + CDP.

Unfortunately CDP is not a W3C standard. It originated as a Chrome-specific thing (chromedriver was actually implemented on top of it). Puppeteer and Cypress use CDP directly. And now Selenium 4 uses it too, which means Firefox has had to implement it (I believe several Selenium devs are also Firefox devs).

It looks like the path forward after CDP will be called WebDriver BiDi: https://developer.chrome.com/blog/webdriver-bidi/
Just something to keep an eye on down the track. For now WebDriver + CDP is all there is.

Let's move discussion of xpath to a separate issue on the thirtyfour side

Just a quick update. Fyi only. No action needed.

After a fair bit of hackery I've got (a branch of) thirtyfour in a place where it uses fantoccini underneath, but with lots of functionality missing. It's not as bad as it sounds. Mainly it's just missing wiring for a bunch of the WebDriverCommands.

I've also got a branch on my fantoccini fork that adds all of the missing commands in the enum, but I haven't yet added public methods to use them. It's probably in a good spot for a PR though.

Everything currently compiles, but with lots of unimplemented!() scattered around. However, it's enough to run some tests and it can at least start a session, find elements, type into them, and click, etc. A good first step I think.

Next step will be to add the public methods to fantoccini in order to allow all of the commands to be called.

This transition will be completed in thirtyfour v0.29.0, which will be released shortly after fantoccini v0.19.0 is released.

We're at 0.x anyway, so I just released 0.19.0! 🎉

thirtyfour v0.29.0 has been released. This issue can be closed 😄

FYI thirtyfour v0.30.x will introduce major API changes (mainly method renames) to bring everything more in line with fantoccini, such that using either one should feel almost identical. The existing method names are still there but marked as deprecated. This is in line with my goal to make thirtyfour feel like "fantoccini plus extensions". This also goes some way to supporting a v1.0 for both projects in future, once these apis are deemed "stable". That gets complicated now that the WebDriver spec is being updated but we'll see how we go.

Amazing, thanks for the update!

Actually, @stevepryde, maybe it'd be a good idea to add a mention of thirtyfour to the fantoccini README?

That would be cool but it's up to you. I'd also like to add more of a description on the thirtyfour README for how the two crates interact. At some point I'd like to try implementing all the thirtyfour features as traits to extend fantoccini types. That would be pretty cool I think. I like the idea of using fantoccini then importing traits from thirtyfour for any extra bits someone wants to use.
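The extension-trait approach could look something like this sketch. `Element` stands in for a type owned by the low-level crate and `ElementExt` for a trait shipped by the higher-level one; both are simplified placeholders rather than the real types.

```rust
/// Imagine this type lives in the low-level crate.
pub struct Element {
    id: String,
}

impl Element {
    pub fn new(id: &str) -> Self {
        Element { id: id.to_string() }
    }

    /// Pretends to run a script against this element.
    pub fn execute(&self, script: &str) -> String {
        format!("{} on {}", script, self.id)
    }
}

/// Imagine this trait lives in the higher-level crate. Users opt in
/// to the extra methods simply by importing the trait.
pub trait ElementExt {
    fn scroll_into_view(&self) -> String;
}

impl ElementExt for Element {
    fn scroll_into_view(&self) -> String {
        self.execute("arguments[0].scrollIntoView()")
    }
}
```

Since the trait only relies on the element's public methods, the low-level crate needs no changes to support it.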