clj-commons / etaoin

Pure Clojure Webdriver protocol implementation

Home Page:https://cljdoc.org/d/etaoin

Sane/Easy way to get HTTP-Status code for navigation/go

rawoke083 opened this issue · comments

Hi all,

Is there an "easy" or recommended way to get the HTTP status code when navigating to a URL with the go command?

The only reference I found was a long Stack Overflow post from 11 years ago.

I.e., how do I know when (e/go driver url) returned a 500 or a 404?

Hi @rawoke083, thanks for raising this issue.

Etaoin is a thin wrapper atop WebDriver.
It looks like the WebDriver team has decided not to expose HTTP status codes.

The recommendation is to employ a proxy.

But maybe a simpler approach, if the site you are hitting includes enough distinguishing info, is to inspect the resulting page.

Yeah, this reflects Selenium's advice on the topic.

Please let me know if this answers your question.

Hmm, OK. What is the recommended way to do error handling, then?
Say a page returns a 500, or the domain name is not valid?

The way I understand it:

  • the designers of WebDriver felt that it should see/expose what the regular user of a browser sees (a user who might open view source but doesn't open dev tools).
  • they also did not want to deal with a hodge-podge of technical hacks to support fetching the underlying HTTP status from different browsers

Looking at the issue I linked above, lots of folks disagreed with this decision, but it is what it is. There is no point in debating that decision here.

Looking at page content

I think the idea is if you are navigating to page X, you are probably expecting some specific content. If you don't find that content your script/flow will fail.

But, if you need/want to detect an error page, for example, a 500 error, the resulting page will likely have some content that distinctly indicates such to the user. It might even have some hidden markers. You would parse the page and look for the distinguishing data.
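A minimal sketch of that idea, assuming a hypothetical site whose 500 page shows the text "Internal Server Error" and whose normal page contains an element with id main-content. Both markers and the go-checked helper are made up for illustration; you would substitute whatever your site actually renders.

```clojure
(require '[etaoin.api :as e])

(defn go-checked
  "Navigate to url, then classify the resulting page by its content.
  The marker text and the element id are hypothetical; adjust for your site."
  [driver url]
  (e/go driver url)
  (cond
    ;; text we assume the site's 500 page displays
    (e/has-text? driver "Internal Server Error") :server-error
    ;; element we assume exists only on the page we expected
    (e/exists? driver {:id "main-content"})      :ok
    :else                                        :unexpected))
```

A caller could then branch on the returned keyword, for example throwing when the result is anything other than :ok.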

Your example of a domain name being invalid is interesting.
In this case, the web browser itself will present some content.
For example, today, for this:

(e/go driver "https://foobar.clojure.org")

Chrome will present an error page like this:

[image: screenshot of Chrome's error page]

If you wanted to detect that you'd ended up on this error page you might:

(e/has-text? driver "DNS_PROBE_FINISHED_NXDOMAIN")
;; => true

An issue is that each web browser will present a different error page. Here's today's equivalent for Firefox:

[image: screenshot of Firefox's error page]

And I suppose these browser-presented pages could change over time with new browser releases.
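Because the marker text differs per browser and may change, one option is to keep the markers in data rather than scattering them through your scripts. This is only a sketch: the browser-error-page? helper is hypothetical, and the only marker listed is the Chrome string shown above. Entries for other browsers would need to be added after checking what they currently display.

```clojure
(require '[etaoin.api :as e])

(def dns-error-markers
  ;; Only the Chrome string comes from this thread.
  ;; Add strings for other browsers after verifying them yourself.
  ["DNS_PROBE_FINISHED_NXDOMAIN"])

(defn browser-error-page?
  "True when the current page contains any of the given marker strings."
  [driver markers]
  (boolean (some #(e/has-text? driver %) markers)))

;; Usage, assuming `driver` is an open session:
;; (e/go driver "https://foobar.clojure.org")
;; (browser-error-page? driver dns-error-markers)
```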

Looking at web traffic

The Selenium folks suggest using a proxy if you want to snoop into HTTP traffic.
I've not explored this yet, but if there is enough interest I could take a stab at it and maybe update the user guide.

Another avenue to explore would be looking at Etaoin's support for Chrome DevTools to see if it provides the info you might be looking for.
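As a rough, untested sketch of the DevTools avenue: to my understanding, Etaoin's etaoin.dev namespace can collect Chrome's network log when the driver is started with a :dev option, and dev/get-requests returns one map per request, including a :url and a :response map. The exact shape of those maps, and whether :status is populated for your browser version, are assumptions to verify against the user guide. The main-status helper is hypothetical.

```clojure
(require '[etaoin.api :as e]
         '[etaoin.dev :as dev])

(defn main-status
  "Navigate to url and return the :status recorded for the first logged
  request whose :url equals url, or nil if none is found.
  Assumes a Chrome driver started with the :dev option."
  [driver url]
  (e/go driver url)
  (->> (dev/get-requests driver)
       (filter #(= url (:url %)))
       first
       :response
       :status))

;; Usage (Chrome only; :dev {} enables log collection with default settings):
;; (def driver (e/chrome {:dev {}}))
;; (try
;;   (main-status driver "https://clojure.org/")
;;   (finally (e/quit driver)))
```

Matching on an exact URL is deliberately naive. Redirects or URL normalization by the browser would make the logged :url differ from the one passed in, so a real check would likely need looser matching.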

It's been a while and I see no interest.
Dear future reader: please feel free to re-open if you feel the need.