mkoehnke / WKZombie

WKZombie is a Swift framework for iOS/OS X that navigates websites and collects data without needing a user interface or an API, also known as a headless browser. It can be used to run automated tests, take snapshots, and manipulate websites using JavaScript.

Get all HTMLRows in HTMLTable on page

CaptainMack opened this issue

Hi,

I'm trying to convert my backend scraper to an on-device scraper using your library, but I'm finding the documentation hard to understand without examples. How would one use WKZombie to log in and get an HTML table, and how do you debug it? I can't seem to get document.title to print anything other than nil. I'm opening a URL that redirects me to a page where I can log in.

I have the following code for now:

func testScraper(_ url: URL, user: String, password: String) {
    open(url)
        >>> get(by: .id("username"))
        >>> setAttribute("value", value: user)
        >>> get(by: .id("password"))
        >>> setAttribute("value", value: password)
        >>> get(by: .name("f"))
        >>> submit(then: .wait(2.0))
        >>> getAll(by: .contains("class", "DataSelect"))
        === handleResult
}

But I'm just getting "error loading page: Not found" when printing the result, and I can't debug it without knowing whether the redirect has happened before it tries to get the table, and so on. Also, do you recommend using browser.* instead of the >>> operators? I'm having a very difficult time adjusting the example code to my own use case, as I keep getting errors like "operands here have types 'Action' and '([HTMLElement]) -> ()'".

All the best,
Christian
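
A note on the operator error quoted above: a likely cause is that the handler passed to === takes a non-optional [HTMLElement], while the operator expects a handler for an optional result (or a Result value). The sketch below does not use WKZombie at all. It is a toy model with hypothetical stand-ins (Action, loadPage, splitRows, demo) whose operator signatures are assumptions made only to reproduce the same kind of type rule:

```swift
// Toy stand-in for an asynchronous step; NOT WKZombie's real Action type.
struct Action<T> {
    let run: () -> T?
}

infix operator >>>: AdditionPrecedence

// Chains two steps: the second step is built from the first step's output.
func >>> <T, U>(lhs: Action<T>, rhs: @escaping (T) -> Action<U>) -> Action<U> {
    return Action<U>(run: { lhs.run().flatMap { rhs($0).run() } })
}

// Starts the chain. The handler must accept an OPTIONAL result (assumed signature).
func === <T>(action: Action<T>, completion: (T?) -> Void) {
    completion(action.run())
}

// Hypothetical steps: "load" a fake page, then split it into rows.
func loadPage(_ url: String) -> Action<String> {
    return Action(run: { "row1,row2" })
}

func splitRows() -> (String) -> Action<[String]> {
    return { page in Action(run: { page.split(separator: ",").map(String.init) }) }
}

func demo() -> [String] {
    var collected: [String] = []
    // Compiles because the parameter is [String]?, matching (T?) -> Void.
    func handleRows(_ rows: [String]?) { collected = rows ?? [] }
    // Declaring handleRows(_ rows: [String]) instead would be rejected by the
    // compiler with an operand type mismatch on ===.
    loadPage("https://example.com") >>> splitRows() === handleRows
    return collected
}

print(demo())
```

If the real handler is declared with a non-optional parameter, changing it to an optional (or to whatever result type the library's === expects) may be enough to make the chain compile.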

Hi Christian,

Thank you for trying out WKZombie. You're absolutely right. Better documentation and debugging are always on the TODO list, but you know how it is 😬. Will try to improve that soon.

Anyway. Do you have the Logger enabled? If so, you should see output like this:

..

.SCRIPT
getElementByXpath("//*[@id='username']").setAttribute("value", "..."); document.documentElement.outerHTML;
[]

.SCRIPT
getElementByXpath("//*[@id='password']").setAttribute("value", "..."); document.documentElement.outerHTML;
[]

.SCRIPT
document.form2.submit();
[..................................]

...

This should give you a hint of what action was executed before it failed. Another way is to call the dump() method, which prints the current content (e.g. HTML) to the console.

What other ways of debugging would you like to see in WKZombie? I am always open to suggestions and pull requests of course 🙃

Hope that helps,
Mathias

Thanks for your long and helpful reply @mkoehnke,

Logging and debugging have helped me quite a lot, but I'm stuck on how to proceed after a browser.execute() call, which returns a JavaScriptResult. My code is:

>>> browser.execute("document.getElementsByTagName('input')[0].click()")
>>> browser.getAll(by: .XPathQuery("/html"))

Also, is there any reason why webpages treat WKZombie as a JavaScript-disabled browser? I'm essentially trying to submit a form (which I seemingly can't do via browser.submit). The HTML is as follows:

<form method="post" action="https://wayf.wayf.dk/module.php/saml/sp/saml2-acs.php/wayf.wayf.dk">
    <!-- Need to add this element and call click method, because calling submit()
    on the form causes failed submission if the form has another element with name or id of submit. -->
    <input type="submit" style="display:none;"/>
    <input type="hidden" name="SAMLResponse" value="longassid"/>
    <noscript>
        &lt;input type="submit" value="Submit" /&gt;
    </noscript>
</form>

As noted in the comment, the first submit input will fail if there are two submit inputs in a form, which I guess is why browser.submit fails? The code block prints without encoding.

Apologies for all my questions, but I'm trying to wrap my head around this whole new way of doing it on the device ;-)

Run the inspect command after your execute. I had this same issue with an Angular form and model:

execute("loginPassword.value='password'")
execute("angular.element($('#loginPassword')).triggerHandler('input')")
execute("angular.element($('#submitButton')).triggerHandler('click')")
inspect

Hope this helps

Hi Christian,

I am so sorry about the late response. I have too much on my plate right now :-(

Did you find a solution for this issue?

Can you elaborate a bit more on why you think that webpages treat WKZombie as a JavaScript-disabled browser? Also, the form-snippet you've posted, is that the actual HTML source code or did you modify it to try working around that issue?

I would like to reproduce this issue on my side to hopefully come up with a solution.

Thanks!

Hope that helped. Will close this ticket for now due to inactivity. Feel free to reopen it if you're still seeing this issue.

The submit method wouldn't work on the above-mentioned form, as the function requires the form to have an id or name attribute.

HTMLForm.swift:69

internal func actionScript() -> String? {
    if let name = name {
        return "document.\(name).submit();"
    } else if let id = id {
        return "document.getElementById('\(id)').submit();"
    }
    return nil
}

I bumped into a similar issue. Maybe someone will find this info useful.
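
For anyone hitting the same limitation, here is a minimal sketch of one possible workaround. Since actionScript() as quoted above returns nil for a form with neither name nor id, the JavaScript can be built by hand and falls back to the form's position in document.forms. The helper submitScript and its formIndex parameter are hypothetical and not part of WKZombie. Calling HTMLFormElement.prototype.submit directly is a common way to sidestep the case where a form control named "submit" shadows the form's own submit() method:

```swift
/// Hypothetical helper (not part of WKZombie): builds JavaScript that submits a form.
/// Falls back to the form's index in document.forms when it has no name or id.
/// Note: name and id are interpolated as-is, so values containing quotes are not handled.
func submitScript(name: String?, id: String?, formIndex: Int = 0) -> String {
    let lookup: String
    if let name = name {
        lookup = "document.forms['\(name)']"
    } else if let id = id {
        lookup = "document.getElementById('\(id)')"
    } else {
        // Assumption: the caller knows which form on the page is the target.
        lookup = "document.forms[\(formIndex)]"
    }
    // Call the prototype's submit so an element named "submit" cannot shadow it.
    return "HTMLFormElement.prototype.submit.call(\(lookup));"
}

print(submitScript(name: nil, id: nil))
```

The resulting string could presumably be passed to execute in the same way as the JavaScript snippets earlier in this thread. I have not verified how the page from the original report reacts to it.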