AngleSharp / AngleSharp.Js

:angel: Extends AngleSharp with a .NET-based JavaScript engine.

Home Page: https://anglesharp.github.io

Implicit wait for content to load

s0thl opened this issue · comments

I have just discovered AngleSharp and I'm trying to learn about it.

I want to use it for testing my website, where content is dynamically generated.

Currently during loading, AngleSharp loads a document with the load spinner and not the actual loaded, ready document.

How can I implicitly wait, during the OpenAsync task, for an element in the document to be loaded and visible?

AngleSharp is a pure engine and does not contain extra stuff such as a JavaScript (JS) engine, which would be required for your case.

There is an experimental JS engine, but I guess it will not work unless the used JS is very simple.

Please in the future provide at least some background info such as what configuration you are using (e.g., show the code you apply).

Hope that helps!

I understand that, and I am in fact using the AngleSharp.Js library. I just need further instructions on how to solve the issue, or at least how to check whether it will work for me. All of the Google results lead to your replies saying "it will probably not work", without any context or details on where to start in order to give it an actual try, or to contribute if some code needs to be improved.

This is my actual code:

var config = Configuration.Default
    .WithDefaultLoader(new LoaderOptions
    {
        IsResourceLoadingEnabled = true,
        IsNavigationDisabled = false
    })
    .WithJs()
    .WithCss()
    .WithRenderDevice()
    .WithCookies();

var context = BrowsingContext.New(config);
var document = await context.OpenAsync("https://mywebsite.com").WhenStable();

I can use my website backend to expose some JS variables like window.__isContentLoaded boolean, then, in AngleSharp I would want to spin until such variable is set and true. This should be simple enough I guess.

Could you elaborate on this topic?

It will not work because the JS will most likely crash. There can be various reasons for that:

  • Jint does not support some part of the used JS (note: Jint is fully ES5 spec compliant, but even ES5 spec compliant browsers implement some extra stuff that goes beyond that spec and is used / useful for some JS magic)
  • Some API is not implemented in AngleSharp (mostly some rendering API; we mocked a few of these in AngleSharp.Js already, but there is no guarantee that the mocked set of APIs is the desired one as well as how the mocked API is used)
  • Some behavior is different from AngleSharp to real browsers / JS engines (e.g., how does an event loop work and how do events work; a subtle difference could lead to big problems - we've seen that in v0.12 and earlier where the event loop was presumably correct, however, the initial JS was executed outside the event loop ... leading to potential race conditions in the JS execution)

Due to Jint's nature the exact source is not easy to debug. Usually, you need a fairly complex JS to trigger the issue. Then you need to boil it down to the least code required to still see the issue. And as a final step the code in AngleSharp / AngleSharp.Js needs to be adjusted (without breaking existing tests) to solve this particular issue. At the moment this is unfortunately quite tedious ...

You can - of course - give it a try (without going down the rabbit hole as described). The __isContentLoaded can work, but you could also spin until a certain element is there in the DOM (the latter would be more general and be equivalent to frameworks such as Selenium - they do the same in their "element visible" kind of APIs).

Since you use WhenStable I assume you are on the preview version of AngleSharp (and AngleSharp.Js respectively)?

Thank you for the detailed reply.

> you could also spin until a certain element is there in the DOM

Can you please show an example of how such spin should be implemented? Been reading docs, but couldn't find appropriate methods for it.

> Since you use WhenStable I assume you are on the preview version of AngleSharp (and AngleSharp.Js respectively)?

Not sure, since I installed all of the packages using nuget (there were two packages implementing JavaScript engine and I used AngleSharp.Js instead of AngleSharp.Scripting.JavaScript).

@FlorianRappl

I'm out of ideas - documentation doesn't include anything in this topic.

This one spins infinitely, as if the document is never updated:

var document = await context.OpenAsync("https://mywebsite.com").WhenStable();

await document.WaitForReadyAsync();

while (document.QuerySelector("form") == null)
{
    await Task.Delay(1000);
}

As I wrote, the document is most likely never updated because the JS crashes. There is a chance you could find a log entry about a JavaScriptException in the debug log. Again, the documentation mentions that AngleSharp.Js is not ready yet for such tasks. It could work, but my feeling here is that it does not (also, I haven't received any detailed info on the script, e.g., is it based on some framework? which libraries does it use? how was it produced, e.g., ES target? ...).

This spinning here is not really AngleSharp specific (after all that's just a simple polling mechanism) thus is not in the documentation. I guess we could add a helper method (WaitUntilAvailable) and mention it in the docs.

Since you read the docs I assume you also know why there are AngleSharp.Scripting.Js and AngleSharp.Js (https://github.com/AngleSharp/AngleSharp/blob/master/doc/Migration.md#scripting).

Thus this is an infinite loop (and you should set a max. time - this is what frameworks like Selenium do; e.g., 10 seconds - you can easily achieve this with a CancellationToken which can be automatically fired after some time).
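
The polling with a maximum wait time described above could be sketched roughly as follows. This is only a minimal illustration, not an AngleSharp API: the class name Polling and the method WaitUntilAsync are hypothetical, and the helper only relies on standard .NET types (CancellationTokenSource, Task.Delay). It cannot help if the page's JS crashes, since in that case the condition never becomes true and the helper simply returns false after the timeout.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of AngleSharp or AngleSharp.Js.
public static class Polling
{
    // Polls `condition` until it returns true or `timeout` elapses.
    // Returns true if the condition was met, false if the timeout fired first.
    public static async Task<bool> WaitUntilAsync(
        Func<bool> condition,
        TimeSpan timeout,
        TimeSpan pollInterval)
    {
        // The token source cancels itself automatically after `timeout`.
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                while (!condition())
                {
                    // Task.Delay throws an OperationCanceledException
                    // (TaskCanceledException) once the token is cancelled.
                    await Task.Delay(pollInterval, cts.Token);
                }

                return true;
            }
            catch (OperationCanceledException)
            {
                return false;
            }
        }
    }
}
```

With the document from the snippet earlier in this thread, a call might look like `await Polling.WaitUntilAsync(() => document.QuerySelector("form") != null, TimeSpan.FromSeconds(10), TimeSpan.FromMilliseconds(250))`. The 10 second limit and the 250 ms interval are arbitrary example values.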

HTH!

Hi,

> I haven't received any detailed infos on the script, e.g., is it based on some framework? which libraries does it use? how was it produced, e.g., ES target? ...

Those are mostly ES6 scripts bundled with webpack. Nothing fancy.

> Since you read the docs I assume you also know why there are AngleSharp.Scripting.Js and AngleSharp.Js

Well, since AngleSharp.Scripting.Js is not compatible with v0.10 and I'm using the latest available NuGet version of AngleSharp (that is, 0.12.1), I guess using AngleSharp.Js is the way to go?

> Thus this is an infinite loop (and you should set a max. time - this is what frameworks like Selenium do; e.g., 10 seconds - you can easily achieve this with a CancellationToken which can be automatically fired after some time).

The infinite loop was just for testing purposes. If I were able to make it work, I would script that properly and perhaps create a WaitUntilAvailable method.

In that case, if I have done everything properly and am still unable to make it work, and you don't have any other suggestions, then maybe AngleSharp isn't the right tool for me at this time. That's unfortunate, because I really hoped it would be a brilliant lightweight solution for automated testing. I still think I might dig deeper into the JS package to understand the issue and help improve it. I will do that in my spare time. 👍

Thank you for your engagement!

Yes, I think this is the right way forward.

We have that use case on our roadmap, but unfortunately our resources are limited and the undertaking is massive (to say the least...).

I know you wanted to avoid a larger / more bloated solution, but at this point in time our recommendation for such a use case is definitely browser automation. Note: This does not mean you need to use Selenium; any tool based on the WebDriver spec (https://w3c.github.io/webdriver/) would do.

Any contribution to AngleSharp.Js (e.g., just providing a case that does not work with a MWE to reproduce and fix it) would be much appreciated 🍻!

Reopen for the WaitUntil... helper methods and additional documentation.

Also for potential enhancements / bug reports on that / similar topics.

Landed in devel.