AngleSharp / AngleSharp.Js

😇 Extends AngleSharp with a .NET-based JavaScript engine.

Home Page: https://anglesharp.github.io

Memory leak in Jint caused by DomFunctionInstance

ArcanoxDragon opened this issue

I'm currently writing a web automation library and a large set of NUnit unit tests to go with it. Each test loads several pages and runs several scripts inside a BrowsingContext, which gets released as soon as the test finishes (all paths from roots to the BrowsingContext are nullified). Despite this, the memory usage of the ReSharper test runner process grows astronomically as the automation tests are processed. Upon taking a heap snapshot, I can see that the majority of the memory is used by AngleSharp.Scripting.JavaScript.DomFunctionInstance. Below is a screenshot showing the reference counts for this class.

[Screenshot: Dictionary<String, Jint.Runtime.Descriptors.PropertyDescriptor> references]

Every time I load a new IDocument into the BrowsingContext, I call .Dispose() on the currently loaded one to ensure it properly releases any unmanaged resources, and as soon as the new document is loaded, all references to the old document should be collected by the GC. However, even if I manually call GC.Collect(); at the end of each test, the process memory grows by about 200-400 MB for every test that runs. By the time I'm halfway through my test session, the process is using 15 or so of the 16 GB of RAM I have, and the machine becomes unresponsive.

I know that the Jint Engine instance for each IWindow is stored in a weak reference table, so when I dispose of my browsing context (by releasing all references to my class that stores the only instance of it), the BrowsingContext's IWindow should become stale and be collected by the GC, thus also releasing the Engine instance. For some reason, however, it doesn't appear that this frees up the DomFunctionInstance instances.
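To make the expectation above concrete, here is a minimal sketch of a weak table keyed on a window, assuming the table behaves like .NET's `ConditionalWeakTable<TKey, TValue>`. `StubWindow`, `StubEngine`, and `EngineCache` are hypothetical stand-ins, not AngleSharp or Jint types.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-ins; the real Window and Engine classes are far larger.
public sealed class StubWindow { }

public sealed class StubEngine
{
    // The value points back at its key. ConditionalWeakTable tolerates this:
    // a value that references its key does not keep the key alive.
    public StubWindow Owner { get; }

    public StubEngine(StubWindow owner) { Owner = owner; }
}

public static class EngineCache
{
    private static readonly ConditionalWeakTable<StubWindow, StubEngine> Engines =
        new ConditionalWeakTable<StubWindow, StubEngine>();

    // Returns the engine for this window, creating it on first use.
    public static StubEngine GetOrCreate(StubWindow window)
    {
        return Engines.GetValue(window, w => new StubEngine(w));
    }
}
```

The table itself never keeps a window alive, so an entry can only be collected once nothing outside the table references the window. Any other strong reference to the window (a static field, a pending timer callback, an event handler) keeps both the window and its engine in memory.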

The top of the Paths to Root for Jint.Runtime.Descriptors.PropertyDescriptor is Jint.Native.JsValue, leading me to believe there are references stored somewhere that can still be accessed once the Engine instance is disposed of, or that the Engine instance is somehow not being disposed of when the BrowsingContext goes out of scope.

Yeah that caching is hitting us hard here I guess.

I think we need a good, well-thought-out proposal to improve the situation here (i.e., keep the performance and avoid memory leaks).

I think this is a great idea @georgiosd!

I don't have any experience with it but if you add anything, please comment here and I'll have a play too :)

I've been looking into this some more with the VS2017 memory diagnostic tools. I can't get a ton of useful information out of them, but it seems there are a lot of multi-circular/spaghetti references between DomEventInstance, DomFunctionInstance, DomPrototypeInstance, and EngineInstance. Even after I've completely disposed of all my IDocuments, BrowsingContexts, etc., there is still a ton of memory consumed by the process. If I run 20-30 of my unit tests, the process memory is up to 12+ GB and my workstation starts choking from all the swapping going on. Each unit test opens up a BrowsingContext, runs some workflows which load anywhere between 5 and 15 pages (by following links, clicking buttons, etc.), then disposes the BrowsingContext. As each unit test is a top-level function entry point, the whole chain of BrowsingContexts and IDocuments created by each unit test goes out of scope as soon as the unit test finishes, but the GC doesn't seem to be able to identify that certain instances are truly out of scope, which is probably why the memory climbs half a gig for each test. Right now, it looks like the only way I can solve the memory leak problem is by launching a new AppDomain from each unit test so that I can unload it and release the memory completely when the test finishes, but I would really rather not do that. That solution won't solve the problem in production, either.

I don't know what kind of tests you have, but 12 GB after just 20-30 of them does not sound right at all. The tests in this lib do not come remotely close to this number and each one uses jQuery (plus of course the whole DOM which is already quite heavyweight).

There seems to be something holding circular references to Window, which would explain the high memory usage. The weak reference table used to hold Engine instances is keyed on Window instances. Here's a screenshot of a memory dump after running my workflow tests. Keep in mind my workflows involve navigating through a lot of pages (dozens), and it seems each page gets stored in memory instead of being released once it is unloaded. I don't hold any permanent references to IDocument or IWindow in my code; just a BrowsingContext.

[Screenshot: memory dump after running the workflow tests]

For some more information, the memory leaks only seem to occur when I have resource loading enabled (automatic downloading and executing of <script> tags). If I disable resource loading for script files, it appears the memory is properly cleaned up and doesn't climb exponentially as my workflows run. However, this means my workflows don't work properly as the pages obviously require their external scripts in order to work correctly.

I see - thanks for the info!

Okay...I've been doing some more digging yesterday and today. Everything I've found boils down to ScriptRequestProcessor, HtmlScriptElement, and TaskEventLoop. This coincides with my observation that the leak only happens when external script resources are being loaded.

I'm wondering if there's a closure leak here...I found one in my own code earlier where I was capturing an IDocument in a JavaScript handler (an Action passed to Jint that captured an IDocument). I'm still trying to find out if this leak is caused by something I'm doing while calling AngleSharp or Jint, or if it's definitely a bug in AngleSharp.Scripting, but there is definitely an inexplicably high number of ScriptRequestProcessor instances left over after my workflows finish.
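For readers unfamiliar with this kind of closure leak, here is a minimal sketch of the pattern and one possible way around it. `StubDocument` and `Handlers` are hypothetical names, and no Jint or AngleSharp API is used; the delegate simply stands in for a handler handed to a script engine.

```csharp
using System;

// Hypothetical stand-in for an IDocument implementation.
public sealed class StubDocument
{
    public string Title { get; set; } = "untitled";
}

public static class Handlers
{
    // The lambda captures 'document', so the returned delegate holds a strong
    // reference to it for as long as anything holds the delegate.
    public static Func<string> Strong(StubDocument document)
    {
        return () => document.Title;
    }

    // Only the WeakReference is captured, so holding the delegate does not by
    // itself keep the document alive. Returns null once the document is gone.
    public static Func<string> Weak(StubDocument document)
    {
        var weak = new WeakReference<StubDocument>(document);
        return () => weak.TryGetTarget(out var target) ? target.Title : null;
    }
}
```

If a long-lived object such as a script engine stores the delegate returned by `Strong`, the document stays reachable through it. The `Weak` variant trades that for a null check at call time.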

I don't think this is a bug in Jint. I think we can definitely optimize AS.Scripting, but I would not rule out a leak from some closure in AS core.

Big discovery incoming!

I think I have solved the biggest culprit of this memory leak. It seems the websites I am scripting are setting copious amounts of timeouts with setTimeout or setInterval with excessively long timeout values, which is not something I am able to avoid. I don't know which scripts are doing this so I can't pick-and-choose which to run.

Due to these long-running timeouts, all of the long Task.Delay calls on Window.cs:642 are causing literally millions of Task.DelayPromise instances to hang around, which in turn hold references to Action<Window> which of course hold references to Window in their closure. This causes the Window and all of its Jint values to stay in memory until these DelayPromises are resolved.

Thankfully, the solution is relatively simple:

  1. First I made AngleSharp.Dom.IWindow extend IDisposable.
  2. Next, I made Window implement the inherited Dispose method from IDisposable, and inside it, I simply cancel all tasks associated with the Window's Document:
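```csharp
#region Disposal

public void Dispose()
{
    // The CancellationTokenSources attached to the document back the pending timeouts.
    var timeoutSources = _document.GetAttachedReferences<CancellationTokenSource>();

    // Cancelling them lets the waiting Task.Delay calls finish right away.
    foreach ( var source in timeoutSources )
        source.Cancel();
}

#endregion
```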

This causes any waiting Task.DelayPromise instances to resolve into the cancelled state, which falls into the continuation in Window.DoTimeoutAsync and eventually releases the Task.Delay reference chain, meaning the Window is no longer referencing itself due to unexpired timeouts.
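The mechanism described above can be sketched in isolation as follows. This is only an illustration under stated assumptions: `TimerWindow` is a hypothetical stand-in, not AngleSharp's `Window`, and the way it tracks its `CancellationTokenSource` instances is invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a window with timers; not AngleSharp's Window class.
public sealed class TimerWindow : IDisposable
{
    private readonly List<CancellationTokenSource> _timeouts =
        new List<CancellationTokenSource>();

    // Number of callbacks that actually ran.
    public int Fired { get; private set; }

    // Roughly what a setTimeout does: wait, then call back into the window.
    public Task SetTimeout(Action<TimerWindow> callback, TimeSpan delay)
    {
        var cts = new CancellationTokenSource();
        _timeouts.Add(cts);
        return RunAsync(callback, delay, cts.Token);
    }

    private async Task RunAsync(Action<TimerWindow> callback, TimeSpan delay, CancellationToken token)
    {
        try
        {
            // While this delay is pending, its continuation references 'this'.
            await Task.Delay(delay, token);
            callback(this);
            Fired++;
        }
        catch (OperationCanceledException)
        {
            // Cancelled: the delay ended early and the callback is skipped.
        }
    }

    // Cancel every pending timeout so none of them outlives the window.
    public void Dispose()
    {
        foreach (var cts in _timeouts)
        {
            cts.Cancel();
            cts.Dispose();
        }
        _timeouts.Clear();
    }
}
```

With this sketch, calling `window.SetTimeout(w => { }, TimeSpan.FromHours(1))` and then `window.Dispose()` completes the returned task promptly without running the callback, instead of leaving the delay pending for an hour.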

Now, in Document's implementation of Dispose, I simply dispose of the Document's view if it exists:

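```csharp
public void Dispose()
{
    //Important to fix #45
    ReplaceAll(null, true);
    _loop.CancelAll();
    _loadingScripts.Clear();
    _source.Dispose();
    // New: dispose the Window (the document's view), which cancels its pending timeouts.
    _view?.Dispose();
    _view = null;
}
```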

(the last line requires removing the readonly keyword from Document._view...it may not be necessary to forcefully nullify the reference to the Window but I am doing that in my case to be safe).

After making these modifications, my application's memory is properly collected after I dispose the BrowsingContext as all references to Document are released, which propagates down now that Window is no longer dominating itself.

My AngleSharp workspace is slightly off-branch from devel because I changed the xproj to a .NET Core csproj to work with VS2017, so I'm not sure if a pull request from my fork would merge 100% correctly, but hopefully there's enough information here to implement a fix if this is indeed a valid solution.

That is indeed super useful information.

As far as the new csproj is concerned: I think we actually want to migrate from xproj + project.json to the new csproj (i.e., go back to csproj with the new abilities there). So if you think that would be a blocker - it wouldn't be (actually it would be another helpful thing).

Much appreciated, thanks for the insights 👍 🏅!