guigrpa / docx-templates

Template-based docx report creation

Performance issue: Freezing and long processing time when passing large amount of data

R-404 opened this issue · comments

commented

I am encountering a performance issue when generating a report with the createReport function in our React application. More specifically, when a large amount of data is passed to the function, the React app freezes for some time and takes an excessively long time to generate a Word file.

The data I need to pass to the template is an object whose length ranges from 100 to 2000 entries.

I've come across similar issues here, such as #153 and #81, and tried the noSandbox fix. It reduces the time a bit, but I have some additional logic inside the template file (for translation and similar tasks), and that logic stops working when noSandbox is set to true.

I have a bare-bones test example here.
I have also added a similar template to the public folder of the sandbox files, which you can check out as well.

Steps to test and Reproduce:

  • Click the Generate Report button.
  • The app freezes for some time and then creates a Word file.
  • Setting noSandbox to true throws an error, because the test template contains logic similar to that in the actual template I'm using.

Workaround:

  • Setting noSandbox to true works only if there is no logic inside the template.
  • Currently I've offloaded the process to a worker that runs in the background, but after the process starts it still takes some time for the app to become responsive again.

Additional Details:

React version : 17.0.2
docx-templates version: 4.9.2
Dataset size used for testing: object with length up to 2000
OS: ubuntu 22.04
Hardware specifications: i7 8th gen, 16gb RAM, 256 SSD

Please let me know if you have a possible fix or workaround I could try. If you require any additional information, just let me know. Thanks.

Hi, thanks for your extensively documented issue.

Note that each command is invoked using the below code. If you have noSandbox set to true, each command is executed by eval(), which is a lot faster than it being executed by vm.Script (make sure you understand the security implications, see README).

if (ctx.options.runJs) {
  const temp = ctx.options.runJs({ sandbox, ctx });
  context = temp.modifiedSandbox;
  result = await temp.result;
} else if (ctx.options.noSandbox) {
  context = sandbox;
  const wrapper = new Function('with(this) { return eval(__code__); }');
  result = await wrapper.call(context);
} else {
  const script = new vm.Script(
    `
    __result__ = eval(__code__);
    `,
    {}
  );
  context = vm.createContext(sandbox);
  script.runInContext(context);
  result = await context.__result__;
}
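
To get a rough feel for the difference between the two execution paths, the sketch below reproduces them in isolation under Node.js. It is not taken from the library: the helper names (runWithEval, runWithVm, makeSandbox, timeIt) and the sample command 'a * b' are made up for illustration, and the timings it prints will vary by machine and Node version.

```javascript
const vm = require('vm');

// Mirrors the noSandbox branch: plain eval with the data object as scope.
function runWithEval(sandbox) {
  const wrapper = new Function('with(this) { return eval(__code__); }');
  return wrapper.call(sandbox);
}

// Mirrors the default branch: compile a script and run it in a vm context.
function runWithVm(sandbox) {
  const script = new vm.Script('__result__ = eval(__code__);');
  const context = vm.createContext(sandbox);
  script.runInContext(context);
  return context.__result__;
}

// Hypothetical sandbox: two data values plus the command to evaluate.
function makeSandbox() {
  return { a: 2, b: 3, __code__: 'a * b' };
}

// Runs `fn` on a fresh sandbox `runs` times and prints the elapsed time.
function timeIt(label, fn, runs) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < runs; i++) fn(makeSandbox());
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}: ${ms.toFixed(1)} ms for ${runs} runs`);
}

timeIt('eval (noSandbox)', runWithEval, 500);
timeIt('vm.Script', runWithVm, 500);
```

Both helpers return the same value for the same command; only the cost of getting there differs.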

You noted that you have already tried noSandbox: true and it didn't help enough. That makes sense; if your dataset is very large, executing code from within the template is still significantly slower due to all kinds of overheads involved.
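
As an illustration of the kind of work that can be done in ordinary JavaScript before createReport is called, here is a minimal sketch that resolves translations up front, so the template only has to insert ready-made strings. The data shape (name, statusKey), the translations dictionary, and the prepareRows helper are all hypothetical; your actual template logic will differ.

```javascript
// Hypothetical helper: turns raw items into flat, already-translated rows.
// Unknown translation keys fall back to the key itself.
function prepareRows(items, translations, locale) {
  const dict = translations[locale] || {};
  return items.map((item, index) => ({
    position: index + 1,
    name: item.name,
    status: dict[item.statusKey] ?? item.statusKey,
  }));
}

// Made-up sample data for illustration only.
const items = [
  { name: 'Pump A', statusKey: 'status.ok' },
  { name: 'Pump B', statusKey: 'status.unknown' },
];
const translations = { de: { 'status.ok': 'In Ordnung' } };

const rows = prepareRows(items, translations, 'de');

// `rows` would then be handed to the library as plain data, for example
// createReport({ template, data: { rows } }), and the template would only
// loop over the rows and insert their fields.
console.log(rows);
```

With the lookups done beforehand, each row costs the template a few simple insertions instead of a function call evaluated per command.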

One thing to keep in mind is that JS is single-threaded and runs on an event loop: https://www.digitalocean.com/community/tutorials/node-js-architecture-single-threaded-event-loop
In practice, this means that running an expensive computation within the 'main loop' causes all other interactivity of the application to freeze until the computation is complete. This is a horrible user experience for web applications.
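
The blocking effect is easy to reproduce in a few lines. In this sketch (the busyWork function and the iteration count are arbitrary stand-ins for an expensive report generation), a timer scheduled with a 0 ms delay still cannot fire until the synchronous computation has returned:

```javascript
const log = [];

// Scheduled first, with no delay, but it has to wait for the event loop.
setTimeout(() => {
  log.push('timer fired');
  console.log(log);
}, 0);

// Stand-in for an expensive synchronous computation.
function busyWork(iterations) {
  let sum = 0;
  for (let i = 0; i < iterations; i++) sum += i % 7;
  return sum;
}

busyWork(5e6);
log.push('busy work finished');
// At this point the timer callback has not run yet: it only gets its turn
// once the current synchronous code has finished.
```

In a browser, click handlers and rendering are queued in the same way, which is why the page appears frozen while the computation runs.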

One thing you should definitely try is to run createReport in a web worker. You can also try to take as much logic as possible out of the template, as running code inside the template can be slow (especially loops).

commented

Hi @jjhbw, thank you for your time and quick response. I have implemented your suggestion by running the createReport function in a web worker. This has significantly improved the responsiveness of the application, allowing us to keep using it while the report is generated in the background.

I wanted to look into possibilities to improve the speed of the report generation process. Therefore, I have created this issue to discuss and investigate potential fixes/solutions. I will look a bit more into using noSandbox and its implications on security.

Once again, thank you for your assistance.

Good to hear that! If you want to spar some more on performance considerations, just @ me in this issue.