betatim / notebook-as-pdf

Save Jupyter Notebooks as PDF

When rendering many notebooks some have a visible MathJax overlay

eiclu opened this issue · comments

Hi, when converting multiple Jupyter Notebooks to PDFs in a row, the first document to be converted always contains this in the bottom left corner, hiding some of the content:

[image: MathJax overlay in the bottom left corner of the page]

That is annoying. notebook-as-pdf currently waits for the network to be idle as a way to determine that "everything has loaded". But there were always going to be cases where something slips through.

I wonder if there is a way to check what we actually want to know, "everything has loaded and is rendered", instead of a proxy measure like "the network has been quiet for a while, so things must be ready, I guess". I don't have a really good idea, though. Do you? Maybe there is a way to run some JS to check whether all the MathJax equations have been rendered?

Maybe something like this could work.

We could fetch the page's HTML periodically and wait until it no longer changes.

But maybe only behind an extra flag, since this could significantly increase the execution time of each conversion.
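The polling idea above could be sketched roughly as follows. This is only an illustration, not what notebook-as-pdf does: `wait_for_stable_html` and its parameters (`interval`, `stable_checks`, `timeout`) are hypothetical names, and `page` is assumed to be a pyppeteer `Page` (or anything with an async `content()` method that returns the current HTML).

```python
import asyncio


async def wait_for_stable_html(page, interval=0.5, stable_checks=3, timeout=30.0):
    """Poll page.content() until it is unchanged for `stable_checks` polls in a row.

    Hypothetical helper: `page` only needs an async `content()` method,
    which pyppeteer's Page provides.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    previous = await page.content()
    unchanged = 0
    while unchanged < stable_checks:
        if loop.time() > deadline:
            raise TimeoutError("page HTML did not settle in time")
        await asyncio.sleep(interval)
        current = await page.content()
        if current == previous:
            unchanged += 1
        else:
            # The DOM changed again, so start counting from scratch.
            unchanged = 0
            previous = current
    return previous
```

Each conversion then takes at least `interval * stable_checks` seconds longer, which is the cost mentioned above and the reason to keep it optional.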

What do you think of using https://miyakogi.github.io/pyppeteer/reference.html#pyppeteer.page.Page.waitForFunction with a small JS function that returns true when the MathJax overlay is not visible?
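A minimal sketch of that suggestion, under some assumptions: MathJax 2.x is believed to show its status overlay in an element with the id `MathJax_Message` and to hide it with `display: none`, but that should be checked against the MathJax version the notebook page actually loads. The helper name `wait_for_overlay_hidden` is made up for illustration.

```python
import asyncio

# Assumption: the overlay is the MathJax 2.x message element with id
# "MathJax_Message", hidden via `display: none` once MathJax is done with it.
MATHJAX_OVERLAY_HIDDEN_JS = """() => {
    const el = document.getElementById('MathJax_Message');
    return el === null || window.getComputedStyle(el).display === 'none';
}"""


async def wait_for_overlay_hidden(page, timeout_ms=30000):
    """Block until the overlay is absent or hidden (hypothetical helper).

    `page` is expected to be a pyppeteer Page; waitForFunction re-evaluates
    the predicate in the browser until it returns a truthy value.
    """
    await page.waitForFunction(MATHJAX_OVERLAY_HIDDEN_JS, {"timeout": timeout_ms})
```

One caveat with this approach: if the check runs before MathJax has created the overlay at all, the predicate is already true, so it may need to be combined with the existing network-idle wait.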

From what I understand, you can force MathJax to render everything with MathJax.Hub.Queue(["Typeset",MathJax.Hub]);. It adds a task to MathJax's queue that renders all the math. Then you can add another task to the queue, which will act as your callback function. A simple snippet is below; essentially, if callback_func is invoked, rendering has finished.

// Runs only after every task queued before it has completed.
var callback_func = function () { /* ... */ };
MathJax.Hub.Queue(["Typeset", MathJax.Hub]);
MathJax.Hub.Queue(callback_func);

That does look like what we need!

How would we use this? In pyppeteer we have a way of setting up a JS function that polls (waitForFunction) to decide whether things are done or not. MathJax lets us add a callback that gets run when it is done. How do we connect the two? A global variable that gets set by the MathJax callback and checked by our pyppeteer poller?

I am not very good at JS and what patterns to use/not use, so ideas (or complete solutions) would be super welcome.

A global variable that gets set by the MathJax callback and checked by our pyppeteer poller

That's the simple/easy way to go about it. It's also the way suggested on the MathJax mailing list.

I'm also not a pro at JS but given it's such a small thing (that can be changed anytime) I'd go this route without worrying too much about the "right" way :)
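To make the global-variable approach concrete, here is a rough sketch of how the two pieces might be connected from Python. It assumes the MathJax 2.x `MathJax.Hub.Queue` API quoted earlier in the thread; the flag name `__mathjax_done` and the helper `wait_for_mathjax` are hypothetical, and `page` is assumed to be a pyppeteer `Page`.

```python
import asyncio

# Hypothetical name for the global flag shared by the callback and the poller.
DONE_FLAG = "__mathjax_done"

# Queue a full typeset, then queue a callback that flips the flag.
# MathJax runs queued tasks in order, so the callback fires after typesetting.
INSTALL_CALLBACK_JS = """() => {
    window.FLAG = false;
    if (typeof MathJax === 'undefined' || !MathJax.Hub) {
        // No MathJax 2.x on this page, so there is nothing to wait for.
        window.FLAG = true;
        return;
    }
    MathJax.Hub.Queue(["Typeset", MathJax.Hub]);
    MathJax.Hub.Queue(function () { window.FLAG = true; });
}""".replace("FLAG", DONE_FLAG)


async def wait_for_mathjax(page, timeout_ms=30000):
    """Install the MathJax callback, then poll until it has set the flag."""
    await page.evaluate(INSTALL_CALLBACK_JS)
    await page.waitForFunction(
        f"() => window.{DONE_FLAG} === true", {"timeout": timeout_ms}
    )
```

A call such as `await wait_for_mathjax(page)` would presumably go after the existing network-idle wait and before the PDF is generated.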