NiklasGollenstede / epub-creator

Firefox add-on that creates .epub books from the about:reader and overdrive books

Error exporting forum posts to epub

wolftune opened this issue

I found that Firefox reader mode works with posts in Discourse, for example https://meta.discourse.org/t/discourse-moderation-guide/63116

However, trying to export an epub of that gives: "TypeError: parsed is null"

Ugh, yes.

Firefox no longer lets extensions access the content of the reader mode pages, so this extension has to parse the page again.
The reader mode parser is available as an open-source component, so that is generally not too much of a problem.

I first thought that the bundled parser was outdated, but I think the actual problem here is that Discourse loads its content separately from the page. This extension doesn't support that, so it gets pretty much a blank page, which the reader software can't parse.

For now I have added a proper error message.
I have two ideas for how to solve this issue, but implementing them will take some time.
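
For readers wondering where "TypeError: parsed is null" comes from: Readability.js returns `null` from `parse()` when it cannot extract an article, and using that result without a check produces exactly this kind of error. Below is a minimal sketch of such a guard, not the extension's actual code. The names `ParsedArticle`, `ParseFn` and `parseOrExplain` are made up for illustration, and the parser is passed in as a plain function so the example runs without a DOM.

```typescript
// Hypothetical, simplified shape of a parser result (illustration only).
interface ParsedArticle {
  title: string;
  content: string;
}

// Stand-in for a call such as `new Readability(doc).parse()`,
// which yields null when no article could be extracted.
type ParseFn = () => ParsedArticle | null;

// Run the parser and turn a null result into a readable error
// instead of letting later code fail on `parsed.title`.
function parseOrExplain(parse: ParseFn): ParsedArticle {
  const parsed = parse();
  if (parsed === null) {
    throw new Error(
      "Unable to parse this article. " +
        "This can happen for pages that load their content dynamically."
    );
  }
  return parsed;
}
```

A caller would wrap `parseOrExplain(...)` in a `try`/`catch` and show the message to the user rather than a raw `TypeError`.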

OK, this went fairly quickly. I've created a new dev version.
Since I don't really use the feature myself, could you please test it over the next few days and tell me if everything works as expected, so I can upload it on AMO?
The extension should tell you what to do when it fails to generate the book.

Hey that worked! I got the error in reader mode:

The version of the reader mode included with ePub Creator - DEV was unable to parse this article.
This can happen for pages that load their content dynamically.
Please close the reader mode and try again! 

And when I tried not in reader mode, it succeeded.


Still, there are some glitches inherited from reader mode. The following issues are present in reader mode itself within Firefox, so they are not caused by the plugin.

  • When I first exported the example above (https://meta.discourse.org/t/discourse-moderation-guide/63116), it exported the top post, but no images were included.
  • When I tried scrolling down and exporting from a lower reply in the topic, I got just the first post again. This time one of the several images was included (the topic timer image), but the rest were missing.

So, functional for all-text first posts. Not functional for reading a back-and-forth thread…

Upon investigation, I realized that I can sometimes get either that one image or no images in reader mode, so this is not happening at the plugin level.

I am glad to hear that! I'll upload the new version to AMO now, and will leave the previous about:reader workaround in it for now.


Yes, I noticed that too. It's odd that only one of the images is included in reader mode (notice, though, that it is the only one that cannot be expanded, and even that one is marked as lazyload, which means it won't be loaded until you scroll down).
Readability.js (the library behind it) certainly isn't perfect, but considering what it gets to work with, it's generally pretty good. In this case, I guess it just grabs the longest sequence of paragraphs.
If you have any concrete improvement suggestions for the parser, you can certainly raise them directly at Readability.js.
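
To illustrate the lazyload remark above: a lazily loaded image usually keeps its real URL in some attribute other than `src` until a script swaps it in, so a parser that only reads `src` finds nothing to include. The sketch below shows one way a pre-processing step could cope with that. It is an assumption-laden illustration, not something Readability.js or this extension is known to do. The attribute names `data-src` and `data-original` are common conventions and have not been checked against Discourse's markup, and `ImageAttrs` is a hypothetical stand-in for a real `<img>` element.

```typescript
// Hypothetical, simplified stand-in for the attributes of an <img> element.
interface ImageAttrs {
  src?: string;
  [attr: string]: string | undefined;
}

// Attribute names lazy-loading scripts often use for the real URL.
// This list is an assumption, not confirmed for any particular site.
const LAZY_SOURCE_ATTRS = ["data-src", "data-original"];

// If an image has no usable `src`, copy the first lazy-source attribute
// into `src`; otherwise return the image unchanged.
function resolveLazySource(img: ImageAttrs): ImageAttrs {
  if (img.src) {
    return img;
  }
  for (const name of LAZY_SOURCE_ATTRS) {
    const candidate = img[name];
    if (candidate) {
      return { ...img, src: candidate };
    }
  }
  return img;
}
```

Images that already have a `src`, or that carry none of the listed attributes, pass through untouched, so the step is safe to apply to every image on a page.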