Discrepancy between Firefox reader mode and the Readability library
eliyabar opened this issue · comments
Hi all,
I have an unresolved problem. I'm trying to figure out why, in some cases, Firefox reader view gives me the view I want (the full article),
but running Readability in Node.js with JSDOM returns only part of it.
This is the article I'm using:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6540991/
Firefox parses all of it, from top to bottom.
Using Node.js with this library, I get only the 'References' part of the article.
Are there any more configurations that I should consider?
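This is my code:
const { JSDOM } = require("jsdom");
const { Readability } = require("@mozilla/readability");

// Assumed: `doc` holds the page HTML as a string and `url` the article URL.
const jsd = new JSDOM(doc, {
  url,
  referrer: url,
  contentType: "text/html",
  includeNodeLocations: true,
  pretendToBeVisual: true,
});
const readable = new Readability(jsd.window.document, { debug: true }).parse();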
Result using the Readability library: [screenshot]
Thanks!
Just a note, since I had a similar issue:
If you refresh your example page in Firefox with reader mode already active, the content output is the same as in your second screenshot.
My issue, on a different site, was that certain elements were moved by JS after the initial page load. When I activated reader mode on the loaded page, Firefox used the HTML as it existed at that moment, after the scripts had run.
When I loaded the page with reader mode already active, Firefox used the HTML as delivered by the initial request, and the scripts were never executed.