Ruby rb tags ignored by extractHtml.js
MichaelPetre opened this issue
If you try to convert a Japanese webpage containing ruby tags, the rb tags are ignored by the parser.
<ruby><rb>私</rb><rp>(</rp><rt>わたくし</rt><rp>)</rp></ruby>
gets saved as
<ruby class="MG357"><rt class="WF360">わたくし</rt></ruby>
As a result, you have the ruby furigana but are missing the kanji in the epub file.
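Expected output (rendered): the kanji 私 with the furigana わたくし above it.
Actual output (rendered): only the furigana わたくし, with no kanji.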
This is caused by the tag list on line 15 of extractHtml.js, which does not include rb:
'dfn', 'em', 'i', 'img', 'kbd', 'mark', 'q', 'rp', 'rt', 'rtc', 'ruby', 's', 'samp', 'small', 'span',
Adding the rb tag solves the issue:
'dfn', 'em', 'i', 'img', 'kbd', 'mark', 'q', 'rb', 'rp', 'rt', 'rtc', 'ruby', 's', 'samp', 'small', 'span',
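To illustrate why a missing entry in such a list can make the kanji disappear, here is a minimal sketch. It assumes the parser drops any element whose tag is not listed, together with its text. The `filterTags` function, the shortened lists, and the regex tokenizer are hypothetical and are not taken from extractHtml.js. The sketch also does not reproduce the generated class attributes or the missing rp tags seen in the saved output.

```javascript
// Hypothetical, shortened tag lists modelled on the line quoted above.
const ALLOWED_WITHOUT_RB = ['rp', 'rt', 'rtc', 'ruby'];
const ALLOWED_WITH_RB = ['rb', 'rp', 'rt', 'rtc', 'ruby'];

// Toy filter (an assumption, not the real extractHtml.js logic):
// keeps elements whose tag is listed and drops every other element
// together with everything nested inside it.
function filterTags(html, allowed) {
  const allow = new Set(allowed);
  // Split into attribute-free tags and text runs; enough for this sample.
  const tokens = html.match(/<\/?[a-z]+>|[^<]+/g) || [];
  let out = '';
  let skipDepth = 0; // greater than 0 while inside a dropped element

  for (const tok of tokens) {
    const m = /^<(\/?)([a-z]+)>$/.exec(tok);
    if (!m) {
      // Text node: keep it only when not inside a dropped element.
      if (skipDepth === 0) out += tok;
      continue;
    }
    const closing = m[1] === '/';
    const name = m[2];
    if (skipDepth > 0) {
      // Track nesting so we know where the dropped element ends.
      skipDepth += closing ? -1 : 1;
      continue;
    }
    if (allow.has(name)) {
      out += tok;
    } else if (!closing) {
      skipDepth = 1; // start dropping this element and its content
    }
  }
  return out;
}

const sample = '<ruby><rb>私</rb><rp>(</rp><rt>わたくし</rt><rp>)</rp></ruby>';

console.log(filterTags(sample, ALLOWED_WITHOUT_RB)); // kanji is lost
console.log(filterTags(sample, ALLOWED_WITH_RB));    // kanji is kept
```

With the shorter list the rb element and the 私 inside it are removed, which matches the symptom described above. With rb in the list the fragment passes through unchanged.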
Fixed in pull request #56