zotero / citeproc-rs

CSL processor in Rust.

Home Page: https://cormacrelf.github.io/citeproc-wasm-demo/

"<" is ignored in prefix

tnajdek opened this issue · comments

Some styles use the "<" character in a prefix (encoded as &lt;). It seems to be ignored and doesn't render.

Prefixes of what kinds of things? I can't reproduce this on a <text prefix="&lt;" variable="title"> or URL, but there could be another element that has this problem. (It occurs to me I should make it so the demo/playground can give you shareable links.)

Sorry, it was quite late here when I saw this and I just wanted to note down the issue so I don't forget.

I've seen this behaviour for DOI; it seems to happen regardless of the link_anchors value. Here is an example CSL (from MHRA):

<text variable="DOI" prefix=" &lt;https://doi.org/" suffix="&gt;"/>

And here is a screenshot from the playground:

[Screenshot 2021-10-30 at 11 06 16]

Oh, I see it. It is over-parsing the affixes, because the superscript parser is only used inside the HTML parser. It needs to parse the hacky superscripts, but not actual HTML. The HTML5 parser used in citeproc-rs dutifully (and I presume correctly according to that spec) ignored an incomplete/invalid <https: tag and swallowed the rest of the input.

The affixes have their XML entities pre-processed, so that part is already done; the HTML parser doesn't need to be involved at all.

This is a good catch, thanks.
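
A minimal sketch of the idea described above, not the actual citeproc-rs code: once the XML entities in an affix have been decoded, the affix is plain text, so it can be escaped for HTML output instead of being fed through an HTML parser. The names `escape_html` and `render_with_affixes` are hypothetical, the rendered value is assumed to be output-ready already, and the superscript handling mentioned above is left out.

```rust
/// Escape plain text for HTML output.
/// The input is never interpreted as markup, so a stray '<' cannot
/// start a tag and swallow the text that follows it.
fn escape_html(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(ch),
        }
    }
    out
}

/// Hypothetical helper: wrap an already-rendered value in its affixes.
/// Assumption: `prefix` and `suffix` are the attribute values after XML
/// entity decoding (so "&lt;" in the style has already become "<"),
/// and `value` is already safe for HTML output.
fn render_with_affixes(prefix: &str, value: &str, suffix: &str) -> String {
    format!("{}{}{}", escape_html(prefix), value, escape_html(suffix))
}
```

With the MHRA affixes from this thread, `render_with_affixes(" <https://doi.org/", "10.1000/182", ">")` keeps both angle brackets in the output as `&lt;` and `&gt;`, so a browser displays them literally. The DOI value here is only a placeholder.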