Juris-M / citeproc-js

A JavaScript implementation of the Citation Style Language (CSL) https://citeproc-js.readthedocs.io

Inappropriate hyphen conversion in `number`

bwiernik opened this issue

If a report number is something like 114-11 (e.g., for a U.S. congressional report) and I render it with `<text variable="number"/>`, the hyphen is converted to an en dash (or other range delimiter). This is incorrect behavior. I think the best approach would be to not apply any punctuation conversion to the `number` field.

What do you think @fbennett @adam3smith @retorquere ?

I treat number mostly as an opaque string across the board, the exception being patent entries.

Yeah, that would be best in my opinion. @fbennett @retorquere do you think one of you could make that change in citeproc-js?

Thanks for analyzing that! I know you've put a lot of thought into this over the years, and I really appreciate the inclusion of the override syntax `--` and `\-` in citeproc-js to force or prevent en dash conversion. The override syntax means that users have a path to behavior that differs from the default, so the question is what the default for `number` should be: conversion or no conversion?

I think this is more a question of data usage of the number variable versus style expectations—what is number used for and how is it presented in databases that import into CSL tools like Zotero?

The most common usage of number in databases is as a verbatim identifier (e.g., report numbers, preprint repository IDs, URNs, etc.). It is rarely used to reflect sequential numbering in databases (e.g., journal article numbers are usually just stored as page unless a user specifically edits their data after import to a client).

Related to that, the most common type used with number in styles in the repository is report, which I think supports the interpretation that most cases of number are identifiers, not sequences.

(Whether number should ironically be a non-number variable is a separate question.)

In sum, if number is primarily used as an identifier, I think defaulting to not transforming hyphens to en dashes and leaving the override syntax active makes sense.
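To make the proposal concrete, here is a minimal sketch of the behavior being suggested. It is not citeproc-js code: `renderNumber` and its `convertPlainHyphens` flag are hypothetical names used only for illustration. It assumes, as described above, that `--` forces an en dash and `\-` forces a literal hyphen; leaving runs of three or more hyphens untouched is a further assumption of this sketch.

```javascript
// Hypothetical helper for illustration only; not part of citeproc-js.
// convertPlainHyphens = false models the proposed default for `number`.
function renderNumber(value, convertPlainHyphens = false) {
  const EN_DASH = "\u2013";
  // Try the overrides first, then fall back to a single plain hyphen.
  return value.replace(/\\-|-{3,}|--|-/g, (token) => {
    if (token === "\\-") return "-";     // escaped: always a literal hyphen
    if (token === "--") return EN_DASH;  // doubled: always an en dash
    if (token.length >= 3) return token; // assumption: leave longer runs alone
    return convertPlainHyphens ? EN_DASH : token; // plain: follows the default
  });
}

console.log(renderNumber("114-11"));        // identifier kept verbatim
console.log(renderNumber("114--11"));       // override forces an en dash
console.log(renderNumber("114-11", true));  // the current converting behavior
console.log(renderNumber("114\\-11", true)); // override prevents conversion
```

With the flag off, an identifier such as 114-11 passes through unchanged, while anyone who does want a range can still opt in per item with `--`.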

> the override syntax `--` and `\-` in citeproc-js

I didn't know about this. Does anyone have a sample of this?

"pass on verbatim" in the case of BBT also means that --- gets rendered as an M-dash.