texworld / betterbib

:green_book: Command-line tools for bibliographies.

Dashes

RMeli opened this issue

I encountered several references where instead of using - in the title the following Unicode characters were used:

  • U+2013: – (en dash)
  • U+2212: − (minus sign)

The presence of such characters makes compilation fail.

Should the .bib file be sanitized by transforming these two characters into - or --? (I think in titles the former is more appropriate)

In #137 it seems that there was something along those lines

value = value.replace("\u2010", "-")

but it's been removed?

I switched to a different document where I have been using betterbib for a while, added one reference, and ran betterbib. Several instances of the problem described above started to appear (in references other than the one that was added).

Is it possible that something added/removed in the latest version is causing this problem? Or am I missing something obvious here?

As usual, an MWE is needed.

Sorry about that @nschloe . This is an entry for which it happens:

@article{Ragoza2017,
	journal = {J. Chem. Inf. Model.},
	number = {4},
	doi = {10.1021/acs.jcim.6b00740},
	year = {2017},
}

Using

betterbib update --doi-url-type short -t test.bib

I get the following entry:

@article{Ragoza2017,
	author = {Ragoza, Matthew and Hochuli, Joshua and Idrobo, Elisa and Sunseri, Jocelyn and Koes, David Ryan},
	journal = {J. Chem. Inf. Model.},
	number = {4},
	doi = {10.1021/acs.jcim.6b00740},
	year = {2017},
	source = {Crossref},
	url = {https://doi.org/10/f9zwhj},
	volume = {57},
	publisher = {American Chemical Society (ACS)},
	title = {{Protein–Ligand} Scoring with Convolutional Neural Networks},
	issn = {1549-9596, 1549-960X},
	pages = {942--957},
	month = apr,
}

The – in the title is U+2013, which makes LaTeX compilation fail. I did not encounter this problem with previous versions (but I don't know at which version it started appearing, sorry).
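To pin down which entries are affected, something like the following could be used to scan a .bib file for the characters mentioned in this thread. This is only a minimal sketch: `SUSPECT_CHARS` and `find_suspect_chars` are hypothetical names, not part of betterbib, and the watch list covers only the three code points reported here.

```python
import unicodedata

# Hypothetical watch list: only the code points reported in this thread.
SUSPECT_CHARS = {"\u2010", "\u2013", "\u2212"}


def find_suspect_chars(bib_text):
    """Return (line number, column, code point, Unicode name) for each hit."""
    hits = []
    for line_no, line in enumerate(bib_text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECT_CHARS:
                code_point = f"U+{ord(ch):04X}"
                hits.append((line_no, col, code_point, unicodedata.name(ch)))
    return hits


# Shortened version of the entry above, with the en dash written as an escape.
entry = (
    "@article{Ragoza2017,\n"
    "\ttitle = {{Protein\u2013Ligand} Scoring with Convolutional Neural Networks},\n"
    "}"
)
for hit in find_suspect_chars(entry):
    print(hit)
```

For a real file, the text could be read with `open("test.bib", encoding="utf-8").read()` and passed to the same function.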


betterbib 4.2.2 [Python 3.9.10]
Copyright (c) 2013-2021 Nico Schlömer

Can you try the latest version?

I get the same with the latest version from pip

betterbib 4.3.5 [Python 3.9.10]
Copyright (c) 2013-2022 Nico Schlömer

%comment{This file was created with betterbib v4.3.5.}


@article{Ragoza2017,
	author = {Ragoza, Matthew and Hochuli, Joshua and Idrobo, Elisa and Sunseri, Jocelyn and Koes, David Ryan},
	journal = {J. Chem. Inf. Model.},
	number = {4},
	doi = {10.1021/acs.jcim.6b00740},
	year = {2017},
	source = {Crossref},
	url = {https://doi.org/10/f9zwhj},
	volume = {57},
	publisher = {American Chemical Society (ACS)},
	title = {{Protein–Ligand} Scoring with Convolutional Neural Networks},
	issn = {1549-9596, 1549-960X},
	pages = {942--957},
	month = apr,
}

Alright. This is about the dash in the title, right?

Yes, that's what's causing the problem now. I have sanitised this bibliography before using betterbib without issue. In #137 it seems that the following line of code has been removed:

value = value.replace("\u2010", "-")

In this example it is U+2013, but I encountered the same issue with U+2010 and U+2212 as well.
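As a rough illustration of what I mean, the single replace could be extended to all three characters along these lines. This is a sketch under my own assumptions, not betterbib's actual code: `DASH_REPLACEMENTS` and `sanitize_dashes` are hypothetical names, and mapping every character to a plain `-` is just the choice I suggested for titles.

```python
# Assumed mapping, based only on the characters reported in this thread.
DASH_REPLACEMENTS = {
    "\u2010": "-",  # HYPHEN
    "\u2013": "-",  # EN DASH (could be "--" where a LaTeX en dash is wanted)
    "\u2212": "-",  # MINUS SIGN
}
_DASH_TABLE = str.maketrans(DASH_REPLACEMENTS)


def sanitize_dashes(value):
    """Replace the Unicode dash-like characters above with ASCII hyphens."""
    return value.translate(_DASH_TABLE)


title = "{Protein\u2013Ligand} Scoring with Convolutional Neural Networks"
print(sanitize_dashes(title))
```

Using `str.translate` handles all three characters in one pass, and changing a single dictionary value is enough to switch a character to `--`.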

I just saw #239, maybe the same can be applied to other fields as well?

This should now be fixed (4.3.6).

Thank you @nschloe !