[BUG] export of bibtex has wrong encoding of umlaut and other special characters
karliwalti opened this issue
issue:
special characters are not correctly encoded when a bibtex file is generated for export
example
current bib export:
@Inbook { Garatva2023,
author = {Garatva, Patricia and Terhorst, Yannik and Me{\{\dq}s}ner, Eva-Maria and Karlen, Walter and Pryss, R{{\dq}u}diger and Baumeister, Harald},
title = {Smart Sensors for Health Research and Improvement},
year = {2023},
DOI = {10.1007/978-3-030-98546-2_23},
booktitle = {Digital Phenotyping and Mobile Sensing.},
publisher = {Springer International Publishing},
address = {Cham},
series = {Studies in Neuroscience, Psychology and Behavioral Economics},
editor = {Montag, Christian and Baumeister, Harald},
pages = {395--411},
file_url = {https://doi.org/10.1007/978-3-030-98546-2_23}
}
desired format:
@Inbook { Garatva2023,
author = {Garatva, Patricia and Terhorst, Yannik and Me{\"s}ner, Eva-Maria and Karlen, Walter and Pryss, R{\"u}diger and Baumeister, Harald},
title = {Smart Sensors for Health Research and Improvement},
year = {2023},
DOI = {10.1007/978-3-030-98546-2_23},
booktitle = {Digital Phenotyping and Mobile Sensing.},
publisher = {Springer International Publishing},
address = {Cham},
series = {Studies in Neuroscience, Psychology and Behavioral Economics},
editor = {Montag, Christian and Baumeister, Harald},
pages = {395--411},
file_url = {https://doi.org/10.1007/978-3-030-98546-2_23}
}
current xml export:
<reference>
<bibtype>inbook</bibtype>
<citeid>Garatva2023</citeid>
<title>Smart Sensors for Health Research and Improvement</title>
<year>2023</year>
<isbn>978-3-030-98546-2</isbn>
<DOI>10.1007/978-3-030-98546-2_23</DOI>
<booktitle>Digital Phenotyping and Mobile Sensing.</booktitle>
<publisher>Springer International Publishing</publisher>
<address>Cham</address>
<series>Studies in Neuroscience, Psychology and Behavioral Economics</series>
<editor>Montag, Christian and Baumeister, Harald</editor>
<pages>395--411</pages>
<file_url>https://doi.org/10.1007/978-3-030-98546-2_23</file_url>
<authors>
<person>
<fn>Patricia</fn>
<sn>Garatva</sn>
</person>
<person>
<fn>Yannik</fn>
<sn>Terhorst</sn>
</person>
<person>
<fn>Eva-Maria</fn>
<sn>Meßner</sn>
</person>
<person>
<fn>Walter</fn>
<sn>Karlen</sn>
</person>
<person>
<fn>R{"u}diger</fn>
<sn>Pryss</sn>
</person>
<person>
<fn>Harald</fn>
<sn>Baumeister</sn>
</person>
</authors>
</reference>
desired behavior:
the export should produce a proper encoding, so that the file can be imported again with the importer (reversible) or used with other programs
solution:
I see two possible reasons and approaches:
- The data is encoded in the wrong order or too many times. It looks like the " is encoded as \dq after the character had already been converted to {"u}. In that case, characters following a \ (or inside curly brackets) should not be encoded a second time.
- The original string already contains the encoded character, so it should not be encoded at all.
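Both approaches amount to making the encoding step idempotent. A minimal sketch in Python, not the plugin's actual code: the names `UNICODE_TO_LATEX`, `ALREADY_ENCODED` and `encode_once` are hypothetical, the mapping only covers a few German characters, and it assumes that a stored `{"u}` should be normalised to `{\"u}` on export.

```python
import re

# Hypothetical mapping; a real exporter would cover many more characters.
UNICODE_TO_LATEX = {
    "ä": r'{\"a}', "ö": r'{\"o}', "ü": r'{\"u}',
    "Ä": r'{\"A}', "Ö": r'{\"O}', "Ü": r'{\"U}',
    "ß": r'{\ss}',
}

# Brace groups that already hold an escape: {"u}, {\"u} or a command such as {\ss}.
ALREADY_ENCODED = re.compile(r'\{\\?"[A-Za-z]\}|\{\\[A-Za-z]+\}')


def _encode_plain(text):
    """Encode raw Unicode characters in a span that contains no escapes."""
    return "".join(UNICODE_TO_LATEX.get(ch, ch) for ch in text)


def _normalise(group):
    """Assumption: a stored {"u} is meant as {\\"u}, so add the backslash."""
    if group.startswith('{"'):
        return "{\\" + group[1:]
    return group


def encode_once(text):
    """Encode special characters but leave existing escapes untouched."""
    parts = []
    pos = 0
    for match in ALREADY_ENCODED.finditer(text):
        parts.append(_encode_plain(text[pos:match.start()]))
        parts.append(_normalise(match.group(0)))
        pos = match.end()
    parts.append(_encode_plain(text[pos:]))
    return "".join(parts)
```

With this sketch, `Rüdiger`, `R{"u}diger` and `R{\"u}diger` all export as `R{\"u}diger`, and running the function on its own output changes nothing, which is what a reversible export needs.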
Based on the xml export above, it could actually be a combination of both, since the ü appears to be stored in encoded form in the db.
In fact, this would also be a bug in the xml export, where the encoding should be removed before export.
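For the xml side, the idea of removing the encoding before export could look roughly like the following Python sketch. It is only an illustration under assumptions: `LATEX_TO_UNICODE`, `decode_latex` and `xml_element` are hypothetical names, the table handles just umlauts and ß, and unknown escapes are left unchanged rather than guessed.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical lookup: accent letter (or command name) -> Unicode character.
LATEX_TO_UNICODE = {
    "a": "ä", "o": "ö", "u": "ü",
    "A": "Ä", "O": "Ö", "U": "Ü",
    "ss": "ß",
}

# Matches {"u} and {\"u} (group 1 = letter) or {\ss} (group 2 = command).
ESCAPE = re.compile(r'\{(?:\\?"([A-Za-z])|\\(ss))\}')


def _replace(match):
    key = match.group(1) or match.group(2)
    # Unknown escapes are returned as they were found.
    return LATEX_TO_UNICODE.get(key, match.group(0))


def decode_latex(text):
    """Turn stored LaTeX escapes back into plain Unicode characters."""
    return ESCAPE.sub(_replace, text)


def xml_element(tag, value):
    """Build one XML element with decoded and XML-escaped content."""
    return "<{0}>{1}</{0}>".format(tag, escape(decode_latex(value)))
```

Applied to the first name from the export above, `xml_element("fn", 'R{"u}diger')` would yield `<fn>Rüdiger</fn>` instead of carrying the brace group into the xml.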