conflicting namespace prefixes during ListRecords
bertsky opened this issue · comments
If the same prefix is seen with different namespace URIs during a harvest, metha-sync renames the prefix by appending 1
but never declares the renamed prefix, so the resulting XML files become invalid.
For example, if I do
metha-sync -format mets -set 17th-century-prints http://digital.slub-dresden.de/oai/
then (because in our MODS the namespace for the extension slub was changed some time ago and now appears in some records with the declaration http://www.slub-dresden.de/namespace but with http://www.slub-dresden.de/ in others) I end up with altered and non-wellformed METS files. For example, in oai:de:slub-dresden:db:id-1840307358, instead of…
<mods:extension>
<slub:slub>
<slub:id type="digital">1840307358</slub:id>
<slub:id type="source">113051157X</slub:id>
<slub:id type="tsl-ats">Mercgeovg</slub:id>
</slub:slub>
</mods:extension>
<mods:recordInfo>
<mods:recordIdentifier source="http://digital.slub-dresden.de/oai/">oai:de:slub-dresden:db:id-1840307358</mods:recordIdentifier>
</mods:recordInfo>
…(which is what you get for a single GetRecord request) I now see…
<mods:extension>
<slub1:slub>
<slub1:id type="digital">1840307358</slub1:id>
<slub1:id type="source">113051157X</slub1:id>
<slub1:id type="tsl-ats">Mercgeovg</slub1:id>
</slub1:slub>
</mods:extension>
<mods:recordInfo>
<mods:recordIdentifier source="http://digital.slub-dresden.de/oai/">oai:de:slub-dresden:db:id-1840307358</mods:recordIdentifier>
</mods:recordInfo>
…(which is invalid, because slub1 has never been introduced).
Thanks for the detailed bug report - that's certainly an interesting issue and I'll try to take a look at it shortly - it may also be an issue in the stdlib, as per golang/go#48641.
I'm afraid this is a Go stdlib XML issue first, cf. golang/go#13400.
But then, metha is mostly concerned with the envelope and that should be much less problematic. This will require some internal rewrite and may take a while before it is released, just as a heads up.