tingletech / datel

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Datel XML Transliterator

Pure python 99.9% losless transformations of XML to python dicts, JSON, or JSON Lines.

Datel Data Element and Record Scheme

Input XML with your records and an XPath pointing to the root element of records to extract.

The first step is to apply the XPath to the XML Document, iterating over each record.

For each record, the next step is to iterate over each record node, creating an array for each record containing Datel Data Elements. Then, the datel elements are merged into one dict per record.

The command outputs JSON Lines.

The default format creates one dict per record, supplying the flag --no-solsource will enabling an order-preserving mode with one dict per XML Element in the source.

install

pip install https://github.com/tingletech/datel/archive/refs/heads/main.tar.gz

notes

Command line use:

datel -h

Library use and format specification:

pydoc datel

for example:

./datel.py ./test/oai.xml ".//{http://www.openarchives.org/OAI/2.0/}metadata/.." | jq . - | more

outputs jsonl -- one line per record matched by the user supplied XPath

for the test/oai.xml file; "//{http://www.openarchives.org/OAI/2.0/oai_dc/}dc/../.." and "//{http://www.openarchives.org/OAI/2.0/}metadata/.." return the same thing

About


Languages

Language:Python 99.0%Language:Shell 1.0%