nashwaan / xml-js

Converter utility between XML text and Javascript object / JSON text.

Processing Instructions are ignored

Arlen22 opened this issue · comments

I have an XML document that includes a processing instruction, but when I run it through xml2js, the processing instruction is dropped from the output.

<?xml version="1.0"?>
<?myapi version="13.0"?>

Here is a quick modification to xml2js.js that I threw together:

function onDeclaration(declaration) {
    if (options.ignoreDeclaration) {
        return;
    }
    var key = options.declarationKey;
    var type = "decl";
    if (currentElement[key]) {
        // A declaration is already stored, so treat this one as a processing instruction.
        key = options.processingKey;
        if (!currentElement[key]) {
            currentElement[key] = []; // create the list once so earlier instructions are kept
        }
        type = "proc";
    }
    var item = {};
    item[options.nameKey] = declaration.name;
    while (declaration.body) {
        // Match the next name="value" or name='value' pair.
        var attribute = declaration.body.match(/([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/);
        if (!attribute) {
            break;
        }
        if (!item[options.attributesKey]) {
            item[options.attributesKey] = {};
        }
        // Group 2 holds a double-quoted value, group 3 a single-quoted one.
        item[options.attributesKey][attribute[1]] = attribute[2] !== undefined ? attribute[2] : attribute[3];
        declaration.body = declaration.body.slice(attribute.index + attribute[0].length); // advance past the match
    }
    if (type === "proc") currentElement[key].push(item);
    else currentElement[key] = item;
    if (options.addParent) {
        item[options.parentKey] = currentElement; // attach the parent to the item that was just stored
    }
}

Hi @Arlen22, thanks for reporting this issue. Can you provide more examples and the expected output, please?

@Arlen22
The XML specification says that a processing instruction contains just text, not necessarily attributes.

This means both of the following are valid Processing Instructions:

<?stylesheet some random text?>

The output JSON will be (in compact form):

{"_instruction": {"stylesheet": "some random text"}}

and

<?stylesheet href="style.css"?>

The output JSON will be (in compact form):

{"_instruction": {"stylesheet": "href=\"style.css\""}}

Note that "_attributes" will not be generated.
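
To make the two cases above concrete, here is a minimal sketch of the plain-text treatment: split a processing instruction into its target name and its raw content, and store the content as a single string. The function name `instructionToCompact` and its regular expression are assumptions made for illustration and are not code from xml-js; only the `_instruction` key and the output shape come from the examples above.

```javascript
// Hypothetical helper, not part of xml-js: splits "<?target content?>" into
// its target name and raw text, and returns the compact-form shape shown above.
function instructionToCompact(pi) {
    // Target name, optional whitespace-separated content, then the closing "?>".
    var match = /^<\?([^\s?]+)(?:\s+([\s\S]*?))?\s*\?>$/.exec(pi.trim());
    if (!match) {
        throw new Error("Not a processing instruction: " + pi);
    }
    var result = { _instruction: {} };
    // The content is kept as one string; no attempt is made to parse attributes.
    result._instruction[match[1]] = match[2] || "";
    return result;
}

console.log(JSON.stringify(instructionToCompact('<?stylesheet some random text?>')));
console.log(JSON.stringify(instructionToCompact('<?stylesheet href="style.css"?>')));
```

Both inputs go through the same path, which is why `href="style.css"` comes out as an escaped string rather than as an attribute object.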

OK, well I guess that could work. It would be nice if the parser could parse any valid attributes found, and also include the text, since it has the capability built in. Or is that not the intent of this library? Could it be added as a helper method?

Could we do "stylesheet": { _attributes: {}, _value: "" }?

But what you wrote above will be enough if you don't want to add this, because then at least the instruction will not get lost.
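
The `{ _attributes: {}, _value: "" }` idea could look roughly like the sketch below, which pulls any `name="value"` or `name='value'` pairs out of the instruction text and also keeps the text itself. The helper name `parseInstructionBody` is hypothetical and the regular expression is a simplification (it does not decode entities or validate names), so treat this as an illustration of the proposed shape rather than as library behavior.

```javascript
// Hypothetical helper, not part of xml-js: reads name="value" / name='value'
// pairs out of a processing instruction's text and keeps the raw text too.
function parseInstructionBody(body) {
    var attributes = {};
    var pattern = /([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
    var match;
    while ((match = pattern.exec(body)) !== null) {
        // Group 2 holds a double-quoted value, group 3 a single-quoted one.
        attributes[match[1]] = match[2] !== undefined ? match[2] : match[3];
    }
    return { _attributes: attributes, _value: body };
}

console.log(JSON.stringify(parseInstructionBody('href="style.css" type=\'text/css\'')));
console.log(JSON.stringify(parseInstructionBody('some random text')));
```

Text with no recognizable pairs simply yields an empty `_attributes` object, so nothing is lost either way.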

@Arlen22

"stylesheet": { "_attributes": {}, "_value": "" }

is a good solution, but I am thinking of adding an option like {instructionHasAttributes: true}, which will parse the content of the processing instruction as attributes. This means the output will be one of:

{"_instruction": {"stylesheet": {"href": "style.css"}}}
{"_instruction": {"stylesheet": {"_attributes": {"href": "style.css"}}}}   // or this?

otherwise if the flag is not set, then we get this:

{"_instruction": {"stylesheet": "href=\"style.css\""}}

Published v1.3.2, which supports processing instructions and the {instructionHasAttributes: true} flag.

Converting this XML:

<?go to="there"?>

will produce in compact mode:

{"_instruction":{"go":"to=\"there\""}}

in non-compact (expanded) mode:

{"elements":[{"type":"instruction", "name":"go", "instruction":"to=\"there\""}]}

and when {instructionHasAttributes: true} flag is set, it will produce the following in compact mode:

{"_instruction":{"go":{"_attributes":{"to":"there"}}}}

and this in non-compact (expanded) mode:

{"elements":[{"type":"instruction", "name":"go", "attributes":{"to":"there"}}]}
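
As a quick check that the two compact shapes above carry the same information, the sketch below turns either one back into processing-instruction text. The helper `instructionToXml` is hypothetical and written only for this illustration; xml-js has its own conversion back to XML, and this sketch does not escape quotes inside attribute values.

```javascript
// Hypothetical helper, not part of xml-js: turns the compact-form
// "_instruction" object from the examples above back into XML text.
function instructionToXml(compact) {
    var instruction = compact._instruction;
    return Object.keys(instruction).map(function (name) {
        var content = instruction[name];
        if (typeof content === "string") {
            // Plain-text form: emit the stored text unchanged.
            return "<?" + name + (content ? " " + content : "") + "?>";
        }
        // Attribute form: rebuild name="value" pairs from "_attributes".
        var attributes = content._attributes || {};
        var pairs = Object.keys(attributes).map(function (key) {
            return key + '="' + attributes[key] + '"';
        });
        return "<?" + name + (pairs.length ? " " + pairs.join(" ") : "") + "?>";
    }).join("\n");
}

console.log(instructionToXml({ _instruction: { go: 'to="there"' } }));
console.log(instructionToXml({ _instruction: { go: { _attributes: { to: "there" } } } }));
```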

@Arlen22 Please let me know if this works for you or not.

Seems to be working well. Thank you :)