Leonidas-from-XIV / node-xml2js

XML to JavaScript object converter.

non-breaking space trips up the parser

QAnders opened this issue:

I have the following function:

const xml2js = require('xml2js');

// Convert XML (string) -> JSON.
//   explicitArray: true        make sure every element is an Array; otherwise elements with only a value are not Arrays
//   explicitCharkey: true      we always want data in ._ and attributes under .$
//   allowSurrogateChars: true  allow Unicode surrogate characters, e.g. emoji
const convertUblXmlToUblJson = async (xmlString, options = { explicitArray: true, explicitCharkey: true, allowSurrogateChars: true }, stripPrefix = false) => {
  if (stripPrefix) {
    // Drop namespace prefixes (e.g. cbc:Name -> Name) from tag and attribute names
    const stripPrefixProcessor = xml2js.processors.stripPrefix;
    // eslint-disable-next-line no-param-reassign
    options = {
      ...options,
      tagNameProcessors: [stripPrefixProcessor],
      attrNameProcessors: [stripPrefixProcessor]
    };
  }
  const parser = new xml2js.Parser(options);
  return parser.parseStringPromise(xmlString);
};

allowSurrogateChars works nicely for e.g. emoji data, but I run into an error when the XML data contains a non-breaking space: Invalid character in ...

The XML data is like:

<cbc:Name> </cbc:Name>

Where the "space" is hex C2 A0 in UTF-8, i.e. Unicode \u00A0, a non-breaking space (&nbsp; in HTML).
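
To confirm which bytes are really in the input, one option is to dump them as hex and check that they decode as UTF-8. This is only a sketch: toHex and isValidUtf8 are hypothetical helper names (not part of xml2js), and it assumes a Node.js build with ICU so that TextDecoder accepts the fatal option. U+00A0 encodes as the two bytes C2 A0, whereas C2 followed by 0A is not a valid UTF-8 sequence.

```javascript
// Hypothetical helpers for inspecting the raw bytes of the XML input.

// Return the UTF-8 encoding of a string as space-separated hex bytes.
function toHex(str) {
  const hex = Buffer.from(str, 'utf8').toString('hex');
  return (hex.match(/../g) || []).join(' ');
}

// Return true if the bytes decode as well-formed UTF-8.
// Assumes TextDecoder supports { fatal: true } (Node.js built with ICU).
function isValidUtf8(bytes) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(toHex('\u00A0'));                        // c2 a0
console.log(isValidUtf8(Buffer.from([0xc2, 0xa0]))); // true
console.log(isValidUtf8(Buffer.from([0xc2, 0x0a]))); // false
```

Running toHex on the text of the failing element, or isValidUtf8 on the raw file contents read as a Buffer, should show whether the parser is really being handed C2 A0.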

This is obviously valid UTF-8, so why is it crashing the parser?
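
Valid UTF-8 and valid XML are separate questions, so a second check that may help narrow this down is to scan the decoded string for code points outside the XML 1.0 Char production. The following is a minimal sketch under that assumption; isXmlChar and findInvalidXmlChars are hypothetical names, not xml2js functions, and this does not reproduce whatever check the parser itself performs. U+00A0 falls inside the allowed range, so it should not be reported.

```javascript
// XML 1.0 "Char" production:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
function isXmlChar(cp) {
  return cp === 0x9 || cp === 0xa || cp === 0xd ||
    (cp >= 0x20 && cp <= 0xd7ff) ||
    (cp >= 0xe000 && cp <= 0xfffd) ||
    (cp >= 0x10000 && cp <= 0x10ffff);
}

// Return the position (in UTF-16 code units) and code point of every
// character that XML 1.0 does not allow. Hypothetical diagnostic helper.
function findInvalidXmlChars(text) {
  const hits = [];
  let index = 0;
  for (const ch of text) { // for...of iterates by code point
    const codePoint = ch.codePointAt(0);
    if (!isXmlChar(codePoint)) {
      hits.push({ index, codePoint });
    }
    index += ch.length;
  }
  return hits;
}

console.log(findInvalidXmlChars('<cbc:Name>\u00A0</cbc:Name>')); // []
console.log(findInvalidXmlChars('a\u0000b')); // [ { index: 1, codePoint: 0 } ]
```

If this reports nothing for the failing document, the non-breaking space itself is a legal XML character and the error is more likely coming from somewhere else in the input or the options.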