Boldewyn / rm-ns

Remove elements of a certain namespace from XML data

Home Page:http://boldewyn.github.com/rm-ns

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

                                     rm-ns

                                remove namespace

These files provide possibilities to remove elements and attributes from an XML
document, that are in a certain, defined namespace.

Consider you have an XHTML document with math formulas in the MathML namespace.
For some reason, you want the formulas stripped out of the file. Nothing
simpler than that! Use the tools provided by this project and enjoy.

The package contains three different implementations, that all work the same
way. You can use any of the three XSLT, native Javascript or native Python
implementation independently.

P a r a m e t e r s :
=====================

* namespace: The namespace whose elements and attributes are to be removed.

* copy_children: If true, the children of matching elements will be copied.
  Otherwise, they will be erased, too. Default is 'true'.

* remove_attributes: This parameter controls, if attributes are removed, too.
  Attributes are per definitionem in the empty namespace, as long as they are
  not explicitly, that is, with a prefix, bound to a namespace. That means,
  that if elements in the empty namespace should be removed, most of the
  attributes of other elements will vanish as well. This parameter allows to
  address this problem. Default is 'true'.

B e h a v i o u r :
===================

* If the root element is bound to the namespace, that should be removed, the
  result could be invalid XML. Therefore in the XSLT version in this case the
  whole document is embedded in a new root element <root /> in the empty
  namespace and a warning message is issued.

H o w   t o   D e p l o y :
===========================

XSLT version:
  a) in Firefox: Add the following lines to your XML file:

     <?xslt-param name="namespace" value="urn:my-unwanted-namespace"?>
     <?xml-stylesheet type="text/xsl" href="rm-ns.xsl"?>

     (other browsers don't support <?xslt-param ?>, you have to touch
     rm-ns.xsl itself there.)

  b) via a command line XSLT processor:

     $ saxon -s:source.xml -xsl:rm-ns.xsl -o:out.xml \
       namespace='urn:my-unwanted-namespace'

     $ xalan -IN source.xml -XSL rm-ns.xsl -OUT out.xml -PARAM namespace \
       'urn:my-unwanted-namespace'

  c) inside PHP:

     <?php
     $xsl = new DOMDocument;
     $xsl->load('rm-ns.xsl');
     $proc = new XSLTProcessor;
     $proc->importStyleSheet($xsl);

     $xml = new DOMDocument;
     $xml->load('source.xml');
     $proc->setParameter('', 'namespace', 'urn:my-unwanted-namespace');
     $proc->transformToURI($xml, 'file:///tmp/output.xml');
     ?>

  d) in Python with libxml2 and libxslt bindings:

     #! /usr/bin/env python

     import libxml2, libxslt

     styledoc = libxml2.parseFile("rm-ns.xsl")
     style = libxslt.parseStylesheetDoc(styledoc)
     doc = libxml2.parseFile('source.xml')
     result = style.applyStylesheet(doc, {"namespace":
                                          "'urn:my-unwanted-namespace'"})

     out = open('output.xml', 'w')
     out.write(result.serialize())

     style.freeStylesheet()
     doc.freeDoc()
     result.freeDoc()
     out.close()


JS version:
  Add the following lines to your XML file:

  <script type="text/javascript" src="rm-ns.js"></script>
  <script type="text/javascript">remove_namespace("urn:my-unwanted-namespace");</script>

  Presto! Namespace removed.


Native Python version:

  >>> from rm_ns import remove_namespace
  >>> from xml.dom.minidom import parse
  >>> source = parse('source.xml')
  >>> remove_namespace("urn:my-unwanted-namespace", source)
  >>> out = open('output.xml', 'w')
  >>> out.write(source.toxml())
  >>> out.close()


L i c e n s e :
===============

The tools are published under an MIT-style license and the GPL v2. Choose at
your liking.

About

Remove elements of a certain namespace from XML data

http://boldewyn.github.com/rm-ns

License:GNU General Public License v2.0


Languages

Language:Python 58.8%Language:JavaScript 41.2%