pabigot / pyxb

Python XML Schema Bindings

Specializing anyType for a specific binding

rvalle opened this issue

Hi!

I would like to subclass anyType for one concrete binding.

This particular XSD type allows an anyType node, and the way it is used turns out to be a simple, free-form key/value pair node.

I would like to subclass anyType and implement a tailored access method, perhaps a .toDict() instead of using .toDOM().

Is there any way this can be done with PyXB?
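As a rough illustration of what such a .toDict() could return, here is a minimal sketch that uses only the standard library and makes no PyXB calls. It assumes the anyType content is available as a DOM element (for example from .toDOM()) and that every child is a flat key with text content. The helper name element_to_dict and the sample keys ARCH and HYPERVISOR are made up for the example.

```python
from xml.dom import minidom


def element_to_dict(element):
    """Map each child element's name to its text content (flat key/value only)."""
    result = {}
    for child in element.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue  # skip whitespace text nodes and comments
        # Join the text and CDATA pieces found directly under the child element.
        text = "".join(
            node.data
            for node in child.childNodes
            if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE)
        )
        result[child.localName or child.tagName] = text.strip()
    return result


# Hypothetical TEMPLATE content; the keys are invented for this example.
doc = minidom.parseString(
    "<TEMPLATE><ARCH>x86_64</ARCH><HYPERVISOR>kvm</HYPERVISOR></TEMPLATE>"
)
template = element_to_dict(doc.documentElement)
```

Nested children and repeated keys are not handled here; a repeated key would simply overwrite the earlier value.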

You can probably do something with binding customization, by customizing the interaction with whatever element has that restricted content.

That actually seems like a very cool feature of PyXB!
So for the following element:

<xs:element name="TEMPLATE" type="xs:anyType"/>

I would be specializing the class generated for TEMPLATE, instead of anyType itself, right?
All anyTypes in my XSD types behave in this way. That's why I thought of it in the first place.

I guess I can create the new behaviour as a class and use multiple inheritance everywhere this pattern is used in my XSD-generated entities.

I think I face two problems:
TEMPLATE seems to be generated as something like a base type:
its class is pyxb.binding.datatypes.anyType
I think that means I have to customize the parent element.
But then the problem that I see is that the class is being generated, in this particular case, as:

pyone.bindings.raw.CTD_ANON_58

I think this is down to the XSD not naming the types, and strictly speaking that XSD is not my deliverable.
I wonder whether, if I extend this CTD_ANON_58, several compilations will render a consistent result.

For an element containing:

<xs:element name="TEMPLATE" type="xs:anyType"/>

you would specialize the containing element, and in particular how it handles the TEMPLATE member.

Or something like that. If the containing element is an unnamed complex type this probably won't be feasible.

Ok, I see.

It would work if the class names for anonymous types were deterministic and every compilation returned the same ones, or if there were no anonymous types.

A sample XSD would be, for example this one:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://opennebula.org/XMLSchema" elementFormDefault="qualified" targetNamespace="http://opennebula.org/XMLSchema">
  <xs:element name="HOST">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ID" type="xs:integer"/>
        <xs:element name="NAME" type="xs:string"/>
        <xs:element name="TEMPLATE" type="xs:anyType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

I can try to name the complexTypes across the XSD even if it's not my deliverable. I could use an XSLT before generating the bindings, but then I run into another problem:

Name attribute invalid on non-global complex types

I can, however, give them an ID instead, but I notice that the ID is not being used to generate the class name.

Customizing bindings that involve complex anonymous types would indeed be very useful.
I am willing to work on a patch so that this is possible.
The current system would be fine if the CTD_ANON_NN names were deterministic, so that for a given XSD the generated NN was always the same.

Just a random idea, but we could use hashing to generate the NN. For example, in my case the element xpath is going to be HOST/TEMPLATE, we could do NN=md5(HOST/TEMPLATE)
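To make the hashing idea concrete, here is a minimal sketch of a name derived from the element path. This is not how PyXB currently names anonymous types. The function anon_class_name, its parameters, and the choice to truncate the digest are assumptions made for illustration.

```python
import hashlib


def anon_class_name(element_path, prefix="CTD_ANON_", digits=8):
    """Derive a stable class name from an element path such as 'HOST/TEMPLATE'."""
    # md5 is used only as a stable fingerprint of the path, not for security.
    digest = hashlib.md5(element_path.encode("utf-8")).hexdigest()
    # The prefix keeps the result a valid Python identifier even when the
    # digest starts with a digit.
    return prefix + digest[:digits]


name = anon_class_name("HOST/TEMPLATE")
```

The same path always yields the same name, regardless of how many other anonymous types the schema contains. Truncating the digest keeps names short but raises the chance of a collision, so a real implementation would need a collision check.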

Another random idea would be to use some kind of introspection in the build process, and find out those NNs.
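One possible form of that introspection is sketched below. It scans the schema itself rather than the generated module, and lists the path of every element that declares an inline (anonymous) complexType. It uses only the standard library; the function name anonymous_type_paths is made up, and SCHEMA is a trimmed copy of the sample above.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Trimmed copy of the sample schema (target namespace omitted for brevity).
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="HOST">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ID" type="xs:integer"/>
        <xs:element name="NAME" type="xs:string"/>
        <xs:element name="TEMPLATE" type="xs:anyType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def anonymous_type_paths(node, path=()):
    """Yield the name path of every element that has an inline complexType."""
    for child in node:
        if child.tag == XS + "element" and child.get("name"):
            child_path = path + (child.get("name"),)
            # A complexType that is a direct child of the element is anonymous.
            if child.find(XS + "complexType") is not None:
                yield "/".join(child_path)
            yield from anonymous_type_paths(child, child_path)
        else:
            # Descend through sequence, choice, complexType and similar wrappers.
            yield from anonymous_type_paths(child, path)


paths = list(anonymous_type_paths(ET.fromstring(SCHEMA)))
```

For the sample schema the only anonymous type is the one declared under HOST, since TEMPLATE refers to the built-in xs:anyType. Mapping each path to the generated CTD_ANON_NN class would still require looking at the generated bindings.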

Maybe you have a better idea of how this could be achieved.

I'm afraid this enhancement doesn't particularly interest me personally, so I can't provide much support towards completing it. I personally haven't used XML or Python for anything in about five years, so unless somebody wants to fund new development I'm supporting PyXB at a low level of effort for critical maintenance only.

At one point I'd considered defining a namespace for attributes that could be used to annotate a schema and influence the code generation, but that never happened and any design notes for it were probably lost when SourceForge dropped support for trac where the enhancement would have been recorded. I think that would be more in the spirit of XML than attempting to make randomly generated names persistent.

You're welcome to make an attempt at providing this capability in whatever form you choose. If the solution seems complete and unlikely to cause problems I'll merge it; if not you can provide and maintain a fork of PyXB that adds it.

I totally understand your position.

I like the idea of a namespace to influence code generation; however, in my particular case I chose PyXB to integrate with a third-party product whose XML-RPC API changes frequently.

I was hoping to update the XSDs on every release and run the tests to check that nothing broke.

The downside of using the namespace would be that I would have to annotate the XSDs that the other party generates. Perhaps the annotation process could be automated by an XSLT or something similar, which would cover that case.

However, when the XSDs and bindings belong to the same party, I think the solution is ideal.