pdvrieze / xmlutil

XML Serialization library for Kotlin

Home Page:https://pdvrieze.github.io/xmlutil/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Add jvm parser configuration

e13mort opened this issue · comments

Hello. I use your library in one of my projects and have faced with an interesting issue.

In my case I have files with DTD declaration with wrong Url. I don't manage these files so can't remove that declaration manually. So, that incorrect declaration leads to normal work at jvm(android), native and js targets. But, plain jvm (javax.xml) fails with a valid reason The markup declarations contained or pointed to by the document type declaration must be well-formed. In that particular case I'm allowed to ignore that check.

The problem there:

nl.adaptivity.xmlutil.StAXReader.Companion#safeInputFactory has the following code:

return XMLInputFactory.newFactory().apply {
          setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false)
     }

I can't pass any additional properties to that factory. In my case I need to set property javax.xml.stream.supportDTD to false (and it works via debugger).

I see few ways how it might be implemented:

  • create a high-level property somewhere in the nl.adaptivity.xmlutil.serialization.XmlConfig.Builder (I guess). It's gonna blow API a little but will be suitable for my case and kind of easy to implement.
  • Simply disable DTD check in the code above. It's gonna be a downgrade to jvm target but will make the whole library more consistent (as far as it's already not working on other platforms).
  • Add a feature to fully configure platform implementations. Looks like the right way in general but will require more work.

What do you think?

There are a number of potential solutions:

  • By far the easiest, use defaultToGenericParser (this is a new-ish option) this will use the generic parser by default rather than using StAX (or other implementations) - this should also be faster.
  • Create your own pull parser using stax with all the needed configuration. Then create the StAXReader directly with that parser. (this would allow you to set the properties directly). Note that the xmlutil library use existing StAX libraries provided either by the JDK, or additionally using service loaders (eg. woodstox), they are not managed/maintained by me at all. This is the way in which platform implementations are intended to be configured if the defaults are not sufficient.

The factories are effectively shortcuts that are not intended to preclude more complex configurations. However, if you use serialization that may be a bit trickier (you may want to create your own wrapper function that uses decodeFromReader)

Btw. the reason for ignoring the external entities is that this is a potential security issue.

defaultToGenericParser did help indeed. It forced me to update to latest kotlin as well, but that's kind of good thing :)

I agree that external entities might lead to some sort of security issues. I wasn't expecting any network communication by default at all (that led to failure in my case with incorrect url). But, looks like it's a basic behaviour for StAX and maybe disabling javax.xml.stream.supportDTD property in addition to IS_SUPPORTING_EXTERNAL_ENTITIES might be more consistent in that way you intended. But it's up to you to decide.

Anyway, thank you for your assistance. I really appreciate your work here.

I think this issue can be closed.