plutext / docx4j

JAXB-based Java library for Word docx, PowerPoint pptx, and Excel xlsx files

Home Page: https://www.docx4java.org/

🚨 New namespace "w16du" was added in the upcoming office update

ameramayreh opened this issue

We encountered a corrupted .docx file after it was modified by docx4j.
The original file was produced by a beta version of MS Office.
After comparing two otherwise identical files, one modified with the beta version of Office and the other with the latest stable version, we noticed a new namespace, "w16du".

We applied the same fix that was made in 8.2.9: 1acc819
Adding the mapping between "w16du" and "http://schemas.microsoft.com/office/word/2023/wordml/word16du" to the org.docx4j.jaxb.NamespacePrefixMappings class fixed the issue.
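
For readers unfamiliar with this kind of fix, it amounts to one more entry in a hardcoded prefix-to-URI table. The sketch below only illustrates that idea: the class `PrefixTable` and the method `uriFor` are hypothetical names, not docx4j's actual `NamespacePrefixMappings` code, and only the "w16du" entry comes from this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a hardcoded namespace table; NOT docx4j's real class.
class PrefixTable {
    private static final Map<String, String> PREFIX_TO_URI = new HashMap<>();

    static {
        // The prefix/URI pair reported in this issue.
        PREFIX_TO_URI.put("w16du",
            "http://schemas.microsoft.com/office/word/2023/wordml/word16du");
    }

    // Returns the namespace URI registered for a prefix, or null if unknown.
    static String uriFor(String prefix) {
        return PREFIX_TO_URI.get(prefix);
    }

    public static void main(String[] args) {
        System.out.println(uriFor("w16du"));
    }
}
```

With a table like this, a prefix that is missing from it simply resolves to nothing, which is why each new Office namespace needs a code change.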

Thanks

Will be in 8.3.10 and 11.4.10 releases.

Thanks @plutext, is there a reason why the namespace mapping is not dynamic? I mean, read from the XML files and cached instead of hardcoded?

To read the namespaces outside of what JAXB does itself, we'd have to parse the XML independently which would affect performance. These changes to @mc:Ignorable are comparatively rare, so trying to handle this dynamically has not been a priority.
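
To make the trade-off concrete, here is a rough sketch of what such an independent parse could look like, using the JDK's StAX API to collect the namespace declarations on the root element. This is an assumption about one possible approach, not something docx4j does; the class `RootNamespaceScan` and the method `rootNamespaces` are hypothetical names.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: an extra pass over the XML, separate from JAXB unmarshalling.
class RootNamespaceScan {

    // Returns prefix -> URI for every namespace declared on the root element.
    static Map<String, String> rootNamespaces(String xml) {
        Map<String, String> declared = new LinkedHashMap<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    for (int i = 0; i < reader.getNamespaceCount(); i++) {
                        String prefix = reader.getNamespacePrefix(i);
                        // A null prefix denotes the default namespace.
                        declared.put(prefix == null ? "" : prefix, reader.getNamespaceURI(i));
                    }
                    break; // only the root element is inspected
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
        return declared;
    }
}
```

Even though this stops at the root element, it is still an additional read of each part before JAXB processes it, which is the performance cost mentioned above.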