`process_xml` doesn't correctly parse `icon` in all cases
cutz opened this issue · comments
I've run into an issue where certain XML structures that contain `icon` and `cartridge_icon` do not parse correctly. Take, for example, the XML from the recently added test case here.
>>> CC_LTI_OPTIONAL_PARAMS_XML = b'''<?xml version="1.0" encoding="UTF-8"?>
... <cartridge_basiclti_link xmlns:blti="http://www.imsglobal.org/xsd/imsbasiclti_v1p0" xmlns:lticm="http://www.imsglobal.org/xsd/imslticm_v1p0" xmlns:lticp="http://www.imsglobal.org/xsd/imslticp_v1p0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.imsglobal.org/xsd/imslticc_v1p0" xsi:schemaLocation="http://www.imsglobal.org/xsd/imslticc_v1p0 http://www.imsglobal.org/xsd/lti/ltiv1p0/imslticc_v1p0.xsd http://www.imsglobal.org/xsd/imsbasiclti_v1p0 http://www.imsglobal.org/xsd/lti/ltiv1p0/imsbasiclti_v1p0p1.xsd http://www.imsglobal.org/xsd/imslticm_v1p0 http://www.imsglobal.org/xsd/lti/ltiv1p0/imslticm_v1p0.xsd http://www.imsglobal.org/xsd/imslticp_v1p0 http://www.imsglobal.org/xsd/lti/ltiv1p0/imslticp_v1p0.xsd">
... <blti:title>Test config</blti:title>
... <blti:description/>
... <blti:launch_url>http://www.example.com</blti:launch_url>
... <blti:secure_launch_url>http://www.example.com</blti:secure_launch_url>
... <blti:icon>http://wil.to/_/beardslap.gif</blti:icon>
... <blti:vendor/>
... <cartridge_icon identifierref="BLTI001_Icon"/>
... </cartridge_basiclti_link>
... '''
>>> from lti.tool_config import ToolConfig
>>> config = ToolConfig.create_from_xml(CC_LTI_OPTIONAL_PARAMS_XML)
>>> config.icon
>>> config.icon is None
True
>>>
The issue appears to be the use of `in` to check the tag names of the XML. Presumably this is used to get around the fact that tag names include the namespace information.
>>> icon_node.tag
'{http://www.imsglobal.org/xsd/imsbasiclti_v1p0}icon'
>>>
This leads to issues when tag names are substrings of other tag names. For example, in the case of `icon` and `cartridge_icon`, the config's icon is correctly set when it encounters the `icon` tag, but is then overwritten when `process_xml` encounters the `cartridge_icon` tag.
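To make the overwrite concrete, here is a minimal stand-alone sketch of the substring check. It uses the standard library's `xml.etree.ElementTree` rather than `lxml`, and `parse_icon_by_substring` is a hypothetical stand-in for the relevant part of `process_xml`, not the library's actual code.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the test XML above (schemaLocation omitted).
XML = b'''<?xml version="1.0" encoding="UTF-8"?>
<cartridge_basiclti_link
    xmlns:blti="http://www.imsglobal.org/xsd/imsbasiclti_v1p0"
    xmlns="http://www.imsglobal.org/xsd/imslticc_v1p0">
  <blti:icon>http://wil.to/_/beardslap.gif</blti:icon>
  <cartridge_icon identifierref="BLTI001_Icon"/>
</cartridge_basiclti_link>
'''


def parse_icon_by_substring(xml_bytes):
    """Hypothetical stand-in for the substring-based tag check."""
    icon = None
    for child in ET.fromstring(xml_bytes):
        # 'icon' is a substring of both '{...}icon' and
        # '{...}cartridge_icon', so the later element wins.
        if 'icon' in child.tag:
            icon = child.text
    return icon


# cartridge_icon has no text, so the icon URL is replaced by None.
result = parse_icon_by_substring(XML)
```

The outcome depends on document order: if `cartridge_icon` appeared before `blti:icon`, the same loop would happen to return the URL.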
As this is potentially a larger change I thought I would raise an issue here first. I would like to propose the use of `lxml.etree.QName` to check exact matches of the local tag name. The existing tag name checks in `process_xml` would be changed to something like:
from lxml.etree import QName

def _is_tag(node, name):
    return QName(node).localname == name
...
if _is_tag(child, 'icon'):
    self.icon = child.text
...
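For anyone who wants to try the idea without `lxml` installed, the same exact-match check can be sketched with the standard library alone. This swaps `QName(node).localname` for a plain string split on the closing `}` of the namespace, so it is an approximation of the proposal rather than the proposed code itself; `local_name` and `is_tag` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

BLTI_NS = 'http://www.imsglobal.org/xsd/imsbasiclti_v1p0'
CC_NS = 'http://www.imsglobal.org/xsd/imslticc_v1p0'


def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree-style tag."""
    # rpartition returns the whole string as the last item when
    # there is no '}' at all, so un-namespaced tags pass through.
    return tag.rpartition('}')[2]


def is_tag(node, name):
    """True only when the local tag name matches exactly."""
    return local_name(node.tag) == name


icon = ET.Element('{%s}icon' % BLTI_NS)
cartridge_icon = ET.Element('{%s}cartridge_icon' % CC_NS)

icon_matches = is_tag(icon, 'icon')                      # exact match
cartridge_matches = is_tag(cartridge_icon, 'icon')       # no longer matches
```

Note that this compares only the local name and ignores the namespace entirely, which mirrors the proposal above; a stricter check could compare the namespace as well.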
There is some evidence that this has been accounted for in the past using a series of if/elif blocks, e.g.
if 'secure_launch_url' in child.tag:
    self.secure_launch_url = child.text
elif 'launch_url' in child.tag:
    self.launch_url = child.text
That approach could be taken here as well, but it seems like this may warrant a more general solution. I'm happy to provide a PR for this change if you would like to move forward.
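A small sketch of why the if/elif approach only works when the more specific name is tested first. The `route` helper is hypothetical and simply mimics an if/elif chain evaluated in the given order.

```python
NS = '{http://www.imsglobal.org/xsd/imsbasiclti_v1p0}'


def route(tag, names):
    """Return the first name in `names` that is a substring of `tag`."""
    for name in names:  # behaves like an if/elif chain in this order
        if name in tag:
            return name
    return None


tag = NS + 'secure_launch_url'

# Specific name first: the tag is routed to the right attribute.
specific_first = route(tag, ['secure_launch_url', 'launch_url'])

# General name first: 'launch_url' matches and shadows the specific one.
general_first = route(tag, ['launch_url', 'secure_launch_url'])
```

Every new tag whose name contains an existing one has to be slotted into the chain at the right position, which is the maintenance burden an exact-match check avoids.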
Your suggestion seems reasonable to me, so I'm happy to review PRs to this effect. It'll take me a bit longer to review because I'm not all that familiar with the XML libraries (most of this code was written by other authors). But a more correct check of the tag name seems entirely appropriate to me.
Thanks for the report, and a PR would be most welcome!
There is also a `secure_icon` tag (`<blti:secure_icon>`). It's not produced by the `to_xml()` method, and it's not covered anywhere in the tests (but it should be parsed by the `process_xml()` method). I was planning to implement it shortly. Will `secure_icon` behave similarly to `icon` when XML is parsed?
It makes sense to me that it would fall into the same trap that `secure_launch_url` falls into. If you don't want to take on this bit, I suppose you could add it with the same careful ordering that was described for the launch URL.
It seems that `secure_icon` is parsed fine:
>>> CC_LTI_OPTIONAL_PARAMS_XML = b'''<?xml version="1.0" encoding="UTF-8"?>
... <cartridge_basiclti_link xmlns:blti="http://www.imsglobal.org/xsd/imsbasiclti_v1p0" xmlns:lticm="http://www.imsglobal.org/xsd/imslticm_v1p0" xmlns:lticp="http://www.imsglobal.org/xsd/imslticp_v1p0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.imsglobal.org/xsd/imslticc_v1p0" xsi:schemaLocation="http://www.imsglobal.org/xsd/imslticc_v1p0 http://www.imsglobal.org/xsd/lti/ltiv1p0/imslticc_v1p0.xsd http://www.imsglobal.org/xsd/imsbasiclti_v1p0 http://www.imsglobal.org/xsd/lti/ltiv1p0/imsbasiclti_v1p0p1.xsd http://www.imsglobal.org/xsd/imslticm_v1p0 http://www.imsglobal.org/xsd/lti/ltiv1p0/imslticm_v1p0.xsd http://www.imsglobal.org/xsd/imslticp_v1p0 http://www.imsglobal.org/xsd/lti/ltiv1p0/imslticp_v1p0.xsd">
... <blti:title>Test config</blti:title>
... <blti:description/>
... <blti:launch_url>http://www.example.com</blti:launch_url>
... <blti:secure_launch_url>http://www.example.com</blti:secure_launch_url>
... <blti:icon>http://wil.to/_/beardslap.gif</blti:icon>
... <blti:secure_icon>https://wil.to/_/beardslap.gif</blti:secure_icon>
... <blti:vendor/>
... <cartridge_icon identifierref="BLTI001_Icon"/>
... </cartridge_basiclti_link>
... '''
>>> config = ToolConfig.create_from_xml(CC_LTI_OPTIONAL_PARAMS_XML)
>>>
>>> config.secure_icon
'https://wil.to/_/beardslap.gif'
I planned to add it just after the `icon` tag.
Oh, well, for the serialization to XML the order shouldn't matter, so after should be fine there. It's the fuzzy `in` matching when parsing that can cause issues. If it's working already for parsing, then cool!
It's the tag-order-dependent parsing in `process_xml`, caused by substring-based tag matching, that I'm running into. I should be able to get to a PR as proposed in the original post in an hour or two.