find_replace transform does not work on XPaths with predicates
leboff opened this issue
Describe the bug
When the find_replace transform is given an XPath that includes predicates, it fails to locate the intended element in namespaced XML, because the tag referenced inside the predicate is not rewritten to ignore the namespace.
This test in test_transforms.py passes because the test XML it uses is not namespaced. However, it fails if rewritten as follows (note the xmlns added to bookstore):
def test_xpath_replace_with_exp_and_index_has_xmlns(task_context):
    zip_content = {
        Path(
            "Foo.xml"
        ): '<bookstore xmlns="foobar"> <book category="cooking"> <title lang="en">Everyday Italian</title> <author>Giada De Laurentiis</author> <year>2005</year> <price>30.00</price> </book> <book category="children"> <title lang="en">Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> <book category="web"> <title lang="en">XQuery Kick Start</title> <author>James McGovern</author> <author>Per Bothner</author> <author>Kurt Cagle</author> <author>James Linn</author> <author>Vaidyanathan Nagarajan</author> <year>2003</year> <price>49.99</price> </book> <book category="web"> <title lang="en">Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price> </book> </bookstore>',
    }
    patterns = [
        {"xpath": "/bookstore/book[price>40]/author[2]", "replace": "Rich Author"}
    ]
    builder = create_builder(task_context, zip_content, patterns)
    modified_zip_content = {
        Path(
            "Foo.xml"
        ): '<bookstore xmlns="foobar"> <book category="cooking"> <title lang="en">Everyday Italian</title> <author>Giada De Laurentiis</author> <year>2005</year> <price>30.00</price> </book> <book category="children"> <title lang="en">Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> <book category="web"> <title lang="en">XQuery Kick Start</title> <author>James McGovern</author> <author>Rich Author</author> <author>Kurt Cagle</author> <author>James Linn</author> <author>Vaidyanathan Nagarajan</author> <year>2003</year> <price>49.99</price> </book> <book category="web"> <title lang="en">Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price> </book> </bookstore>',
    }
    zip_assert(builder, modified_zip_content)
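The transformed XPath ends up looking like /*[local-name()="bookstore"]/*[local-name()="book"][price>40]/*[local-name()="author"][2]
Note how price inside the predicate is not wrapped in a local-name() test, so it only matches a price element in no namespace and the predicate never selects the namespaced book.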
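To illustrate how an XPath of that shape can arise, here is a minimal sketch of a per-step rewrite that wraps each element name in a local-name() test and re-attaches the predicates untouched. The function name `to_local_name_xpath` and the regular expression are hypothetical and are not taken from CumulusCI's source; the sketch also assumes predicates contain no "/" characters.

```python
import re

# Hypothetical helper: splits one location step into its element name and
# any trailing predicates, e.g. "book[price>40]" -> ("book", "[price>40]").
STEP_RE = re.compile(r"^([^\[\]]+)((?:\[[^\]]*\])*)$")


def to_local_name_xpath(xpath):
    """Rewrite each step's element name to a local-name() test.

    Predicates are copied through verbatim, which is exactly why tag names
    inside them (such as price) stay namespace-sensitive.
    """
    rewritten = []
    for step in xpath.strip("/").split("/"):
        name, predicates = STEP_RE.match(step).groups()
        rewritten.append(f'*[local-name()="{name}"]{predicates}')
    return "/" + "/".join(rewritten)


print(to_local_name_xpath("/bookstore/book[price>40]/author[2]"))
```

Any rewrite that works this way would also have to parse and rewrite the expressions inside each predicate to handle namespaced documents correctly.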
I think a simpler and more robust approach is possible: instead of parsing the XPath to add local-name() references, storing the predicates, and re-adding them, simply remove the xmlns declaration before evaluating the XPath and add it back at the end.
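The strip-and-restore idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed patch: it uses the standard library's xml.etree.ElementTree rather than whatever parser CumulusCI uses, the helper name `replace_text_ignoring_default_ns` is hypothetical, and it only handles a single double-quoted default xmlns declaration. ElementTree supports only a subset of XPath (no comparisons such as price>40), so the example uses a text-equality predicate instead.

```python
import re
import xml.etree.ElementTree as ET

# Matches only a default namespace declaration (xmlns="..."), not prefixed
# ones such as xmlns:foo="...".
DEFAULT_XMLNS_RE = re.compile(r'\sxmlns="([^"]*)"')


def replace_text_ignoring_default_ns(xml_text, path, new_text):
    """Hypothetical helper: strip the default xmlns, edit, then restore it."""
    match = DEFAULT_XMLNS_RE.search(xml_text)
    if match:
        # Remove the first default namespace declaration before parsing.
        xml_text = DEFAULT_XMLNS_RE.sub("", xml_text, count=1)
    root = ET.fromstring(xml_text)
    for element in root.findall(path):
        element.text = new_text
    if match:
        # Re-add the declaration as a plain attribute on the root element.
        root.set("xmlns", match.group(1))
    return ET.tostring(root, encoding="unicode")


SAMPLE = (
    '<bookstore xmlns="foobar">'
    "<book><title>XQuery Kick Start</title>"
    "<author>James McGovern</author><author>Per Bothner</author></book>"
    "</bookstore>"
)
PATH = "book[title='XQuery Kick Start']/author[2]"

# With the namespace in place, the unqualified path matches nothing.
print(ET.fromstring(SAMPLE).findall(PATH))

# After stripping the declaration, the same path resolves and is replaced.
result = replace_text_ignoring_default_ns(SAMPLE, PATH, "Rich Author")
print(result)
```

With this approach the user's XPath is evaluated as written, predicates included, so no XPath rewriting is needed. A real implementation would need to consider single-quoted declarations and namespaces declared on nested elements.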
Reproduction steps
- Create a deploy task that uses the find_replace transform:
  options:
      transforms:
          - transform: find_replace
            options:
                patterns:
                    - xpath: /DuplicateRule[masterLabel="SomeRule"]/sortOrder
                      replace: 2
- Ensure the XML has an xmlns declaration
- The XPath does not resolve correctly
Your CumulusCI and Python versions
CumulusCI version: 3.85.0
Python version: 3.11.7
Operating System
Mac OSX 14.4
Windows environment
No response
CumulusCI installation method
None
Error Gist
No response
Additional information
No response