Official Apache-1.1 license text is not being matched correctly by LicenseCompareHelper.matchingStandardLicenseIdsWithinText()
pmonks opened this issue
When org.spdx.utility.compare.LicenseCompareHelper.matchingStandardLicenseIdsWithinText() is run on the official Apache-1.1 license text, it fails to find any matches. I believe I've narrowed the problem down to the Clause5 alternative text tag in the template: if I remove the example header from the license text and run org.spdx.utility.compare.LicenseCompareHelper.isTextStandardLicense().getDifferenceMessage() on it, I get:
Variable text rule combined-bullet-Clause5 did not match the compare text starting at line #31 column #1 "5" while processing rule var: combined-bullet-Clause5
When I manually convert that <alt> tag into a Java regex and manually cleanse bullet 5 of the Apache-1.1 license text of comment characters and newlines, I do get a match, so I'm pretty confident the problem is in the library rather than the template. Beyond that I'm not sure what the root cause might be: whether it has to do with comment character handling, regexification of that particular <alt> tag, or something else entirely.
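For anyone wanting to repeat the manual check described above, here is a minimal sketch of the approach using only java.util.regex. Note the assumptions: the class name Clause5RegexCheck, the cleanse() helper, and the CLAUSE_5 pattern are all hypothetical and hand-written for illustration. In particular, the pattern is a simplified stand-in, not the regex the library actually derives from the template's <alt> tag, and cleanse() is not the library's own normalization.

```java
import java.util.regex.Pattern;

public class Clause5RegexCheck {

    // HYPOTHETICAL pattern, hand-written for illustration only. It is NOT the
    // regex that the library generates from the template's <alt> tag.
    static final Pattern CLAUSE_5 = Pattern.compile(
        "5\\.? products derived from this software may not be called \"[^\"]+\", "
        + "nor may \"[^\"]+\" appear in their name, "
        + "without prior written permission of .+",
        Pattern.CASE_INSENSITIVE);

    // Strip leading comment characters ('*', '/', '#', with optional indentation)
    // from every line, then collapse newlines and whitespace runs to single spaces.
    static String cleanse(String text) {
        return text
            .replaceAll("(?m)^[ \\t]*[*/#]+[ \\t]?", "")
            .replaceAll("\\s+", " ")
            .trim();
    }

    // True if the whole (already cleansed) text matches the illustrative pattern.
    static boolean matches(String cleansed) {
        return CLAUSE_5.matcher(cleansed).matches();
    }

    public static void main(String[] args) {
        // Bullet 5 as it appears inside a comment block in the license text.
        String raw =
            " * 5. Products derived from this software may not be called \"Apache\",\n"
          + " *    nor may \"Apache\" appear in their name, without prior written\n"
          + " *    permission of the Apache Software Foundation.\n";

        System.out.println("raw matches:      " + matches(raw));
        System.out.println("cleansed matches: " + matches(cleanse(raw)));
    }
}
```

If the cleansed text matches a regex built from the template but the library still reports a difference, that would point towards the library's own comment-character handling or its regex generation rather than the template text.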
This was reproduced with Spdx-Java-Library v1.11 and SPDX license list v3.23.
If it's helpful, I'm also seeing similar failures with the official Apache-1.0 license text, though I haven't troubleshot that to the same level of detail as I did with Apache-1.1.