Official GPL-2.0 license text not recognized
sdheh opened this issue
For the license text https://www.gnu.org/licenses/old-licenses/gpl-2.0.txt I get the following:
System.out.println(Arrays.toString(LicenseCompareHelper.matchingStandardLicenseIds(licenseText)));
System.out.println(LicenseCompareHelper.matchingStandardLicenseIdsWithinText(licenseText));
outputs
[]
[GPL-2.0, GPL-2.0-or-later, GPL-2.0-only]
The two outputs should be the same since the GPL-2.0 license spans the whole file.
Tested with version 1.1.11
This problem is similar to #217
I found a problem that could explain this case: I think the tokenization does not work properly.
Example:
String license1 = "<one";
String template1 = "<<beginOptional>><<<endOptional>>one";
String license2 = "< one";
System.out.println("template1, license1: " + LicenseCompareHelper.isTextMatchingTemplate(template1, license1).getDifferenceMessage());
System.out.println("template1, license2: " + LicenseCompareHelper.isTextMatchingTemplate(template1, license2).getDifferenceMessage());
Returns
template1, license1: Normal text of license does not match at end of text when comparing to template text "one
". Last optional text was not found due to the optional difference:
Normal text of license does not match at end of text when comparing to template text "<"
template1, license2: No difference found
When I debug, I see that for the first case in org.spdx.utility.compare.CompareTemplateOutputHandler.compareText the matchTokens parameter is ["<one"]. I think it should instead be ["<", "one"] like in the second case.
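To illustrate the expected behaviour, here is a minimal standalone sketch of a tokenizer that treats each angle bracket as its own token. It is not the library's tokenizer; the class name, the method name, and the regular expression are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AngleBracketTokenizerSketch {
    // Hypothetical token pattern (an assumption, not taken from the library):
    // a single '<' or '>' is a token by itself; any other token is a run of
    // characters that are neither whitespace nor angle brackets.
    private static final Pattern TOKEN = Pattern.compile("[<>]|[^\\s<>]+");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Both inputs yield the same tokens, with or without the space.
        System.out.println(tokenize("<one"));  // [<, one]
        System.out.println(tokenize("< one")); // [<, one]
    }
}
```

With a split like this, license1 and license2 from the example above would produce identical token lists, so the optional "<" in template1 could be matched in both cases.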
Also, if I remove all < and > from the https://www.gnu.org/licenses/old-licenses/gpl-2.0.txt text (gpl-2.0-removed-angle-brackets.txt) or if I add a space before and after every < and > (gpl-2.0-spaces-between-angle-brackets-and-text.txt), I get the following result for the code in the issue description:
[GPL-2.0, GPL-2.0-only]
[GPL-2.0, GPL-2.0-or-later, GPL-2.0-only]
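Until the tokenization is fixed, the second manual edit could be automated as a stopgap. This is only a sketch of that preprocessing step: the class and method names are hypothetical, and it assumes the padded text is then passed to the matching call from the issue description.

```java
public class AngleBracketPadding {
    // Hypothetical helper (not part of the library): surrounds every '<' and
    // '>' with spaces, mirroring the manual edit described above.
    static String padAngleBrackets(String text) {
        // "$0" in the replacement refers to the whole matched character.
        return text.replaceAll("[<>]", " $0 ");
    }

    public static void main(String[] args) {
        System.out.println(padAngleBrackets("<one")); // prints " < one"
        // The padded license text would then be handed to the matcher, e.g.
        // LicenseCompareHelper.matchingStandardLicenseIds(padAngleBrackets(licenseText))
    }
}
```

This changes only the whitespace around the brackets, which is the same difference as between license1 and license2 in the earlier example.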
Thanks @sdheh for the analysis! I agree, the tokenization is the issue. I'm still working on the 3.0 update, so I won't have much time over the next week or so to look for a fix, but if you want to create a pull request I can review / merge.