spdx / ntia-conformance-checker

Check SPDX SBOM for NTIA minimum elements

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

licenseId is not a case-sensitive as per spdx ,but this tool follows case-sensitve?

Bollineni-nitish-kumar opened this issue · comments

Firstly,thanks for the tool,it greatly helps.

As per spdx documentation licenseId is not a case-sensitive. link

Actually ,in the "hasExtractedLicensingInfos" element we will add all the extracted LicenseId's .
I have two license id's i.e (LicenseRef-New-BSD-license) and the other is (LicenseRef-New-BSD-License).Here with case-insensitive both are same licenseId's only.so added only one id (i.e LicenseRef-New-BSD-License) to the "hasExtractedLicensingInfos".

Then ,when i check with ntia tool,it throws error saying "Unrecognized license reference: LicenseRef-New-BSD-license. license_expression must only use IDs from the license list or extracted licensing info, but is: LicenseRef-New-BSD-license".

I validated sbom with spdx tool with having only one id,it shows as a valid document.I am adding screeshots below.

Screenshot from 2023-11-30 18-48-29

Screenshot from 2023-11-30 18-46-41

Thank you for the bug report, @Bollineni-nitish-kumar! We'll investigate.

In the meantime, PRs welcome, from you or anyone.

cc @goneall

@Bollineni-nitish-kumar: I realize this may not be practical, but are you willing to share the SBOM you were analyzing to help me reproduce and potentially fix this issue?

Changing the case of the license does seem to matter. See PR #166. I don't think this has to do with the code in ntia-conformance-checker though. I believe the spdx-tools library underpinning ntia-conformance-checker is responsible for this logic.

@zvr: Any thoughts?

Also, I noticed this statement in the SPDX link related to licenses:

However, please be aware that it is often important to match with the case of the canonical identifier on the SPDX License List. This is because the canonical identifier's case is used in the URL of the license's or exception's entry on the List, and because the canonical identifier is translated to a URI in RDF documents.

So perhaps the case should matter here. I admit that I am confused.

@jspeed-meyers :
Heyy,I may not give you the exact sbom as it belongs to organisation.I have created a sample to create this behaviour.

In the attached sbom sample,I have added two licenseId's i.e "LicenseRef-The-BSD-2-Clause-license" and "LicenseRef-The-BSD-2-Clause-License" in "hasExtractedLicensingInfos".In both licenseIds only difference is "l" is small case in first license and "L" is uppercase in second license in the last word "License".

With this sample ntia-checker doesn't throws any error.But with spdx tool throws exception saying license already exists.If you just remove one of the license from "hasExtractedLicensingInfos" ,then ntia-chcker tool will throws exception but spdx tool validates the document.

Sample spdx sbom :
Samplesbomspdxntia.json

Screenshot from 2023-12-01 11-20-38
Screenshot from 2023-12-01 11-21-16

@goneall @jspeed-meyers I'd love to take up this issue but i might need some help.

@puneeth072003: I'd be glad for some help. Thank you! 🙏

And asking questions is fine. I'll learn and debug along with you. Again, thank you!

@jspeed-meyers My thoughts, Changing this

license_ref_ids: List[str] = [license_ref.license_id for license_ref in document.extracted_licensing_info]

to

license_ref_ids: List[str] = [license_ref.license_id.lower() for license_ref in document.extracted_licensing_info]

in https://github.com/spdx/tools-python/blob/main/src/spdx_tools/spdx/validation/license_expression_validator.py (line no. 43)
might fix this bug

testing this now

@puneeth072003: Nice find. That looks promising.

@meretp: You and your fellow maintainers are the experts on spdx-tools. What do you think about @puneeth072003's thoughts above?

After making couple of changes in src/spdx_tools/spdx/validation/license_expression_validator.py,
I was able to fix the errors and got the below output after validating the sample json file provided by @Bollineni-nitish-kumar

Is this SBOM NTIA minimum element conformant? False

Individual elements                            | Status
-------------------------------------------------------
All component names provided?                  | True
All component versions provided?               | True
All component identifiers provided?            | True
All component suppliers provided?              | False
SBOM author name provided?                     | True
SBOM creation timestamp provided?              | True
Dependency relationships provided?             | True

@jspeed-meyers What do you think about this output obtained?

@puneeth072003, first, thank you for putting in a PR to tools-python. I'm impressed! I think we should wait for a response there. But assuming they merge that, we can then wait until tools-python cuts a new version and then ntia-conformance-checker can bump the version of tools-python it uses and then also cut a new release.

And this looks good to me. Again, nice work!

@jspeed-meyers You're welcome🤓

I am going to close this issue since it entirely involves changes to upstream code.