oss-review-toolkit / ort

A suite of tools to automate software compliance checks.

Home Page: https://oss-review-toolkit.org

best solution to scan a project separately and combine results to a final report

ChenZhaobin opened this issue

I'm exploring the usage of ORT (OSS Review Toolkit) for code scanning in my projects, and I have a specific scenario that I'd like some advice on.

Let's say I want to perform code scans on different parts of a project separately and then merge the final results. What would be the best approach for achieving this with ORT? Specifically, at which stage should the results be merged for optimal effectiveness, and how can this be accomplished, i.e. how should duplicates be handled?

Additionally, I often encounter GitHub timeout issues during the scanning process. What strategies or best practices would you recommend for handling these timeouts? Re-scanning the entire project can be time-consuming, and I'm concerned about potential dependencies that might arise if I use the curation mechanism.

Any insights or advice would be greatly appreciated. Thank you!

FYI, I would also like to understand how the version control system handling works. In the logs I sometimes see "Could not fetch only revision", sometimes "Could not find any revision candidates for package", and sometimes "Could not resolve revision for package". Also, how is it mapped among binary artifacts, source artifacts and source repository?

Let's say I want to perform code scans on different parts of a project separately and then merge the final results. What would be the best approach for achieving this with ORT?

We've had some (oral) discussions about this quite some time ago in our ORT Community meetings in the context of this PR, but came to the (preliminary) conclusion that it's too hard to get right for all use-cases, and thus the idea to implement something like this stalled.

@ChenZhaobin please also stick to one topic per issue, and consider starting a discussion instead if you're not actually reporting a bug or requesting a feature.

understood, thanks for the quick help.

Re-scanning the entire project can be time-consuming

One idea is to scan packages one-by-one in advance to populate the scan storage (you should set up a database for that beforehand), so the "real" scan goes through smoothly.
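To illustrate why warming up the storage helps, here is a minimal, self-contained sketch. It is not ORT's actual API: `ScanResult`, `ScanStorage`, and `CachingScanner` are hypothetical names, and an in-memory map stands in for the database-backed scan storage. The point is only that a result stored once is reused later, and that a failed (e.g. timed-out) scan affects a single package rather than the whole run.

```kotlin
// Hypothetical sketch, NOT ORT's real classes: models a scan storage as a simple cache.
data class ScanResult(val packageId: String, val licenses: Set<String>)

// Stand-in for a persistent scan storage (a database in a real setup).
class ScanStorage {
    private val results = mutableMapOf<String, ScanResult>()

    fun read(packageId: String): ScanResult? = results[packageId]

    fun write(result: ScanResult) {
        results[result.packageId] = result
    }
}

// Wraps an expensive scan function and consults the storage first.
class CachingScanner(
    private val storage: ScanStorage,
    private val scan: (String) -> ScanResult
) {
    var expensiveScans = 0
        private set

    fun scanPackage(packageId: String): ScanResult =
        storage.read(packageId) ?: scan(packageId).also {
            // Only reached when the scan succeeded; a thrown exception stores nothing.
            expensiveScans++
            storage.write(it)
        }
}

fun main() {
    val storage = ScanStorage()
    val scanner = CachingScanner(storage) { id -> ScanResult(id, setOf("Apache-2.0")) }
    val packages = listOf("Maven:org.example:a:1.0", "Maven:org.example:b:2.0")

    // Warm-up: scan packages one-by-one in advance.
    packages.forEach { scanner.scanPackage(it) }

    // "Real" scan: every package is served from the storage.
    packages.forEach { scanner.scanPackage(it) }

    println("Expensive scans: ${scanner.expensiveScans}") // prints "Expensive scans: 2"
}
```

If a warm-up scan fails for one package, only that package needs to be retried, since the results for the others are already stored.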

how is it mapped among binary artifacts, source artifacts and source repository.

That depends a bit on your configuration of source code origins:

/**
 * Configuration of the considered source code origins and their priority order. This must not be empty and not
 * contain any duplicates.
 */
val sourceCodeOrigins: List<SourceCodeOrigin> = listOf(SourceCodeOrigin.VCS, SourceCodeOrigin.ARTIFACT)

In general, ORT does not download binary artifacts, but only source artifacts or source code from VCS. Which of the latter two is used (and in which order) is determined by the above setting, and any error messages refer to getting code from the configured origin(s).
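As a rough illustration of what a priority-ordered origin list means, the following sketch tries each origin in order and uses the first one that has a location. This is an assumption-laden simplification, not ORT's downloader: the enum here is a local stand-in for the real `SourceCodeOrigin`, and `PackageInfo` and `resolveSource` are hypothetical names.

```kotlin
// Local stand-in for ORT's enum; the real downloader logic is more involved.
enum class SourceCodeOrigin { VCS, ARTIFACT }

// Hypothetical, simplified package metadata. Note there is no binary artifact field,
// because binary artifacts are not a source code origin.
data class PackageInfo(val id: String, val vcsUrl: String?, val sourceArtifactUrl: String?)

// Returns the first configured origin that has a usable location, or null if none does.
fun resolveSource(pkg: PackageInfo, origins: List<SourceCodeOrigin>): Pair<SourceCodeOrigin, String>? {
    require(origins.isNotEmpty() && origins.toSet().size == origins.size) {
        "Origins must not be empty and must not contain duplicates."
    }

    for (origin in origins) {
        val location = when (origin) {
            SourceCodeOrigin.VCS -> pkg.vcsUrl
            SourceCodeOrigin.ARTIFACT -> pkg.sourceArtifactUrl
        }

        if (!location.isNullOrBlank()) return origin to location
    }

    return null
}

fun main() {
    val origins = listOf(SourceCodeOrigin.VCS, SourceCodeOrigin.ARTIFACT)

    // No VCS information available, so the source artifact is used as the fallback.
    val pkg = PackageInfo("example", vcsUrl = null, sourceArtifactUrl = "https://example.org/example-sources.jar")
    println(resolveSource(pkg, origins))
}
```

In this simplified model, a package with neither VCS information nor a source artifact resolves to `null`, which roughly corresponds to the situation in which a download error would be reported.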