Analyzer does not allow to have multiple independent projects with the same type / name / version
sschuberth opened this issue · comments
See
ort/analyzer/src/main/kotlin/AnalyzerResultBuilder.kt
Lines 57 to 59 in 6fca267
The same occurs when analyzing e.g. https://github.com/aws/glide-for-redis.git, as it contains multiple (independent) Cargo.toml files with the same content, like
$ head -5 go/Cargo.toml
[package]
name = "glide-rs"
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
$ head -5 java/Cargo.toml
[package]
name = "glide-rs"
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
@oss-review-toolkit/core-devs, how about if we simply add parent directory names as suffixes to the project name until the name is unique?
A couple of questions:
Do you propose this as a general solution or specific to Cargo?
Why add the directory names as suffixes and not prefixes? That seems unintuitive. I would rather prefix them and always take the full path, as it could otherwise be confusing. So for the example above, use the names java/glide-rs and csharp/lib/glide-rs.
Should this always happen or only if there are conflicting names?
Do you propose this as a general solution or specific to Cargo?
As a general solution, see also the PIP case mentioned in the quoted TODO.
Why add the directory names as suffixes and not prefixes?
Because for the Cargo example, I find glide-rs-go / glide-rs-java to read nicer than go-glide-rs / java-glide-rs. (I probably should have said that I envisioned dashes instead of slashes as separators.)
Should this always happen or only if there are conflicting names?
Probably only if there are conflicting names, as otherwise names could get unnecessarily complicated.
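To make the suffix idea concrete, here is a minimal sketch in Python (not ORT code). It assumes that only conflicting names get renamed, that the innermost parent directory is appended first, and that dashes are the separator. The function name `disambiguate` and the input shape (definition file path mapped to declared project name) are hypothetical and only serve the illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def disambiguate(definition_files: dict[str, str]) -> dict[str, str]:
    """Map each definition file path to a project name that is unique if possible.

    `definition_files` maps a definition file path (relative to the repository
    root) to the project name declared in that file. This is a hypothetical
    input shape, not ORT's actual data model.
    """
    # Group the definition files by the project name they declare.
    by_name = defaultdict(list)
    for path, name in definition_files.items():
        by_name[name].append(path)

    result = {}
    for name, paths in by_name.items():
        if len(paths) == 1:
            # No conflict, so keep the declared name untouched.
            result[paths[0]] = name
            continue

        # Parent directory names of each file, innermost first.
        parents = {p: list(reversed(PurePosixPath(p).parent.parts)) for p in paths}
        max_depth = max(len(dirs) for dirs in parents.values())

        # Append one more parent directory per round until the names differ
        # or no further parent directories are left.
        depth = 1
        while True:
            candidates = {p: "-".join([name, *parents[p][:depth]]) for p in paths}
            if len(set(candidates.values())) == len(paths) or depth >= max_depth:
                break
            depth += 1

        result.update(candidates)

    return result
```

For `go/Cargo.toml` and `java/Cargo.toml` both declaring `glide-rs`, this yields `glide-rs-go` and `glide-rs-java`. The sketch does not handle a generated name colliding with a project that already carries that name, nor two conflicting definition files in the same directory; both would need extra rules.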
I kind of like this approach, however I'm not sure about the details. For example, this approach could be difficult for package managers that support project dependencies (e.g. Maven), because those references might break if we rename projects.
Could you maybe collect some more examples to show how the naming algorithm would work for repositories that are affected by this issue? That would be good input to further refine the idea.
Some thoughts here: are we aiming for a common global identification?
Or, if this is too much, maybe instead of dashes we could go for something like the Gradle representation:
glide-for-redis.go.glide-rs:0.1.0
Is this a little more logical, considering that it gives us better tracking of the exact folder?
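A rough Python sketch of how such a dotted identifier could be assembled, assuming the parts are the repository name, the directories leading to the definition file, the declared name, and the version. The helper `qualified_id` and its parameters are hypothetical, not an existing ORT API.

```python
from pathlib import PurePosixPath


def qualified_id(repository: str, definition_file: str, name: str, version: str) -> str:
    """Build a Gradle-like coordinate from the repository and folder structure.

    All parameters are illustrative assumptions: `definition_file` is the path
    of the definition file relative to the repository root.
    """
    # Directories between the repository root and the definition file.
    dirs = PurePosixPath(definition_file).parent.parts
    return ".".join([repository, *dirs, name]) + ":" + version
```

Called with `"glide-for-redis"`, `"go/Cargo.toml"`, `"glide-rs"` and `"0.1.0"`, this produces the `glide-for-redis.go.glide-rs:0.1.0` form shown above. Unlike the suffix approach, every project gets the longer name, whether or not it conflicts with another one.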
I kind of like this approach, however I'm not sure about the details. For example, this approach could be difficult for package managers that support project dependencies (e.g. Maven), because those references might break if we rename projects.
IIRC, in GoMod it could be analogous.