mathieucarbou / license-maven-plugin

Manage license headers in your source files

Home Page:https://oss.carbou.me/license-maven-plugin/

design discussion: goal for per-dependency license enforcement

rremer opened this issue · comments

I have a need at my company which I think is somewhat common, and think this feature would fit well into the maven-license-plugin.

Feature:

  • take a java resource uri with a list of allowed package/version/regexes and licenses
  • optionally warn/list/fail a build on license matching against those packages

This should not conflict with the existing check goal, which ensures that all your source files have your license. This is about ensuring your dependencies have licenses which don't put your project at legal risk.

CNCF already does something similar for Golang, and I think SPDX may be an acceptable format, but what we serialize the data into doesn't matter so much here as whether you think this would be a good fit as a new goal for this plugin. What I foresee is shipping a default of 'allow all permissive open source licenses and warn if your dependencies/transitives have proprietary ones', and allowing organizations like mine to provide a resource jar with our own special cases.

I'm happy to take on this work, just wanted to discuss design with you all before opening a big PR.

Hi @rremer !
This need already popped up in the past. It was not described as well as you are describing it right now, but it was a similar need.
Of course this would be a nice addition to the plugin, and if you are willing to take on the work I am open to that.
I will be able to review PRs and also release as needed if your testing requires released artifacts from Central.

Regarding the new goal... I don't know... I like the idea of having just one check goal. But to keep backward compatibility, we could add more options to the plugin to enable additional checks. I foresee that there could be more than 1 additional check that we could eventually enable and each new "check" could perhaps need some new parameters also.

I'll let you think about it :-)

Thanks for the help! Really appreciated!

Been musing, familiarizing myself with this project code, and getting a greenlight from my current employer to take a stab at this on-the-clock; should be able to start coding next week, but will let you know either way @mathieucarbou (and thanks for your support/encouragement!).

I agree just having 'check' would be handy, and minimize how much plugin config folks would have to write to make use of new features like this dependency checking. By contrast, looking at how org.codehaus.mojo:license-maven-plugin was implemented, there's that initial hurdle of having to grok all the goals and configure each of them to do seemingly standard/simple checks.

In order to keep the code maintainable, but also have a simplified/default check goal, I'm thinking I'd like to have the best of both worlds: move the existing LicenseCheckMojo into its own SourceLicenseCheckMojo, this would allow me to write a new DependencyLicenseCheckMojo and then re-implement the original LicenseCheckMojo as essentially a meta executor. In this way, we could keep the simplicity of check without having a bloated/hard-to-test class. If folks really wanted to run the explicit goals of each, they could do that for fine-grained cases.
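To make the "meta executor" idea concrete, here is a minimal sketch in plain Java. It deliberately avoids the Maven API so it stays self-contained: `LicenseCheck` and `CompositeLicenseCheck` are hypothetical names standing in for the proposed SourceLicenseCheckMojo / DependencyLicenseCheckMojo split, not classes that exist in the plugin.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Maven Mojo; a real goal would extend
// org.apache.maven.plugin.AbstractMojo instead of implementing this.
interface LicenseCheck {
    /** Returns human-readable violations; an empty list means the check passed. */
    List<String> run();
}

// Hypothetical composite backing the single user-facing "check" goal.
// It owns no checking logic itself; it only delegates and aggregates.
final class CompositeLicenseCheck implements LicenseCheck {
    private final List<LicenseCheck> delegates;

    CompositeLicenseCheck(List<LicenseCheck> delegates) {
        this.delegates = List.copyOf(delegates);
    }

    @Override
    public List<String> run() {
        List<String> violations = new ArrayList<>();
        for (LicenseCheck delegate : delegates) {
            // Run every delegate so one report lists all problems at once.
            violations.addAll(delegate.run());
        }
        return violations;
    }
}
```

With this shape, the source-header check and the dependency check can each be tested in isolation, and the default goal only decides whether a non-empty result should warn or fail the build.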

From a configuration standpoint, this is what I had in mind from a user's perspective:

<configuration>
  <dependencies>
    <licenses>
      <license>MIT</license>
      <license>ASL2.0</license>
    </licenses>
    <exceptions>
      <exception>
        <groupId>some.group.id</groupId>
        <artifactId>some-jar-name</artifactId>
        <!-- version is optional, and should conform to https://maven.apache.org/pom.html#Dependency_Version_Requirement_Specification -->
        <version>some-version-approved-by-3pp-explicitly</version>
      </exception>
    </exceptions>
  </dependencies>
</configuration>
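
As a rough illustration of how a configuration like the one above might be evaluated, here is a sketch in plain Java. `DependencyException` and `DependencyLicensePolicy` are hypothetical names, and the version comparison is a plain string match, which is a simplification: the proposal says the version should follow Maven's version requirement specification (ranges), which this sketch does not implement.

```java
import java.util.List;
import java.util.Set;

// Hypothetical model of one <exception> entry; a null version means "any version".
final class DependencyException {
    private final String groupId;
    private final String artifactId;
    private final String version;

    DependencyException(String groupId, String artifactId, String version) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.version = version;
    }

    boolean covers(String g, String a, String v) {
        // Simplification: exact string match instead of Maven version ranges.
        return groupId.equals(g) && artifactId.equals(a)
                && (version == null || version.equals(v));
    }
}

// Hypothetical policy built from the <licenses> and <exceptions> sections.
final class DependencyLicensePolicy {
    private final Set<String> allowedLicenses;
    private final List<DependencyException> exceptions;

    DependencyLicensePolicy(Set<String> allowedLicenses, List<DependencyException> exceptions) {
        this.allowedLicenses = Set.copyOf(allowedLicenses);
        this.exceptions = List.copyOf(exceptions);
    }

    /** A dependency passes if its license is allowed or it is explicitly excepted. */
    boolean isAllowed(String groupId, String artifactId, String version, String license) {
        if (allowedLicenses.contains(license)) {
            return true;
        }
        return exceptions.stream().anyMatch(e -> e.covers(groupId, artifactId, version));
    }
}
```

The license names are compared as opaque strings here; mapping the free-form names found in dependency POMs to identifiers such as MIT is a separate problem this sketch leaves out.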

honorable mentions

There are some existing plugins that have similar logic to the feature set I'm proposing, just wanted to list them here for posterity:

  • org.apache.maven:maven-project-info-reports-plugin aggregates the dependencies and lists their maven-declared License (this project actually uses it: https://code.mathieu.photography/license-maven-plugin/reports/3.0/dependencies.html ).
  • org.complykit:license-check-maven-plugin attempts to do license checks on dependencies, and has a very similar goal to what I need whereby you list allowed licenses and exceptions, but leaves something to be desired in its implementation (string parsing xml of the pom instead of using the mojo api...)
  • org.codehaus.mojo:license-maven-plugin has similar functionality to this com.mycila:license-maven-plugin, but a lot more goals, one for each explicit piece of functionality

Hello @rremer !

I have read your answer 2 days ago but forgot to answer. Sorry!

I agree with you that the hierarchy of the mojos right now does not fit what you need to do, because the abstract class does a lot of pre-processing related to licenses and scanning and then passes execution to the sub-plugins, either check or format...

So yes indeed, to be able to reuse some of the parameters in the abstract class this would be a good idea to rename it and extract code in a new super class (or in a more specialized abstract class as you prefer).

If I understand your config correctly, you would define in the <dependencies> section the list of compatible licenses allowed for the dependencies, and then you could add some exceptions in case a dependency has another license that is acceptable?

Questions:

  • I guess you would also need to support some include / exclude sections with the commonly used pattern for dependencies (i.e. org.slf4j:*, org.slf4j:slf4j-api, etc)?
  • What about dependency scopes? How would you determine which dependencies to include by default? Would it be limited to runtime ones, or compile ones? And what about this weird system scope?

Also, just some notes about the org.codehaus.mojo:license-maven-plugin: this plugin has existed for years, but since 2019 it seems to have been taken over by @ppalaga, who has also contributed a lot to this maven license plugin.
A lot of new features have been added to the Codehaus plugin as a result. This Mycila license plugin and the Codehaus plugin currently do not have the same set of features, but I think it is possible that in the long run the Codehaus plugin gets more traction, since it is under the Codehaus banner and some of its developers can obviously support it as part of their job, which is a big plus.

re: Abstract refactor, my thinking exactly
re: scope - interesting point. Yeah that should definitely be a configuration param, although provided would obviously not be an option, and system could be an option and just up to the dev to ensure that their CI or local had the same jars that they'd be running with. Obviously compile and test would be the only interesting and generally-useful options here. I'm thinking that compile AND test would be the defaults
re: splat/regex, I was actually thinking not to support complex regex here, as it erodes your confidence that you're letting dangerous licenses through. For complex usecases, it could be a handy syntactic sugar, but I think being explicit is safer (how would seeing an exception for com.* make you feel? Or how about com.oracle.*?).
re: codehaus - I was actually surprised to see new commits there, and it seems you had the opposite feeling about 'the Codehaus banner' than I did... I thought codehaus was sunsetted? At least, all the major plugins/libs I used had been ferried off to other orgs and there were several announcements about shutting down the website, etc.

And yes, your restated grok of my configuration section sounds right to me. I'm realizing that dependencies is a bad token to use here though, as it looks too much like what you would have in a plugin declaration or build section.

I am sceptical about the test scope... Where I work, we really check the licenses of what we put inside our distributed zip file (packaged application). What is used for testing is less strict, and we do not check licenses there, except for example in the case where a license would prevent commercial use. What I mean is that the license check for the distributed application is usually more restrictive than the license check required for the test scope (if one is even required).

Maven already has a widely known and accepted pattern to declare dependencies and patterns. I would prefer that this pattern gets reused because it is what users are used to. Note: I am talking about patterns to filter groupIds and artifactIds here, not patterns relative to imports. I think your example was about being able to filter dependencies.

For example with the maven assembly plugin you can have such syntax for the patterns. This is a known and supported pattern.

    <dependencySet>
      <includes>
        <include>org.terracotta.statistics:statistics</include>
        <include>com.fasterxml.jackson.datatype:*</include>
        <include>org.terracotta:*</include>
      </includes>
    </dependencySet>
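
For a sense of what matching such groupId:artifactId patterns involves, here is a minimal sketch in plain Java. `ArtifactPatternFilter` is a hypothetical class written for this illustration; it only handles the two-segment form with `*` wildcards shown above, and a real implementation would more likely reuse whatever Maven library the assembly plugin relies on rather than hand-rolling this.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: matches "groupId:artifactId" against include patterns
// where '*' stands for any run of characters.
final class ArtifactPatternFilter {
    private final List<Pattern> patterns;

    ArtifactPatternFilter(List<String> includes) {
        this.patterns = includes.stream()
                .map(ArtifactPatternFilter::toRegex)
                .collect(Collectors.toList());
    }

    // Quote every literal piece so dots in group ids are not regex wildcards,
    // then join the pieces with ".*" where the pattern had a '*'.
    private static Pattern toRegex(String glob) {
        String[] parts = glob.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.compile(regex.toString());
    }

    boolean matches(String groupId, String artifactId) {
        String key = groupId + ":" + artifactId;
        return patterns.stream().anyMatch(p -> p.matcher(key).matches());
    }
}
```

Note that in this sketch org.terracotta:* does not match artifacts in org.terracotta.statistics, because the group id has to match in full before the colon. That keeps a wildcard on the artifact from silently widening to sub-groups.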

For the test scope, we're just discussing the default here which I'm on the fence about. I don't think we disagree about it needing to be a configuration parameter in the mojo, and that it would need to be a list of scopes it took. I realized also that I think most folks care about runtime more than anything, it implies what your app/lib is shipping which is when you'd get into legal trouble (e.g. what is published).
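
A small sketch of the "list of scopes" parameter, in plain Java without the Maven API. `ScopeFilter` and its default are hypothetical, since the thread has not settled on one. The default below uses compile plus runtime on the assumption that "runtime" here means what ends up on the runtime classpath, and compile-scoped dependencies are on that classpath too.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical filter over declared dependency scopes.
final class ScopeFilter {
    private final Set<String> scopes;

    ScopeFilter(Set<String> scopes) {
        this.scopes = Set.copyOf(scopes);
    }

    // Assumed default: what is typically shipped with the application.
    static ScopeFilter shippedByDefault() {
        return new ScopeFilter(Set.of("compile", "runtime"));
    }

    /** Takes dependency id -> declared scope; returns the ids to check, sorted. */
    List<String> select(Map<String, String> declaredScopes) {
        return declaredScopes.entrySet().stream()
                .filter(e -> scopes.contains(e.getValue()))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

A stricter or looser policy for test dependencies could then be expressed as a second filter with its own scope list, rather than as a change to the default.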

For the exceptions configuration, I considered the single-line-per-exception approach. It is indeed common, I just personally think it's less legible while being more compact. Either syntax could support splat *. I'll support that either way, and it would just be up to the user whether to condone use of the syntax or not (similar to like you say package import splats, some folks do it and that's their choice). Internally, the list of dependencies to check will be its own function, so I don't mind if we change our minds on how to parse various syntaxes as we go. I'm uncertain what public apis maven exposes for parsing syntax like either of these, so maybe that could help inform your decision as to what I should implement.

> For the test scope, we're just discussing the default here which I'm on the fence about. I don't think we disagree about it needing to be a configuration parameter in the mojo, and that it would need to be a list of scopes it took. I realized also that I think most folks care about runtime more than anything, it implies what your app/lib is shipping which is when you'd get into legal trouble (e.g. what is published).

Exactly 👍

> For the exceptions configuration, I considered the single-line-per-exception approach. It is indeed common, I just personally think it's less legible while being more compact. Either syntax could support splat *. I'll support that either way, and it would just be up to the user whether to condone use of the syntax or not (similar to like you say package import splats, some folks do it and that's their choice). Internally, the list of dependencies to check will be its own function, so I don't mind if we change our minds on how to parse various syntaxes as we go. I'm uncertain what public apis maven exposes for parsing syntax like either of these, so maybe that could help inform your decision as to what I should implement.

I don't know about the API either, because I have only used these patterns in XML. But if several plugins support such a pattern, it should already be in the API, I guess. It is probably similar to the include/exclude rules in this project, which use the Maven API and Ant-like patterns. All of this is part of the Maven API. The goal is to not disrupt the user experience with a new concept people will have to learn when they already know something that exists and is already used "by convention" in several plugins :-)

I would perhaps have a look at the assembly plugin for filtering. They have a quite complex use case so all our answers should be there.