google / addlicense

A program which ensures source code files have copyright license headers by scanning directory patterns recursively

change style for java files from comment to javadoc

Fetsivalen opened this issue

Hi,
From what I can see, the license is for some reason added to .java files in per-line comment style (//),
even though the addlicense tool already supports the more common block style, which it applies to ".c" files.

https://github.com/google/addlicense/blob/master/main.go#L216-L217
Would you accept a PR to change that style?

Example from the Spring Boot project (probably the most typical case of a Java project):
https://github.com/spring-projects/spring-boot/blob/master/spring-boot-project/spring-boot-actuator-autoconfigure/src/main/java/org/springframework/boot/actuate/autoconfigure/OnEndpointElementCondition.java#L1
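To make the two styles under discussion concrete, here is a minimal sketch in Python. It is not addlicense's actual code (the tool is written in Go); the function names `line_style` and `block_style`, the placeholder license text, and the exact delimiters are assumptions chosen for illustration.

```python
# Hypothetical helpers illustrating the two header styles discussed in this issue.
# These are NOT taken from addlicense; names and delimiters are assumptions.

def line_style(license_text: str) -> str:
    """Prefix every line with '//' (per-line comment style)."""
    lines = (("// " + line).rstrip() for line in license_text.splitlines())
    return "\n".join(lines) + "\n"


def block_style(license_text: str) -> str:
    """Wrap the text in a /* ... */ block, with ' *' leading each line."""
    lines = ((" * " + line).rstrip() for line in license_text.splitlines())
    return "/*\n" + "\n".join(lines) + "\n */\n"


if __name__ == "__main__":
    # Placeholder license text, made up for the demo.
    text = "Copyright 2024 Example Authors\n\nLicensed under the Apache License, Version 2.0"
    print(line_style(text))
    print(block_style(text))
```

The proposal in this issue amounts to switching .java files from the first form to the second.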

Adding some notes from my quick research. Google's Java style guide does not prescribe which comment style to use for copyright and license information, only that it comes at the beginning of the file: https://google.github.io/styleguide/javaguide.html#s3-source-file-structure

I randomly spot-checked a half dozen Google Java projects and found:

I didn't bother checking to see if projects were internally consistent with which style they used. I just checked the first java file I could find.

I also dug into the public GitHub repo dataset in BigQuery, and ran some analysis on the sample_contents table there:

Java files that contain // copyright (accounting for spaces):

SELECT count(*) FROM `bigquery-public-data.github_repos.sample_contents` 
  WHERE 
    ENDS_WITH(sample_path, ".java")
    AND REGEXP_CONTAINS(content, r'(?im)^\s*//\s*copyright')

Count: 12091

Java files that contain /* copyright, /** copyright or * copyright (accounting for spaces):

SELECT count(*) FROM `bigquery-public-data.github_repos.sample_contents` 
  WHERE 
    ENDS_WITH(sample_path, ".java")
    AND REGEXP_CONTAINS(content, r'(?im)^\s*/?\*+\s*copyright')

Count: 119028

So block-style comments /* */ seem to be used for copyright statements in Java files more often than per-line comments // by a factor of about 10 to 1.
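For anyone who wants to try the two patterns locally, here is a rough Python sketch. It reuses the regular expressions from the queries above, on the assumption that Python's `re` module treats these particular patterns the same way BigQuery's `REGEXP_CONTAINS` does. The `classify` helper and its labels are made up for illustration; the two counts are the ones reported above.

```python
import re

# Same patterns as in the BigQuery queries above.
LINE_STYLE = re.compile(r"(?im)^\s*//\s*copyright")
BLOCK_STYLE = re.compile(r"(?im)^\s*/?\*+\s*copyright")


def classify(content: str) -> set:
    """Return which copyright comment styles appear in a file's content.

    Hypothetical helper; the labels "line" and "block" are assumptions.
    """
    styles = set()
    if LINE_STYLE.search(content):
        styles.add("line")
    if BLOCK_STYLE.search(content):
        styles.add("block")
    return styles


# Counts reported by the two queries above.
line_count = 12091
block_count = 119028
ratio = block_count / line_count  # roughly 9.8, i.e. about 10 to 1

if __name__ == "__main__":
    print(classify("// Copyright 2020 Example\nclass A {}"))
    print(classify("/*\n * Copyright 2020 Example\n */\nclass A {}"))
    print(f"block : line = {ratio:.1f} : 1")
```

Note that a single file can match both patterns, so the two counts are not necessarily disjoint.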

Given that, I'm inclined to make this change. @mco-gh, any objection?

And since Kotlin was specifically mentioned in #65... block-style comments are also more common, but there is not actually enough data to draw any real conclusion (226 instances versus 7). I suspect that's because this sample_contents table was created in 2016, before Kotlin was as popular as it is today. We'd need to query the full contents table, which is still being updated today, to get meaningful data.

I guess Scala is the other big JVM language, and one that's particularly relevant for us at Twitter.

For what it's worth, our add_license_headers.py script uses javadoc style /** */ for Scala and Kotlin, but per-line style // for Java. 🤷🏻
