racodond / sonar-jproperties-plugin

SonarQube Java Properties Analyzer

New rule: ISO-8859-1 characters not compatible with UTF-8 should be escaped to keep compatibility with Java 9 default encoding switch

arend-von-reinersdorff opened this issue · comments

Java 9 will switch the default property encoding from ISO-8859-1 to UTF-8:
http://openjdk.java.net/jeps/226

This will garble the input when Java 9 reads a .properties file that was written for Java 8 or earlier and
contains non-ASCII, non-escaped characters. E.g.:
admin.name=Jörg Schäfer
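To see the failure mode concretely, here is a minimal Java sketch that encodes the example line as ISO-8859-1 and then decodes those bytes as UTF-8, which is effectively what a UTF-8 reader does with an old file. The names `GarbledDemo` and `readLatin1AsUtf8` are hypothetical and exist only for illustration; this is not how Java 9 itself loads property files.

```java
import java.nio.charset.StandardCharsets;

final class GarbledDemo {
    // Encodes the text as ISO-8859-1, then decodes the same bytes as UTF-8.
    // Bytes that are not valid UTF-8 are replaced with U+FFFD by the decoder.
    static String readLatin1AsUtf8(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Non-ASCII letters are written as escapes so this source file
        // compiles the same way under any source encoding.
        String line = "admin.name=J\u00f6rg Sch\u00e4fer";
        System.out.println(readLatin1AsUtf8(line));
    }
}
```

The single byte that represents `ö` in ISO-8859-1 is not valid UTF-8 on its own, so the decoder substitutes U+FFFD, the replacement character. ASCII characters are unaffected because both encodings represent them with the same bytes.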

A .properties file that must be readable by both Java 9 and Java 8 or earlier should escape all non-ASCII characters.
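A rough sketch of such escaping in Java, similar in spirit to what the JDK's old `native2ascii` tool did. The class name `PropertiesEscaper` and the method `escapeNonAscii` are assumptions made up for this example, not part of any library.

```java
final class PropertiesEscaper {
    // Replaces every char above U+007F with a backslash-u escape of four
    // hex digits and leaves ASCII untouched. Working per char means that
    // supplementary characters come out as two escapes (a surrogate pair).
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Input is written with escapes so the source compiles under any encoding.
        System.out.println(escapeNonAscii("admin.name=J\u00f6rg Sch\u00e4fer"));
    }
}
```

For the example above this prints `admin.name=J\u00f6rg Sch\u00e4fer`. The result is pure ASCII, so it decodes to the same text whether the file is read as ISO-8859-1 or as UTF-8.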

Hi @arend-von-reinersdorff,

Thanks for the info and the link!
I'll create such a rule for the next release.

David

Depends upon #76

@arend-von-reinersdorff: What about the following rule description?

<p>
    Java 9 expects properties files to be encoded in UTF-8 instead of ISO-8859-1. Although Java 9 provides some fallback
    mechanisms to ISO-8859-1 while loading properties, in some corner cases you might face unexpected behavior for
    ISO-8859-1 characters that are encoded differently in UTF-8 (meaning characters whose code points are over U+007F). For
    instance, <code>J�rg</code> might be displayed instead of <code>Jörg</code>. To avoid any display
    issue, either:
</p>
<ul>
    <li>Escape all characters whose code points are over U+007F with Unicode escapes (<code>\uXXXX</code>)</li>
    <li>Or explicitly load properties files with ISO-8859-1 encoding</li>
</ul>
<p>
    This rule applies only when <code>sonar.jproperties.sourceEncoding</code> is set to <code>ISO-8859-1</code> (default value). It raises
    an issue each time a character whose code point is over U+007F is found.
</p>


<h2>Noncompliant Code Example</h2>
<pre>
my.name: Jörg
</pre>

<h2>Compliant Solution</h2>
<pre>
my.name: J\u00f6rg
</pre>

<h2>See</h2>
<ul>
    <li><a target="_blank"
           href="https://docs.oracle.com/javase/9/intl/internationalization-enhancements-jdk-9.htm#JSINT-GUID-5ED91AA9-B2E3-4E05-8E99-6A009D2B36AF">Oracle
        - Internationalization Enhancements in JDK 9</a></li>
    <li><a target="_blank" href="http://openjdk.java.net/jeps/226">OpenJDK - JEP 226: UTF-8 Property Files</a></li>
</ul>
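
The check outlined in the description can be approximated as follows. This is only a sketch under assumptions, not the plugin's actual implementation: the names `NonAsciiCheck` and `linesWithNonAscii` are hypothetical, and it reports one hit per line rather than one issue per character.

```java
import java.util.ArrayList;
import java.util.List;

final class NonAsciiCheck {
    // Returns the 1-based numbers of all lines that contain at least one
    // character above U+007F. The content is assumed to be already decoded.
    static List<Integer> linesWithNonAscii(String content) {
        List<Integer> hits = new ArrayList<>();
        String[] lines = content.split("\\R", -1);
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].chars().anyMatch(c -> c > 0x7F)) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Line 1 is noncompliant (raw non-ASCII letter), line 2 is compliant
        // (the same letter written as an escape sequence, which is pure ASCII).
        String content = "my.name: J\u00f6rg\nmy.name: J\\u00f6rg";
        System.out.println(linesWithNonAscii(content));
    }
}
```

An already escaped value passes the check because the escape sequence itself consists only of ASCII characters.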

Great work, and very nice description. Thanks a lot :-)

You're welcome! Here's a snapshot to test: https://github.com/racodond/sonar-jproperties-plugin/releases/tag/%2375

Your feedback is more than welcome!

I tried to trigger the new issue but didn't manage to. My setup:

  • SonarQube 5.6.6 LTS
  • default local database
  • analyzed with Maven

I was able to trigger another issue in my test.properties file but not this new one.

At first I used UTF-8 as Maven project encoding and ISO-8859-1 as encoding for the property file (this should be the normal case). This caused a warning on analysis:
[WARNING] Invalid character encountered in file [...]\src\main\java\test.properties at line 5 for encoding UTF-8. Please fix file content or configure the encoding to be used using property 'sonar.sourceEncoding'.
Also the non-ASCII characters were garbled when viewing the file in the SonarQube server view.

When I changed the Maven project encoding to ISO-8859-1 (ugly workaround) the warning disappeared but the issue was still not triggered. Non-ASCII characters looked fine in the SonarQube server view.

Unrelated problems in testing this:

  • The Properties plugin is not in the SonarQube update center and not in the SonarQube version compatibility matrix. README.md should be updated.
  • The property file was ignored in src/main/resources, which would be the default location in Maven. I had to put it in src/main/java.

Hi @arend-von-reinersdorff,

Thanks for your feedback!

> I was able to trigger another issue in my test.properties file but not this new one

It works fine on my side with your settings with the following project sample:
test.zip

Apologies for asking :-):

Can you try again with my sample project?

> At first I used UTF-8 as Maven project encoding and ISO-8859-1 as encoding for the property file (this should be the normal case). This caused a warning on analysis:
> [WARNING] Invalid character encountered in file [...]\src\main\java\test.properties at line 5 for encoding UTF-8. Please fix file content or configure the encoding to be used using property 'sonar.sourceEncoding'.
> Also the non-ASCII characters were garbled when viewing the file in the SonarQube server view.

Of course, the proper settings should be:

sonar.sourceEncoding=UTF-8
sonar.jproperties.sourceEncoding=ISO-8859-1

But, currently, no language plugin seems to support files with different encodings. I asked about it here. Unfortunately, SonarSource is unlikely to answer the thread, as they don't really welcome language plugins from the community. I'll keep investigating to find a workaround when I have some time.

> The Properties plugin is not in the SonarQube update center and not in the SonarQube version compatibility matrix. README.md should be updated.

README file updated

> Property file was ignored in src/main/resources which would be the default location in Maven. I had to put it in src/main/java

This is related to the SonarQube Maven plugin, which only looks for files in src/main/java. There's a ticket to also take files in src/main/resources into account automatically. See https://jira.sonarsource.com/browse/MSONAR-123
For now, you have to set the sonar.sources property in your pom file (see my sample project).

David

You are right, the rule was not activated, sorry.
It works very nicely, thank you very much :-)

Good news!
I'll try to find a solution to the encoding problem before an official release.