racodond / sonar-jproperties-plugin

SonarQube Java Properties Analyzer

Scanner fails when there are mixed UTF-8 and CP1252 encoded files in a project

KamiAgha opened this issue

In a GitHub project we have:

  1. Some of the files are CP1252 encoded, others are UTF-8.

  2. Some of the CP1252 files look like they were incorrectly converted from UTF-8.
     a. In smrepair_de.properties, the first line of the file is:  \u00ef\u00bb\u00bf# Zeichenfolgen f\u00fcr Reparaturmodule
        i.  Notice the escaped characters just before the #. They look like the UTF-8 BOM characters.
        ii. This file is CP1252 encoded.
     b. In smrepair_en.properties, the first line is:  # strings for Repair Modules
        i.  This file is UTF-8 encoded.
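
The observation in 2.a can be checked with a short Python sketch. It shows that the three UTF-8 BOM bytes, misread as CP1252 and then written as \uXXXX escapes, give exactly the prefix seen in smrepair_de.properties. The helper names `starts_with_escaped_bom` and `strip_escaped_bom` are hypothetical and not part of the plugin; how the file actually ended up in this state is an assumption.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF.
# Misread as CP1252, those bytes become the characters "ï»¿".
misread_bom = codecs.BOM_UTF8.decode("cp1252")

# Escaping each character as \uXXXX, the way .properties files escape
# non-ASCII text, yields the prefix reported in the issue.
ESCAPED_UTF8_BOM = "".join(f"\\u{ord(ch):04x}" for ch in misread_bom)


def starts_with_escaped_bom(first_line: str) -> bool:
    """Return True if the line begins with the escaped UTF-8 BOM (hypothetical helper)."""
    return first_line.lower().startswith(ESCAPED_UTF8_BOM)


def strip_escaped_bom(first_line: str) -> str:
    """Drop the escaped BOM prefix, if present, and return the rest of the line."""
    if starts_with_escaped_bom(first_line):
        return first_line[len(ESCAPED_UTF8_BOM):]
    return first_line
```

Stripping the prefix leaves a line that starts with #, which is an ordinary comment line in a .properties file.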
    

When we run the scan with the encoding set to CP1252, the scan fails because it cannot handle the UTF-8 files.
When we run the scan with the encoding set to UTF-8, the scan finishes successfully but writes a number of parse errors to the log. Those same errors show up in the SonarQube report as issues/bugs.

Parse error at line 1 column 21:

    1: \u00ef\u00bb\u00bf# Zeichenfolgen f\u00fcr Reparaturmodule
    ^
    2:
    3: SM-REPAIR-00010 = Automatisch erstelltes Agentenobjekt. Von XPSSweeper erstellt.
    4: SM-REPAIR-00011 = Automatisch erstelltes AgentType-Objekt. Von XPSSweeper erstellt.
    5: SM-REPAIR-00012 = Automatisch erstelltes Authentifizierungsschema-Objekt. Von XPSSweeper erstellt.
    6: SM-REPAIR-00013 = Automatisch erstelltes Benutzerverzeichnis-Objekt. Von XPSSweeper erstellt.
    7: SM-REPAIR-00014 = Automatisch erstelltes Agentengruppenobjekt. Von XPSSweeper erstellt.
    8:
    9: SM-REPAIR-00020 = Der Bereich ist ohne Agent oder Agentengruppe konfiguriert.\
    10: Wenn Sie den generierten \u00c4nderungssatz anwenden, wird dieses Problem behoben, indem Sie eine der folgenden Aktionen durchf\u00fchren: Bei verschachtelten Bereichen wird\
    11: der Bereich aktualisiert, um einen g\u00fcltigen Agenten aus der verschachtelten Bereichsstruktur zu verwenden.\

Hi @KamiAgha,

I'm not sure I understand your expectations.
Prior to running a SonarQube analysis, the first step in terms of quality would be to make file encoding consistent throughout your project (I would suggest UTF-8).
Then, you can run SonarQube.

David
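
As an illustration of the suggestion above, here is a minimal Python sketch of normalizing file contents to UTF-8 before the analysis. The function name `to_utf8` is hypothetical. It rests on two assumptions that should be checked against the real project: any file that is not valid UTF-8 is CP1252, and a leading UTF-8 BOM is unwanted and can be dropped.

```python
import codecs


def to_utf8(data: bytes, legacy_encoding: str = "cp1252") -> bytes:
    """Return the file content re-encoded as UTF-8 without a BOM (hypothetical helper)."""
    # Remove a real UTF-8 BOM if the content starts with one.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    try:
        # Already valid UTF-8 (this includes pure ASCII): keep the bytes.
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is in the legacy encoding.
        # Bytes that CP1252 leaves undefined (such as 0x81) still raise an error here.
        return data.decode(legacy_encoding).encode("utf-8")
```

Working on bytes rather than file paths keeps the sketch easy to try on sample data. A real conversion would read each file in binary mode, call the function, and write the result back after a review of the diff.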

Hello David,
Thank you for your prompt response. I agree that projects should not have mixed encodings. I am just suggesting that the plugin could be strengthened by behaving more consistently: it could report a scan error when UTF-8 is specified, the same way it reports an error when another encoding is used.

In our tests, Sonar Scanner also failed the scan outright in one instance instead of completing with scan errors.

Regards,
Kami

Hi @KamiAgha,

OK, I see. Indeed, it would be nice, but dealing with encoding is not easy (guessing the encoding of a file, making sure that the encoding property set for the SonarQube Scanner is consistent with the encoding of the files, etc.). See this thread for example: https://groups.google.com/d/msg/sonarqube/m22M4ABo7jA/6dJRV2RpBAAJ
Thus it is hard to behave properly when there are discrepancies in file encoding.

David
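
To illustrate why this is hard, here is a small Python sketch of a pre-scan consistency check. The function `undecodable` and the sample file names are hypothetical, and the check says nothing about how the plugin itself reads files. It lists the files whose bytes cannot be decoded with the declared encoding. The point is the asymmetry: a strict UTF-8 check catches CP1252 files containing non-ASCII bytes, but a CP1252 check accepts UTF-8 bytes silently and produces mojibake.

```python
from typing import List, Mapping


def undecodable(files: Mapping[str, bytes], declared_encoding: str) -> List[str]:
    """Return the names of files that fail a strict decode with the declared encoding."""
    failures = []
    for name, data in files.items():
        try:
            data.decode(declared_encoding)
        except UnicodeDecodeError:
            failures.append(name)
    return sorted(failures)


# Hypothetical sample contents standing in for files read in binary mode.
samples = {
    "sample_de.properties": "für".encode("cp1252"),
    "sample_en.properties": "café".encode("utf-8"),
}

# Declared as UTF-8: the CP1252 file is detected.
flagged_as_utf8 = undecodable(samples, "utf-8")

# Declared as CP1252: nothing is detected, although one file is UTF-8.
flagged_as_cp1252 = undecodable(samples, "cp1252")

# The UTF-8 file decodes "successfully" as CP1252, but to the wrong text.
mojibake = samples["sample_en.properties"].decode("cp1252")
```

Because a failed decode is only evidence in one direction, such a check can support a decision to standardize on UTF-8 but cannot reliably guess the encoding of an arbitrary file.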