bsorrentino / maven-confluence-plugin

Maven plugin that generates a project's documentation directly in Confluence, keeping the documentation in sync with the project's evolution

Home Page: http://bsorrentino.github.io/maven-confluence-plugin/


Malformed unicode characters

karol-bujacek opened this issue

I tried to deploy documentation to Confluence and noticed that Unicode characters were malformed; see the attached pictures.

[Screenshots: Selection_1012_002, Selection_1012_001]

I get the same result (a question mark in a black rectangle) for Markdown source files as well, on both cloud Confluence and Confluence Server (7.4.11). I tried setting <encoding>UTF-8</encoding> in the plugin configuration (in pom.xml), but the result is the same.

My configuration:

            <plugin>
                <groupId>org.bsc.maven</groupId>
                <artifactId>confluence-reporting-maven-plugin</artifactId>
                <version>7.3.2</version>
                <configuration>
                    <childrenTitlesPrefixed>false</childrenTitlesPrefixed>
                    <parentPageTitle>(.........)</parentPageTitle>
                    <wikiFilesExt>.wiki</wikiFilesExt>
                    <siteDescriptor>${basedir}/etc/site/documentation/site.yaml</siteDescriptor>
                    <failOnError>true</failOnError>
                    <encoding>UTF-8</encoding>
                    <markdownProcessor>
                        <name>commonmark</name>
                    </markdownProcessor>
                </configuration>
                <executions>
                    <execution>
                        <id>cloud_wiki</id>
                        <goals>
                            <goal>deploy</goal>
                            <goal>delete</goal>
                        </goals>
                        <configuration>
                            <endPoint>https://(.........).atlassian.net/wiki/rest/api</endPoint>
                            <spaceKey>(.........)</spaceKey>
                            <username>(email-address)</username>
                            <password>(api-token)</password>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

Site.yaml:

home:
    uri: documentation.md
    parentPageTitle: "(.........)"
    name: "Documentation"

    children:
      - name: test wiki markup
        uri: test/wiki-markup.wiki

And page file wiki-markup.wiki:

h2. Hello world

{warning}
lorem ipsum
{warning}


nejaké špeciálne znaky?

diakritika čučoriedka, soľ, žriebä

„úvodzovky“

I do not know what other configuration parameters or versions may be relevant to this issue.

Hi @karol-bujacek

Thanks for the feedback, I'll investigate ASAP.

Hi @karol-bujacek

I've replicated your case and tested it with plugin 7.2.3 on Confluence 7.6.3 (my test instance), and it works well.

Have you checked the Confluence general configuration?

[Screenshot: config]

@bsorrentino, it is set to UTF-8 on cloud Confluence.

I will ask a colleague to reproduce the problem (and check whether it is OS- or environment-related).

I also tried another plugin (https://github.com/confluence-publisher/confluence-publisher), which deploys pages with Unicode characters without any problems (on both cloud and server Confluence).

@karol-bujacek that is strange.

From my perspective, it is hard to fix something that works well.

However, my test is here, with the following configuration:

Configuration

       <!--
        mvn confluence-reporting:deploy@issue261
        mvn confluence-reporting:delete@issue261
        -->
        <execution>
            <id>issue261</id>
            <goals>
                <goal>deploy</goal>
                <goal>delete</goal>
            </goals>
            <configuration>
                <wikiFilesExt>.confluence</wikiFilesExt>
                <encoding>UTF-8</encoding>
                <failOnError>true</failOnError>
                <siteDescriptor>${basedir}/src/site/confluence/issue261/site.yml</siteDescriptor>
            </configuration>
        </execution>

Result

[Screenshot: test]

I have the same problem. Once I changed the encoding of my .md file to ISO-8859-1 and wrote the special characters (in my case the German umlauts äöü) in ISO-8859-1, it worked, despite the encoding being set to UTF-8 in my pom.xml.

I'm currently publishing from my Windows PC using Git Bash, but the environment is defined as declare -x LANG="de_DE.UTF-8"

My assumption is that it has something to do with the system from which you are deploying.

Once I added export JAVA_TOOL_OPTIONS="-Dfile.encoding=UTF-8" to my script before running Maven, everything worked.

So I assume that in some places the encoding property is ignored and the platform default is used. It might therefore be reproducible with two executions, one with ISO-8859-1 encoding and one with UTF-8 encoding. I would be surprised if both worked on the same machine.
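The kind of mismatch suspected here can be reproduced in isolation. The following is a minimal sketch, not code from the plugin: the class `EncodingMismatchDemo` and its methods are hypothetical names, and ISO-8859-1 stands in for whatever the platform default happens to be on the deploying machine. It encodes text with one charset and decodes it with another, which is what effectively happens when one step of a pipeline honours the configured encoding and another falls back to the default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not reproduce the plugin's internals.
class EncodingMismatchDemo {

    /** Encodes text with one charset, then decodes the bytes with another. */
    static String recode(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    /** Counts U+FFFD replacement characters in a string. */
    static long countReplacementChars(String s) {
        return s.chars().filter(c -> c == 0xFFFD).count();
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source independent of the compiler's encoding.
        String umlauts = "\u00e4\u00f6\u00fc"; // a-umlaut, o-umlaut, u-umlaut

        // Shows which charset the JVM falls back to when none is given.
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        // Matching charsets: the text survives unchanged.
        String ok = recode(umlauts, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("intact: " + ok.equals(umlauts));

        // ISO-8859-1 bytes decoded as UTF-8: invalid sequences become U+FFFD.
        String broken = recode(umlauts, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println("replacement characters: " + countReplacementChars(broken));

        // UTF-8 bytes decoded as ISO-8859-1: each character turns into two.
        String doubled = recode(umlauts, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("length after mis-decoding: " + doubled.length());
    }
}
```

U+FFFD is the character that browsers typically render as a question mark in a black shape, which matches the symptom reported in this issue. Note that on JDK 18 and later the default charset is UTF-8 unless overridden, so the behaviour of code that relies on the default depends on the Java version as well as on the operating system.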

Hi @limdul79, thanks for the feedback.

You are probably right. I have to identify whether there are steps in the publish workflow where the provided encoding is ignored.
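As an illustration of what "honouring the provided encoding" means at the I/O level, here is a small sketch under stated assumptions: `ExplicitCharsetIo` and its methods are hypothetical and are not taken from the plugin or from any library it uses. The point is only that every read and write passes the charset explicitly instead of relying on overloads that silently use the platform default.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; names are illustrative only.
class ExplicitCharsetIo {

    /** Reads a whole file, decoding its bytes with the given charset. */
    static String readText(Path file, Charset charset) throws IOException {
        return new String(Files.readAllBytes(file), charset);
    }

    /** Writes text to a file, encoding it with the given charset. */
    static void writeText(Path file, String text, Charset charset) throws IOException {
        Files.write(file, text.getBytes(charset));
    }

    /** Writes text to a temporary file and reports whether it reads back unchanged. */
    static boolean survivesFileRoundTrip(String text, Charset charset) {
        try {
            Path tmp = Files.createTempFile("page", ".wiki");
            try {
                writeText(tmp, text, charset);
                return text.equals(readText(tmp, charset));
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A word from the issue's test page, written with Unicode escapes
        // for the two letters that carry a caron.
        String sample = "\u010du\u010doriedka";
        System.out.println(survivesFileRoundTrip(sample, StandardCharsets.UTF_8));
    }
}
```

Passing the charset as a parameter at every step makes it straightforward to thread a configured value, such as the plugin's `<encoding>` setting, through the whole workflow, and makes any step that omits it easy to spot in review.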

Hi @limdul79

It seems the problem could be related to a bug in the minitemplator project.

I'm going to fix that.

Hi @limdul79

I've deployed the dev release 7.4-SNAPSHOT with a fix.

Could you take a moment to test it and let me know?

Thanks in advance

I will try next week, when I'm back at work.

I can confirm it works. Now I get the correct Unicode characters even when I deploy from my Windows machine.

Thanks so much, @limdul79.

I'll deploy a new release soon.

Fix released in version 7.4.