mquinson / po4a

Maintain the translations of your documentation with ease (PO for anything)

Home Page: http://po4a.org/

Error while building v0.71 on openSUSE

elchevive opened this issue · comments

Hi,

I'm trying to update po4a from v0.69 to v0.71 on openSUSE, but on the openSUSE versions that ship older software (Leap 15.5 and the upcoming 15.6) I get this error message:

[   13s] Discard blib/man/ca/man3/Locale::Po4a::Xhtml.3pm.pod (13 of 21 strings; only 61.9% translated; need 80%).
[   13s] Malformed encoding while writing to file /home/abuild/rpmbuild/BUILD/po4a-0.71/blib/man/ca/man3/Locale::Po4a::Xml.3pm.pod with charset UTF-8: "\x{00c3}" does not map to utf8 at lib/Locale/Po4a/TransTractor.pm line 513.
[   13s] Close with partial character at lib/Locale/Po4a/TransTractor.pm line 526.
[   13s] Died at Po4aBuilder.pm line 191.
[   13s] error: Bad exit status from /var/tmp/rpm-tmp.QnsSUO (%build)

On Tumbleweed (the rolling release) it builds. The only difference I can see is the Perl version: 5.26 on Leap and 5.38 on Tumbleweed.

You can see my attempts on this repository:

https://build.opensuse.org/package/show/home:elchevive:branches:devel:languages:perl/po4a

Regards

This is really weird. We require Perl 5.12 since po4a v0.70 because we changed the way we handle UTF files. But I still fail to understand why it would be an issue on Perl 5.26.

Could you give me the output of file po/pod/ca.po and of grep Encoding po/pod/ca.po please?

Hi,

[    8s] + /usr/bin/mkdir /home/abuild/rpmbuild/BUILDROOT/po4a-0.71-150500.105.1.x86_64
[    8s] + cd po4a-0.71
[    8s] + file po/pod/ca.po
[    8s] po/pod/ca.po: GNU gettext message catalogue, UTF-8 Unicode text, with very long lines
[    8s] + grep Encoding po/pod/ca.po
[    8s] "Content-Transfer-Encoding: 8bit\n"
[    8s] + perl Build.PL installdirs=vendor
...
[   13s] Discard blib/man/ca/man3/Locale::Po4a::Xhtml.3pm.pod (13 of 21 strings; only 61.9% translated; need 80%).
[   13s] Malformed encoding while writing to file /home/abuild/rpmbuild/BUILD/po4a-0.71/blib/man/ca/man3/Locale::Po4a::Xml.3pm.pod with charset UTF-8: "\x{00c3}" does not map to utf8 at lib/Locale/Po4a/TransTractor.pm line 513.
[   13s] Close with partial character at lib/Locale/Po4a/TransTractor.pm line 526.
[   13s] Died at Po4aBuilder.pm line 191.
[   14s] error: Bad exit status from /var/tmp/rpm-tmp.GZEM2g (%build)
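
Note that grep Encoding only matches the Content-Transfer-Encoding header; the charset itself is declared in the Content-Type header of the PO file. A minimal sketch of a more direct check, assuming a POSIX shell with sed and iconv available (check_po_charset is a hypothetical helper name, not part of po4a):

```shell
#!/bin/sh
# Hypothetical helper (not part of po4a): report the charset a PO file
# declares and whether its bytes are well-formed UTF-8.
check_po_charset() {
    po_file=$1
    # The charset is declared in the Content-Type header line, e.g.
    # "Content-Type: text/plain; charset=UTF-8\n"
    declared=$(sed -n 's/^"Content-Type:.*charset=\([^\\"]*\).*/\1/p' "$po_file" | head -n 1)
    echo "declared charset: ${declared:-none}"
    # iconv exits non-zero on the first byte sequence that is not valid UTF-8.
    if iconv -f UTF-8 -t UTF-8 "$po_file" >/dev/null 2>&1; then
        echo "bytes: valid UTF-8"
    else
        echo "bytes: NOT valid UTF-8"
    fi
}
```

It could be run as check_po_charset po/pod/ca.po from the po4a source tree. If the declared charset and the actual bytes disagree, that would point at the input file rather than at Perl; if they agree, the problem is more likely on the output side.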

If that helps, I get the exact same issue on Ubuntu 20.04 with Perl 5.30.0, and these commands show the same results, except that instead of "\x{00c3}" I get "\x{fffd}".

I also tried version 0.70 of po4a and got the same problem:

$ ./Build
Created META.yml and META.json
"\x{fffd}" does not map to UTF-8 at lib/Locale/Po4a/Po.pm line 613.
Close with partial character at lib/Locale/Po4a/Po.pm line 613.
Died at Po4aBuilder.pm line 169.

With the commands you requested (file and grep), I get the same output, except that ", with very long lines" no longer appears.

Hi,

As an exercise I updated the Leap 15.5 Perl packages to 5.38 and po4a compiles successfully, so it's something that changed in Perl between 5.30 (as mentioned by Gastonia02) and 5.38.

Thanks @elchevive, that's precious info. Is there any chance to get the precise version of Perl for which po4a starts to fail?

I started reading the perldelta of each version between 5.30 and 5.38, but that's quite a lot of changes actually.

Do we think this is the same bug as #494 ?

Hi,

Further testing shows that some change between 5.33.6 (not working) and 5.33.7 (working) should be the culprit.

The diff between the two versions regarding PerlIO encoding seems to be the following:
https://metacpan.org/release/ATOOMIC/perl-5.33.8/diff/HYDAHY%2Fperl-5.33.6/ext/PerlIO-encoding/encoding.pm
The fallback setting does not contain Encode::STOP_AT_PARTIAL() anymore. Further digging underway.

The full list of changes between 5.33.6 and 5.33.7 (the perldelta) is here: https://metacpan.org/release/RENEEB/perl-5.33.7/view/pod/perldelta.pod

I fail to reproduce the error :( Could someone test whether the following patch helps? Alternatively, the commented-out line could be tried instead of the uncommented one.

--- a/lib/Locale/Po4a/TransTractor.pm
+++ b/lib/Locale/Po4a/TransTractor.pm
@@ -504,6 +504,8 @@ sub write {
             File::Path::mkpath( $dir, 0, 0755 )    # Croaks on error
               if ( length($dir) && !-e $dir );
         }
+        $PerlIO::encoding::fallback = FB_CROAK;
+        # $PerlIO::encoding::fallback = Encode::PERLQQ()|Encode::WARN_ON_ERR()|Encode::ONLY_PRAGMA_WARNINGS();
         open( $fh, ">:encoding($charset)", $filename )
           or croak wrap_msg( dgettext( "po4a", "Cannot write to %s: %s" ), $filename, $! );
     }

Do we think this is the same bug as #494 ?

Nope, I don't think it's the same. I think that #495 is about partial chars being reported as an error in Perl 5.33 and not in more recent versions, while #494 was about an eval block returning false even in the absence of an error.

Another clue that it's not the same is that #495 shows the error message "Close with partial character at lib/Locale/Po4a/TransTractor.pm line 526" while #494 does not show anything before dying ("unknown error").

And a final clue: I was able to reproduce (and fix) #494, while I'm still trying to reproduce #495.

I'm still getting these errors on an old system running RHEL 7.1 with Perl 5.16:
v0.70: "\x{fffd}" does not map to UTF-8 at lib/Locale/Po4a/Po.pm line 613.
v0.73-17-g76a463e5:

Malformed encoding while writing to file /shared/src/po4a/blib/man/ca/man3/Locale::Po4a::Xml.3pm.pod with charset UTF-8: "\x{fffd}" does not map to UTF-8 at lib/Locale/Po4a/TransTractor.pm line 544.
If UTF-8 is not the expected charset, you need to configure the right one with with --localized-charset or other similar flags.
Close with partial character at lib/Locale/Po4a/TransTractor.pm line 568.

v0.69 works.

I bisected the failure to the merges around 15abd24: 15abd24^ succeeds, whereas b2333d5 fails.
15abd24 itself actually gives me a different error message:

po4a::xml: The file declares ISO-8859-1 as encoding, but you provided UTF-8 as master charset. Please change either setting.                                  
 at po4a line 1624.