libwww-perl / HTTP-Message

The HTTP-Message distribution contains classes useful for representing the messages passed in HTTP style communication.

Home Page: https://metacpan.org/pod/HTTP::Message

decode_content issue with "Content-Encoding: none" [rt.cpan.org #100825]

oalders opened this issue

Migrated from rt.cpan.org#100825 (status was 'open')

From maze@reik.se on 2014-12-10 11:23:40:

The decode_content seems to follow very accurately the standard, but 
unfortunately web servers do not.
I have one case where the web server is responding with a 
"Content-Encoding: none", and because of that
the character decoding isn't run, because of the else statement:
                 else {
                     die "Don't know how to decode Content-Encoding '$ce'";
                 }

I'm not sure how common this "none" case is, so it might not be 
legitimate to add that as an extra supported
Content-Encoding type (which does nothing).

As a simple workaround I can copy the decode_content function, but IMHO 
this is not ideal. It would be better
if the library allowed callers to specify the desired behavior in those cases.

An extra option could solve this:
                 else {
                     unless ($opt{ignore_unknown_content_encoding}) {
                         die "Don't know how to decode Content-Encoding '$ce'";
                     }
                 }
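
To make the proposed behavior concrete, here is a minimal standalone sketch of the policy. It is not the HTTP::Message implementation: `check_content_encodings` is a hypothetical helper, the set of "known" encodings is an assumption chosen for illustration, and `ignore_unknown_content_encoding` is only the option name suggested above, not an existing library option. The helper returns the encodings that would still need to be undone, in the order they would be undone.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTTP::Message): walk the Content-Encoding
# tokens in reverse order and decide what to do with each one.
sub check_content_encodings {
    my ($header, %opt) = @_;
    my @to_decode;
    for my $ce (reverse split /\s*,\s*/, lc($header // '')) {
        # "identity" means no transformation was applied, so skip it.
        next if $ce eq '' || $ce eq 'identity';
        if ($ce eq 'gzip' || $ce eq 'x-gzip' || $ce eq 'deflate') {
            # Assumed list of supported encodings, for illustration only.
            push @to_decode, $ce;
        }
        elsif ($opt{ignore_unknown_content_encoding}) {
            # Proposed option: skip tokens such as "none" instead of dying.
            next;
        }
        else {
            die "Don't know how to decode Content-Encoding '$ce'\n";
        }
    }
    return @to_decode;
}

# Strict mode dies on "none"; the lenient mode skips it.
my $ok = eval { check_content_encodings('none'); 1 };
print "strict mode failed: $@" unless $ok;

my @pending = check_content_encodings('gzip, none',
    ignore_unknown_content_encoding => 1);
print "still to decode: @pending\n";
```

One caveat with this approach: skipping every unknown token also skips encodings that really were applied, so the body could stay compressed. Treating only "none" as a no-op would be the narrower choice.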

I think this approach could also be used for other reported issues like 
https://rt.cpan.org/Public/Bug/Display.html?id=82963

Cheers
Maze


From ether@cpan.org on 2014-12-10 23:09:48:

On 2014-12-10 03:23:40, maze@reik.se wrote:
> The decode_content seems to follow very accurately the standard, but 
> unfortunately web servers do not.
> I have one case where the web server is responding with a 
> "Content-Encoding: none", and because of that
> the character decoding isn't run, because of the else statement:
>                  else {
>                      die "Don't know how to decode Content-Encoding '$ce'";
>                  }

Googling for "Content-Encoding: none" shows some interesting things.  It looks like PHP is behind it (yay!).

This would be better than copying existing code:

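    use Try::Tiny;

    # try to decode content, but fall back to undecoded as needed;
    # raise_error => 1 makes decoded_content die on failure (by default
    # it returns undef and sets $@), so the catch block can run
    my $content = try {
        $response->decoded_content(raise_error => 1)
    }
    catch {
        warn "Could not decode response content: $_";
        $response->content
    };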

From maze@reik.se on 2014-12-11 08:25:17:

Thanks for your swift response.

Yes, that's a possible solution, but it's not gonna work :-(, because 
the character encoding would then be missing :-(  That's actually how we 
detected it, since suddenly we had some character encoding issues.

The decoded_content function unfortunately does two jobs: one is the 
Content-Encoding decoding and the other is the character 
decoding. Surely I could re-implement just that part, but that's not that 
exciting either and would make the
actual code more complicated :-(
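
One way to keep both steps without copying library code is to drop the bogus token from the header on a copy of the response and then call decoded_content as usual, so the character decoding still runs. The following is only a sketch under that assumption: `decoded_content_lenient` is a hypothetical helper name, and the sample response is made up for illustration.

```perl
use strict;
use warnings;
use HTTP::Response;
use Encode qw(encode);

# Hypothetical helper: decode a response whose Content-Encoding header
# contains the non-standard token "none".
sub decoded_content_lenient {
    my ($res, %opt) = @_;
    my $copy = $res->clone;    # leave the caller's response untouched

    # Keep every other encoding token and drop only "none".
    my @tokens = grep { lc($_) ne 'none' }
                 map  { split /\s*,\s*/, $_ }
                 $copy->header('Content-Encoding');

    $copy->remove_header('Content-Encoding');
    $copy->header('Content-Encoding' => join(', ', @tokens)) if @tokens;

    # decoded_content undoes any remaining Content-Encoding and then
    # applies the charset from Content-Type.
    return $copy->decoded_content(%opt);
}

# Made-up example response: UTF-8 bytes labelled "Content-Encoding: none".
my $res = HTTP::Response->new(
    200, 'OK',
    [
        'Content-Type'     => 'text/plain; charset=UTF-8',
        'Content-Encoding' => 'none',
    ],
    encode('UTF-8', "caf\x{e9}"),
);

my $text = decoded_content_lenient($res, raise_error => 1);
printf "decoded %d characters\n", length $text;
```

Working on a clone means the original response still carries the header the server actually sent, which can be useful for logging.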

I don't mind temporarily keeping the duplicate (patched) code, but 
would of course upgrade to a newer version where this can be solved in a 
more elegant way.

Regarding the "right way to fix": Maybe, if it's that common, it might 
make sense to actually add it as an accepted encoding type. That was 
actually my first patch attempt, but since it's not standard it felt a 
bit wrong to do it.

Cheers
Maze



From gaas@cpan.org on 2014-12-11 21:35:50:

This issue seems to have been fixed in
https://github.com/libwww-perl/http-message/commit/f0c8f81786fd442934e9eed08ea95be46cb7ba6f
(reported in #94882)

From maze@reik.se on 2014-12-12 08:53:33:

Yes this fix will do the trick. Thanks!

Cheers
Maze


This ticket can be closed.