->decoded_content should decode application/json, etc [rt.cpan.org #82963]
oalders opened this issue · comments
Migrated from rt.cpan.org#82963 (status was 'open')
Requestors:
From mauke@cpan.org on 2013-01-25 22:13:06:
Currently $response->decoded_content will decode the bytes of e.g.
"Content-type: text/json; charset=UTF-8" messages because it knows
"text/*" is ... text.
It would be nice if this could be extended to also decode the text for
content-types such as "application/json; charset=UTF-8",
"application/javascript; charset=ISO-8859-15", etc.
From skaufman@cpan.org on 2013-07-21 18:05:19:
On Fri Jan 25 17:13:06 2013, MAUKE wrote:
> Currently $response->decoded_content will decode the bytes of e.g.
> "Content-type: text/json; charset=UTF-8" messages because it knows
> "text/*" is ... text.
>
> It would be nice if this could be extended to also decode the text for
> content-types such as "application/json; charset=UTF-8",
> "application/javascript; charset=ISO-8859-15", etc.
Bump, just ran into the same issue after a few hours.
in HTTP::Headers->content_is_text,
shouldn't the presence of charset in the content-type imply that the content is characters, ie text?
From swong@cpan.org on 2013-08-13 09:27:23:
Second on this.
When I say decode, I know what I am doing - currently there is no way to force it.
$response->decoded_content(charset => 'utf-8')
Adding (charset_strict => 1, raise_error => 1) doesn't help.
Better yet, the content type I get is
Content-Type: application/json; charset=UTF-8
Maybe content_is_text() should returns true if the charset is present in the content-type header?
From bbyrd@cpan.org on 2014-08-27 03:12:50:
Third. Currently, the code says:
if ($self->content_is_text || (my $is_xml = $self->content_is_xml)) {
Examples where LWP currently breaks include:
application/json
application/yaml
application/x-yaml
application/pdf
application/* (that isn't +xml)
The Content-Type really shouldn't matter. If the Content-Type is "pork/beans; charset=UTF-8", it should still be decoded.
If the remote agent broadcasted a charset, it's telling us that it had encoded that data with that character set. We shouldn't care if the data inside the onion is text, audio, application-specific, some proprietary format, whatever.
Please remove this 'if' line. It's a pretty intelligent interface, so it would be a waste of code to have other folks design their own decoding interface just because of this restriction.
As described in a lengthy comment on Always charset decode, regardless of content type, application/json
none of the examples mentioned here above should have any charset-decoding done automagically.
As recommended in RFC 6657 – MIME Charset Default Update §3b New Rules for Default "charset" Parameter Values for "text/*" Media Types, text/*
SHOULD
require explicit unconditional inclusion of the "charset" parameter, eliminating the need for a default value.
And thus Content-Type
MUST matter and the charset
parameter for any other media type is just 'auxiliary meta info' for a processor of the 'binary' data like application/*
. It is not the other way around , a charset
param DOES NOT mean that it is of type type/*
.
Saying «It's a pretty intelligent interface» is a great compliment, but I prefer an interface to do less guess work and do less automagically things under the hood. Especially regarding encoding/decoding layers. We all know it goes wrong too many times.
Sure, I'd like to have an intelligent interface, but not on the cost to make 'semi-core' Perl modules going to do the wrong thing. Maybe a HTTP::Message::MIME::ApplicationJSON
© would be a better alternative ?
Conclusion
$mess->decoded_contnent is doing the right thing
this issue SHOULD be closed
Looks like https://metacpan.org/release/LWP-JSON-Tiny might meet your criteria (haven't tried it myself though), for anyone else coming across this.
or a naughty over-ride HTTP::Message::JSON
Yes, that's from the same package :)
as described in #99, unless we add extra parameters or an entire new method, this can not be solved without breaking any existing API based on JSON
Closing this as we're working towards a solution in #99