PerlDancer / Dancer

The easiest way to write web applications with Perl (Perl web micro-framework)

Home Page: http://perldancer.org/

Dancer 1.3200 breaks request->body for form POST

miyagawa opened this issue

#1134 has broken Dancer's request->body for form-based POST requests. The body method now returns nothing for such requests.

Here's a failing test:

use Test::More import => ['!pass'];
use strict;
use warnings;
use Dancer::ModuleLoader;
use Dancer;
use File::Spec;
use lib File::Spec->catdir( 't', 'lib' );

plan skip_all => "skip test with Test::TCP in win32/cygwin" if ($^O eq 'MSWin32' or $^O eq 'cygwin');
plan skip_all => "Test::TCP is needed for this test"
    unless Dancer::ModuleLoader->load("Test::TCP" => "1.30");

use LWP::UserAgent;

use constant RAW_DATA => "foo=bar&bar=baz";

plan tests => 2;
Test::TCP::test_tcp(
    client => sub {
        my $port = shift;
        my $rawdata = RAW_DATA;
        my $ua = LWP::UserAgent->new;
        my $req = HTTP::Request->new(POST => "http://127.0.0.1:$port/jsondata");
        my $headers = { 'Content-Length' => length($rawdata), 'Content-Type' => 'application/x-www-form-urlencoded' };
        $req->push_header($_, $headers->{$_}) foreach keys %$headers;
        $req->content($rawdata);
        my $res = $ua->request($req);

        ok $res->is_success, 'req is success';
        is $res->content, $rawdata, "raw_data is OK";
    },
    server => sub {
        my $port = shift;

        use TestApp;
        Dancer::Config->load;

        set( environment  => 'production',
             port         => $port,
             server       => '127.0.0.1',
             startup_info => 0);
        Dancer->dance();
    },
);

and its output:

 prove -lbr t/02_request/19_post_body.t
t/02_request/19_post_body.t .. Use of uninitialized value in concatenation (.) or string at /Users/miyagawa/src/github.com/PerlDancer/Dancer/lib/Dancer/Renderer.pm line 63, <DATA> line 16.
t/02_request/19_post_body.t .. 1/2 
#   Failed test 'raw_data is OK'
#   at t/02_request/19_post_body.t line 30.
#          got: ''
#     expected: 'foo=bar&bar=baz'
# Looks like you failed 1 test of 2.
t/02_request/19_post_body.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/2 subtests 

Test Summary Report
-------------------
t/02_request/19_post_body.t (Wstat: 256 Tests: 2 Failed: 1)
  Failed test:  2
  Non-zero exit status: 1
Files=1, Tests=2,  1 wallclock secs ( 0.01 usr  0.00 sys +  0.20 cusr  0.04 csys =  0.25 CPU)
Result: FAIL

Bugger. Looking into this one today. It's weird, as the PR included tests which ensure that request->body works as it should.

Right - if you're POSTing raw content (application/json or application/octet-stream), then HTTP::Body::OctetStream writes it to a temp file whose handle is stored in $self->body, which we can read from. However, if you've posted url-encoded form data, then it's handled by HTTP::Body::UrlEncoded, which parses the name=value pairs as it encounters them and sets the params immediately, but does not store the raw content.

This seems reasonable - there should be no need to get the raw request body in that case, it should be of no use - but /is/ a change in behaviour from how things worked before this.

I suspect an upstream patch to make HTTP::Body write the body content to a file in these cases would be rejected since in normal usage that's not useful. So, if we need to let request->body return raw content for url-encoded POSTs, then we may have to handle it ourselves - stashing it away into the request object as we parse the incoming request, and having request->body look for that first and return it if it was there, and if not, read and return the filehandle in _http_body.

Ah - but we won't know the type of request until we've passed enough data to HTTP::Body, and the chunks of data we read and pass to $self->{_http_body}->add will not line up with the boundary of the headers and body, so by the time we know whether we need to be storing the body or not, we may well have lost some of it in the previous chunk.
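To make the "stash it ourselves" idea concrete, here is a minimal sketch that avoids needing to know the content type up front: it copies every chunk to a temp file while still handing it to the parser. RawBodyStash and its parser_add callback are hypothetical names used for illustration only, not part of Dancer or HTTP::Body, and the callback merely stands in for a call such as $self->{_http_body}->add($chunk).

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical helper (not Dancer code): tee each incoming chunk to a
# temp file while still passing it on to the real body parser.
package RawBodyStash;

sub new {
    my ($class, %args) = @_;
    my $fh = File::Temp->new(UNLINK => 1);
    binmode $fh;
    return bless { fh => $fh, parser_add => $args{parser_add} }, $class;
}

sub add {
    my ($self, $chunk) = @_;
    print { $self->{fh} } $chunk;     # keep a raw copy on disk
    $self->{parser_add}->($chunk);    # still feed the parser
    return;
}

# Return the raw body as a string, whatever the content type was.
sub body {
    my ($self) = @_;
    my $fh = $self->{fh};
    $fh->flush;
    seek $fh, 0, 0;
    local $/;                         # slurp mode
    my $raw = <$fh>;
    return defined $raw ? $raw : '';
}

package main;

# Stand-in for the parser: just records what it was given.
my $parsed = '';
my $stash  = RawBodyStash->new(parser_add => sub { $parsed .= $_[0] });

# Chunks deliberately split at an arbitrary point.
$stash->add($_) for 'foo=bar', '&bar=baz';
print $stash->body, "\n";
```

The trade-off is that every request body gets written to disk, including ones the parser would have stored anyway, so a real implementation would probably want to skip the copy when the parser already provides a body handle.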

I'm starting to think we should document this change of behaviour clearly and leave it be - I can't see it biting many people, and the new behaviour doesn't seem unreasonable.

Well, one of my work apps was relying on this at a very core level, and everything started crashing and misbehaving due to this change. Luckily we caught it in dev and have pinned to the old version for now, but we can't upgrade until either this gets fixed in Dancer or we add a workaround for it.

It's useless when one method returns something on some occasions and nothing on others - that can't be reliably used at all. I'm sympathetic because it's due to the odd interface that HTTP::Body gives you. Plack::Request buffers the body using Stream::Buffered on its own, and makes a copy of the filehandle so that users can get at it: https://metacpan.org/source/MIYAGAWA/Plack-1.0037/lib/Plack/Request.pm
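As a rough illustration of the "buffer it yourself" approach, the sketch below keeps small bodies in memory and spills larger ones to a temp file, then hands back a readable filehandle either way. It is not Plack's code: SpillBuffer, its limit parameter and the 1 MB default are assumptions made for this example.

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical helper (not Plack code): buffer in memory up to `limit`
# bytes, then move everything to a temp file.
package SpillBuffer;

sub new {
    my ($class, %args) = @_;
    return bless {
        limit => defined $args{limit} ? $args{limit} : 1024 * 1024,
        mem   => '',
        fh    => undef,
        size  => 0,
    }, $class;
}

sub add {
    my ($self, $chunk) = @_;
    $self->{size} += length $chunk;
    if (!$self->{fh} && $self->{size} > $self->{limit}) {
        # Crossed the threshold: flush what we hold in memory to disk.
        my $fh = File::Temp->new(UNLINK => 1);
        binmode $fh;
        print {$fh} $self->{mem};
        $self->{mem} = '';
        $self->{fh}  = $fh;
    }
    if ($self->{fh}) { print { $self->{fh} } $chunk }
    else             { $self->{mem} .= $chunk }
    return;
}

# Always returns a filehandle positioned at the start of the body.
sub rewind {
    my ($self) = @_;
    if (my $fh = $self->{fh}) {
        $fh->flush;
        seek $fh, 0, 0;
        return $fh;
    }
    open my $fh, '<', \$self->{mem}
        or die "cannot open in-memory handle: $!";
    return $fh;
}

sub on_disk { defined $_[0]{fh} ? 1 : 0 }

package main;

my $small = SpillBuffer->new(limit => 64);
$small->add('foo=bar&bar=baz');
my $in  = $small->rewind;
my $raw = do { local $/; <$in> };
print "$raw (on disk: ", $small->on_disk, ")\n";

my $big = SpillBuffer->new(limit => 8);
$big->add($_) for 'foo=bar', '&bar=baz';
my $in_big  = $big->rewind;
my $raw_big = do { local $/; <$in_big> };
print "$raw_big (on disk: ", $big->on_disk, ")\n";
```

Because the caller always gets a filehandle, the body accessor behaves the same for form posts, JSON and uploads, while small requests never touch the disk.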

@miyagawa FWIW, Dancer2 uses Plack::Request internally. Does Dancer2 have the same bug?

Plack::Request doesn't suffer from this kind of thing, so I'd say as long as you use it, it's fine :)

Oh goodie. :)

Sorry this has stalled for so long - I hope to have a workaround written up this weekend which will store the body of requests if the HTTP::Body object hasn't stored it for us, but I need to work out a clean way to go about it. Perhaps even better would be to discuss with the author/maintainer of HTTP::Body whether a patch to HTTP::Body to always make the body content available via the body accessor would be accepted, though. HTTP::Body's doco helpfully documents the body accessor as "accessor for the body" but doesn't mention that it'll give you a filehandle, or that it'll only work at all for certain types of HTTP bodies - so it may be better for it to be fixed there.

In theory, yes. In reality, HTTP::Body has been semi-abandoned for a long time and there's little hope you can get it fixed there. In Plack we've recently switched to a custom HTTP parser tailored to handle the PSGI environment. plack/Plack#538

Still looking at this one. I'd like to replace the use of HTTP::Body with the parser you're using in Plack, but I think it'd be an awful lot of work. Other options are to try hard to get a fix into HTTP::Body, to ship a modified version of HTTP::Body with Dancer, or to implement some workaround in Dancer::Request which stores the request body if HTTP::Body is not going to do so. I'm not sure I can reliably detect in advance whether it will, though - if not, the workaround would do away with the RAM savings for file uploads that the change which caused this problem was intended to provide.

I have raised a ticket against HTTP::Body with a patch which makes the UrlEncoded parser still store the raw request body, just like the other parsers - I'd love to see this accepted and a new HTTP::Body release go out with it. If that doesn't happen, though, then I'll need to consider whether we move away from HTTP::Body, ship a modified version of it with Dancer, or try to implement some hacky patch (although having looked into this option first, it looks like that will be tricky...)

Don't worry, as soon as bigpresh talks to me he can have commit access and co-maint.

HTTP::Body is semi-maintained because people keep not even trying to help maintain it; I can provide people with access as required.

As per comment on other issue, if you are happy to give me a commit bit & co-maint, I'll happily help maintain it.

I've been working on implementing the fix in HTTP::Body now that mst has kindly set me up with access. Most of the way there with something that will work. The sticking point I'm working on is how it should work for multipart requests (e.g. file uploads) - the least surprising behaviour would be combining all the parts, leaving the boundaries in place etc - but that's unlikely to be of much use in most cases, and also difficult to implement without keeping the contents of all parts in a scalar, or writing the first part (URL-encoded form upload, usually) to disk along with the file parts, both of which are rather unacceptable. At this point I'm tempted to have it warn and return if body is called for a multipart request, or just have it return the first part only - but either would be somewhat surprising if you expect $http_request->body() to always return a filehandle you can read the body from.