puma / puma

A Ruby/Rack web server built for parallelism

Home Page:https://puma.io

Requests with a caret (`<` or `>`) in a query parameter raise a `Puma::HttpParserError`

jclusso opened this issue

Describe the bug
Requests with a caret (< or >) in a query parameter raise a Puma::HttpParserError

I'm aware this is an invalid URL and I'm not sure how this should be handled. Feels like something we'd handle as a 400 at the application level instead of Puma breaking and being handled as a lowlevel_error.

For additional context, our platform works with emails, and sometimes people make invalid requests to our API with something like "?email=john@doe.com>". I'm assuming this is due to bad email extraction from a contact formatted like "John Doe <john@doe.com>". We'd like to handle this better.

Puma config:

# This configuration file will be evaluated by Puma. The top-level methods that
# are invoked here are part of Puma's configuration DSL. For more information
# about methods provided by the DSL, see https://puma.io/puma/Puma/DSL.html.

# Puma can serve each request in a thread from an internal thread pool.
# The `threads` method setting takes two numbers: a minimum and maximum.
# Any libraries that use thread pools should be configured to match
# the maximum value specified for Puma. Default is set to 5 threads for minimum
# and maximum; this matches the default thread size of Active Record.
max_threads_count = ENV.fetch("MAX_THREADS") { 5 }
min_threads_count = ENV.fetch("MIN_THREADS") { max_threads_count }
threads min_threads_count, max_threads_count

if %w(staging production).include?(ENV["RAILS_ENV"])
  # Specifies that the worker count should equal the number of processors in production.
  worker_count = Integer(ENV.fetch("WEB_CONCURRENCY") { Concurrent.physical_processor_count })
  workers worker_count if worker_count > 1
end

# Specifies the `worker_timeout` threshold that Puma will use to wait before
# terminating a worker in development environments.
worker_timeout 3600 if ENV.fetch("RAILS_ENV", "development") == "development"

# Specifies the `port` that Puma will listen on to receive requests; default is 3000.
port ENV.fetch("PORT") { 3000 }

# Specifies the `environment` that Puma will run in.
environment ENV.fetch("RAILS_ENV") { "development" }

# Specifies the `pidfile` that Puma will use.
pidfile ENV.fetch("CUSTOM_WEB_PID_FILE") { "tmp/pids/server.pid" }

# Set the directory to Cloud 66 specific environment variable so that puma can follow symlinks to new code on redeployment
directory ENV.fetch("STACK_PATH") { "." }
# Make sure to bind to Cloud 66 specific socket so that NGINX can direct traffic here
bind ENV.fetch("BIND") { "unix:///tmp/web_server.sock" }

prune_bundler
drain_on_shutdown

# Allow puma to be restarted by `bin/rails restart` command.
plugin :tmp_restart

# Use the following to test:
# curl -v -A $(echo -ne "user\x1fagent") http://localhost:3000/
lowlevel_error_handler do |error, env|
  ...
end

# https://www.mongodb.com/docs/mongoid/current/reference/configuration/#puma
on_worker_boot do
  Mongoid::Clients.clients.each do |name, client|
    client.close
    client.reconnect
  end
end

before_fork do
  Mongoid.disconnect_clients
end

plugin :statsd

To Reproduce
Make a request with a caret in a query parameter.

curl "http://localhost:3000/?param=>"

Expected behavior
Does not raise a Puma::HttpParserError.

Desktop (please complete the following information):

  • OS: Mac, Linux
  • Puma Version: 6.4.2

@jclusso

Works fine with a browser. > should be encoded as %3E. AFAIK, curl doesn't encode.

@MSP-Greg the request is not being made with a browser.

Also, after further digging, it looks like a Puma::HttpParserError should produce a 400 response, going by the code that rescues it. When testing without a lowlevel_error_handler, I do get a 400 response.

I'm confused why it's triggering the lowlevel_error_handler since I don't see how that works. I can see that the check for a MiniSSL::SSLError calls lowlevel_error, but the one for HttpParserError just calls response_to_error.

puma/lib/puma/server.rb

Lines 533 to 546 in 58c31b2

case e
when MiniSSL::SSLError
  lowlevel_error(e, client.env)
  @log_writer.ssl_error e, client.io
when HttpParserError
  response_to_error(client, requests, e, 400)
  @log_writer.parse_error e, client
when HttpParserError501
  response_to_error(client, requests, e, 501)
  @log_writer.parse_error e, client
else
  response_to_error(client, requests, e, 500)
  @log_writer.unknown_error e, nil, "Read"
end

If that is intended, we should update the README.md with an example of the lowlevel_error_handler that returns a 400 for this error and a 500 for everything else.

puma/README.md

Lines 182 to 187 in 58c31b2

```ruby
lowlevel_error_handler do |e|
  Rollbar.critical(e)
  [500, {}, ["An error has occurred, and engineers have been informed. Please reload the page. If you continue to have problems, contact support@example.com\n"]]
end
```
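A rough sketch of what such a handler could look like, returning 400 for parser errors and 500 for everything else. The constant name `PARSE_ERROR_AWARE_HANDLER`, the response bodies, and the headers are made up for illustration. The stub class exists only so the snippet runs outside Puma, and its superclass is an assumption.

```ruby
# Stand-in so this sketch runs without Puma loaded. Inside config/puma.rb the
# real Puma::HttpParserError already exists (its superclass may differ).
unless defined?(Puma::HttpParserError)
  module Puma
    class HttpParserError < StandardError; end
  end
end

# Hypothetical handler: 400 for malformed requests, 500 for anything else.
PARSE_ERROR_AWARE_HANDLER = lambda do |error, _env|
  if error.is_a?(Puma::HttpParserError)
    [400, { "content-type" => "text/plain" }, ["Bad Request\n"]]
  else
    # Report unexpected errors to your tracker here (e.g. Rollbar, as in the README).
    [500, { "content-type" => "text/plain" }, ["An error has occurred.\n"]]
  end
end

# Assumed wiring in config/puma.rb:
#   lowlevel_error_handler do |error, env|
#     PARSE_ERROR_AWARE_HANDLER.call(error, env)
#   end

p PARSE_ERROR_AWARE_HANDLER.call(Puma::HttpParserError.new("bad request line"), {}).first
p PARSE_ERROR_AWARE_HANDLER.call(RuntimeError.new("boom"), {}).first
```

Checking the error class keeps malformed client requests out of the 500 path, so they are not reported as server faults.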

@jclusso

Stop. > is an invalid character in a URI. It needs to be encoded. See Ruby's uri and net/http std-lib items, both of them encode the character.

See https://www.rfc-editor.org/rfc/rfc3986.html#section-3.4, and also the following info in Appendix A. Collected ABNF for URI:

   pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

   query         = *( pchar / "/" / "?" )

   fragment      = *( pchar / "/" / "?" )

   pct-encoded   = "%" HEXDIG HEXDIG

   unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
   reserved      = gen-delims / sub-delims
   gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
   sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                 / "*" / "+" / "," / ";" / "="
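For illustration, a client written in Ruby could encode the value with the `uri` standard library before building the request. This is a minimal sketch; the `email` parameter name and the localhost URL are taken from the examples in this thread, not from any real API.

```ruby
require "uri"

raw_email = "john@doe.com>"

# Encode a single query value: ">" becomes "%3E" and "@" becomes "%40".
encoded_value = URI.encode_www_form_component(raw_email)

# Or build the whole query string from a hash of parameters.
query = URI.encode_www_form(email: raw_email)

puts encoded_value                      # john%40doe.com%3E
puts query                              # email=john%40doe.com%3E
puts "http://localhost:3000/?#{query}"  # a URL the parser accepts
```

The stray `>` still reaches the application, but as `%3E` inside a valid query string, so the application can reject or clean the value itself.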

Stop what? I'm well aware it's an invalid URL. I wrote that in the original post. And in my previous post I brought up the handling questions. Can you help me understand why what I described there is happening?

@jclusso

Sorry, I didn't see the edits to your original post, as I just looked at the email (original post)... Today's a bad day...

@MSP-Greg no worries. At this point I'm just trying to get to the bottom of this. Whenever you get a chance, I'd appreciate if you could give me some insight on what I addressed here.

This should be solved with updated documentation. I agree that the docs could be improved in this area.

If you don't specify a lowlevel_error_handler, Puma will respond with 400 Bad Request. Before Puma v6.4.0, the lowlevel_error_handler wasn't used at all for this error, but that changed with #3094.

I feel like it would be nice if lowlevel_error_handler had a way to return the existing response rather than having to craft a new one. Would this be something you guys would be open to?

I think that's already possible?

puma/lib/puma/server.rb

Lines 549 to 560 in 58c31b2

# A fallback rack response if +@app+ raises as exception.
#
def lowlevel_error(e, env, status=500)
  if handler = options[:lowlevel_error_handler]
    if handler.arity == 1
      return handler.call(e)
    elsif handler.arity == 2
      return handler.call(e, env)
    else
      return handler.call(e, env, status)
    end
  end
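To make the arity check in the excerpt above concrete, here is a simplified stand-in for that dispatch. It is not Puma's code: `dispatch_lowlevel_error` and the two example handlers are hypothetical names. Only a handler that takes three arguments sees the status Puma would have used.

```ruby
# Simplified stand-in for the arity-based dispatch; not Puma's implementation.
def dispatch_lowlevel_error(handler, error, env, status = 500)
  case handler.arity
  when 1 then handler.call(error)
  when 2 then handler.call(error, env)
  else        handler.call(error, env, status)
  end
end

# A three-argument handler can pass the suggested status straight through.
pass_through = proc { |_error, _env, status| [status, {}, ["error"]] }

# A one-argument handler never sees the status, so it has to pick its own.
fixed_500 = proc { |_error| [500, {}, ["error"]] }

parse_error = StandardError.new("Invalid HTTP format, parsing fails.")

p dispatch_lowlevel_error(pass_through, parse_error, {}, 400).first # 400
p dispatch_lowlevel_error(fixed_500, parse_error, {}, 400).first    # 500
```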

$ echo 'app { [200, {}, ["OK"]] }; lowlevel_error_handler { |e,env,status| [status, {}, ["error"]] }' | puma --config /dev/stdin --port 0 --log-requests
Puma starting in single mode...
* Puma version: 6.4.2 (ruby 3.2.3-p157) ("The Eagle of Durango")
*  Min threads: 0
*  Max threads: 5
*  Environment: development
*          PID: 54978
* Listening on http://0.0.0.0:63391
Use Ctrl-C to stop
2024-03-16 00:18:03 +0100 HTTP parse error, malformed request ("GET /" - (-)): #<Puma::HttpParserError: Invalid HTTP format, parsing fails. Are you trying to open an SSL connection to a non-SSL Puma?>
$ curl -s -v "http://localhost:63391/?param=>"
*   Trying [::1]:63391...
* connect to ::1 port 63391 failed: Connection refused
*   Trying 127.0.0.1:63391...
* Connected to localhost (127.0.0.1) port 63391
> GET /?param=> HTTP/1.1
> Host: localhost:63391
> User-Agent: curl/8.4.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 400 Bad Request
< Content-Length: 5
<
* Excess found in a read: excess = 28, size = 5, maxdownload = 5, bytecount = 0
* Closing connection
error