dotnet / dotNext

Next generation API for .NET

Home Page: https://dotnet.github.io/dotNext/

IRaftHttpCluster `Content-Length` header is being misused

Arkensor opened this issue · comments

Hi,

after a very stressful day spent debugging why my real-world application deployment kept failing, I have come across a design decision that I would like to dispute, as it results in a "bug".

I am talking about the misuse of the HTTP Content-Length header. Right now, different values carry different implications: null is not the same as 0, and especially not the same as >0. The HTTP message implementation makes the (wrongful) assumption that if Content-Length has a value, the data being sent is of kind OctetStreamLogEntry, while "normal" multipart messages of type MultipartLogEntry arrive with a null Content-Length.

\src\cluster\DotNext.AspNetCore.Cluster\Net\Cluster\Consensus\Raft\Http\AppendEntriesMessage.cs L:275f

private static ILogEntryProducer<IRaftLogEntry> CreateReader(HttpRequest request, long count)
{
    string boundary;

    if (count is 0L)
    {
        // jump to empty set of log entries
    }
    else if (request.ContentLength.HasValue)
    {
        // log entries encoded as efficient binary stream
        return new OctetStreamLogEntriesReader(request.BodyReader, count);
    }
    else if ((boundary = request.GetMultipartBoundary()) is { Length: > 0 })
    {
        return new MultipartLogEntriesReader(boundary, request.Body, count);
    }

    return EmptyProducer;
}

As soon as I deployed to a cloud provider or to my own Docker setup, I noticed that my instances kept crashing. What happens is that HTTP proxy servers (e.g. Cloudflare) "correctly" process the HTTP data and set the "missing" Content-Length to the actual body length, as per the RFC 2616 specification. So when those log entries arrive, they are handled by the OctetStreamLogEntriesReader, which attempts to read some metadata, interprets the wrong bytes as sizes, and ends up trying to produce log entries with petabytes of data - at least that is what it thinks. That causes the allocator to throw because the requested size is bigger than Array.MaxLength, and that is what kept crashing my instances (yes, I purposefully did not catch that exception; it is intended to crash there).
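To make the failure mode concrete, here is a minimal repro sketch of my own - this is not the library's reader code, the boundary string is made up, and the assumption that the binary reader starts with an eight-byte little-endian length prefix is mine - showing how misreading the start of a multipart body as a length yields an absurd allocation size:

using System;
using System.Buffers.Binary;
using System.Text;

class ContentLengthMisreadDemo
{
    static void Main()
    {
        // The first bytes of a multipart body are the ASCII boundary marker,
        // not a binary length header.
        byte[] multipartPrefix = Encoding.ASCII.GetBytes("--a1b2c3d4e5f6g7h8");

        // Misreading them as a 64-bit little-endian length prefix produces
        // a nonsensical, gigantic "entry size".
        long bogusLength = BinaryPrimitives.ReadInt64LittleEndian(multipartPrefix);

        Console.WriteLine($"Bogus entry length: {bogusLength:N0} bytes");
        Console.WriteLine($"Exceeds Array.MaxLength ({Array.MaxLength:N0})? {bogusLength > Array.MaxLength}");
    }
}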

I have put a setup side by side, one with a proxy in between and one without, and all the messages are sent correctly - except that Content-Length is no longer solely controlled by what the leader sends.

I "fixed" it by putting this rather ugly middleware in front of the consensus handling to revert the header back to what the implementation "expected":

return app
    .Use(async (context, next) =>
    {
        // Hotfix for this issue: dotNext Raft misinterprets requests whose Content-Length is set.
        if (context.Request.Path.StartsWithSegments(new($"{EnvironmentSettings.PublicEndpoint.LocalPath}cluster-message-bus/v1/consensus/raft")))
        {
            if (context.Request.ContentLength.HasValue && context.Request.ContentLength.Value != 0)
            {
                context.Request.ContentLength = null;
            }
        }

        await next(context);
    })
    .UseConsensusProtocolHandler();

This works because I am not using any kind of OctetStreamLogEntry; I only get empty ones or multipart messages (which carry my own types). I am not sure how this was intended to work, but you cannot rely on having control over Content-Length. It has to be assumed that it will be equal to the actual body size when it arrives.

I propose using the existing custom headers, or adding a new one, to correctly identify the kind of data being carried and how it should be handled, so that nothing else in the transport stack can interfere with it.
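To illustrate what I mean (the "X-Log-Entries-Format" header name below is purely hypothetical and not part of dotNext), the reader could dispatch on an explicit format header that proxies leave untouched, instead of on Content-Length:

private static ILogEntryProducer<IRaftLogEntry> CreateReader(HttpRequest request, long count)
{
    string boundary;

    if (count is 0L)
    {
        // empty set of log entries
    }
    else if (request.Headers["X-Log-Entries-Format"] == "octet-stream")
    {
        // format announced explicitly by the leader; proxies do not rewrite custom headers
        return new OctetStreamLogEntriesReader(request.BodyReader, count);
    }
    else if ((boundary = request.GetMultipartBoundary()) is { Length: > 0 })
    {
        return new MultipartLogEntriesReader(boundary, request.Body, count);
    }

    return EmptyProducer;
}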

It's better to rely on the Content-Type header instead of Content-Length. The fix is already in the develop branch.
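Roughly, the idea looks like this (a sketch only, assuming the binary payload is announced as application/octet-stream; this is not the exact code from the develop branch):

private static ILogEntryProducer<IRaftLogEntry> CreateReader(HttpRequest request, long count)
{
    string boundary;

    if (count is 0L)
    {
        // empty set of log entries
    }
    else if (request.ContentType is { } contentType
        && contentType.StartsWith("application/octet-stream", StringComparison.OrdinalIgnoreCase))
    {
        // binary stream announced via Content-Type, which proxies preserve as-is
        return new OctetStreamLogEntriesReader(request.BodyReader, count);
    }
    else if ((boundary = request.GetMultipartBoundary()) is { Length: > 0 })
    {
        return new MultipartLogEntriesReader(boundary, request.Body, count);
    }

    return EmptyProducer;
}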

Sounds reasonable to me - if the content type is correctly set by the library. I have not dealt with binary data directly yet, as all my messages are JSON text based. If it works and is always set correctly, that should be a solid indicator, yeah.

Thank you for looking into it so quickly - yet again :)