dotnet / extensions

This repository contains a suite of libraries that provide facilities commonly needed when creating production-ready applications.

HTTP resiliency features don't work with the .NET gRPC client

DamianEdwards opened this issue

The HTTP resiliency features, including those added by the IHttpClientBuilder.AddStandardResilienceHandler method, don't apply to gRPC calls even though those calls go through configured HttpClient instances. This is because the gRPC stack doesn't expose error details at the HTTP request level in the way the resiliency features expect (e.g. via HTTP status codes).

The following code example, typical of setting up a gRPC client in a .NET server application, will not actually result in the standard resiliency features being applied to gRPC calls:

builder.Services.AddGrpcClient<Basket.BasketClient>(o => o.Address = new("http://basket-api"))
    .AddStandardResilienceHandler();

Consider adding support for the standard resiliency patterns to the .NET gRPC client stack in a similar fashion to those added to the HttpClient stack so that resiliency features like Circuit Breaker can be easily added by default.

/Cc @JamesNK @davidfowl

The easy enhancement is to improve the HttpClientResiliencePredicates to also detect gRPC calls and handle retriable status codes:

internal static bool IsTransientHttpFailure(HttpResponseMessage response)

This should make both the retry and circuit breaker strategies work for gRPC. The other issue is the handling of streamed calls, which I am not sure how to address.

gRPC always returns a 200 status code. Failure is communicated in the grpc-status trailer.

I haven't looked at how resilience works, but I'm guessing the retry happens inside an HTTP handler's SendAsync. gRPC supports streaming, so an error can occur long after the response status is returned and SendAsync has completed.

I think a known limitation will be that streaming gRPC calls won't be retried. However, failing unary calls should be detectable. Look for a 200 status code and also check the response headers for grpc-status. They will both be available in SendAsync.
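For illustration, here is a minimal sketch of the check described above, using only System.Net.Http. The class and method names (GrpcFailureDetection, IsTransientGrpcFailure) are hypothetical and not part of any library, and the set of status codes treated as retriable is an assumption; only the numeric values come from the gRPC status code list. It covers only the case where grpc-status arrives in the response headers, so it would not see errors that occur later in a streamed call.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;

// Hypothetical helper, not part of Microsoft.Extensions.Http.Resilience.
public static class GrpcFailureDetection
{
    // Assumption: which gRPC status codes count as transient is a policy choice.
    // 4 = DeadlineExceeded, 8 = ResourceExhausted, 14 = Unavailable.
    private static readonly int[] RetriableGrpcStatusCodes = { 4, 8, 14 };

    public static bool IsTransientGrpcFailure(HttpResponseMessage response)
    {
        // gRPC reports failures with HTTP 200, so a non-200 response is left
        // to the regular HTTP status code checks.
        if (response.StatusCode != HttpStatusCode.OK)
        {
            return false;
        }

        // grpc-status appears in the headers only when the server failed
        // before sending any message.
        if (!response.Headers.TryGetValues("grpc-status", out var values))
        {
            return false;
        }

        return int.TryParse(values.FirstOrDefault(), out var code)
            && Array.IndexOf(RetriableGrpcStatusCodes, code) >= 0;
    }
}
```

A predicate along these lines could be combined with the existing HTTP transient-failure check, so that non-gRPC responses keep their current behavior.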

> Failure is communicated in grpc-status trailer.

The trailer is available only after the response body has been fully read, is that correct? I am wondering how we can ensure that the trailer is available for gRPC calls. Otherwise, the retries won't work.

Will buffering the content work?

> Will buffering the content work?

No.

If an error happens before any content is returned by the server, then grpc-status is in the headers. That is the scenario that will work. It's confusingly named Trailers-Only in the spec - https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md#responses
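To make the two cases concrete, here is a rough sketch of where grpc-status can show up on an HttpResponseMessage. The names (GrpcStatusLocator, FindGrpcStatus) are hypothetical. It assumes the usual HttpClient behavior that TrailingHeaders is only populated once the response body has been read to the end, which is why only the first branch is useful inside SendAsync.

```csharp
using System.Linq;
using System.Net.Http;

// Hypothetical helper for illustration only.
public static class GrpcStatusLocator
{
    // Returns the raw grpc-status value and where it was found, or null if absent.
    public static (string Value, string Location)? FindGrpcStatus(HttpResponseMessage response)
    {
        // Trailers-Only response: the server failed before sending any message,
        // so grpc-status is already in the response headers.
        if (response.Headers.TryGetValues("grpc-status", out var headerValues))
        {
            return (headerValues.First(), "headers");
        }

        // Normal response: grpc-status is a real trailer and can only be seen
        // after the body has been consumed.
        if (response.TrailingHeaders.TryGetValues("grpc-status", out var trailerValues))
        {
            return (trailerValues.First(), "trailers");
        }

        return null;
    }
}
```

Inside a handler, only the "headers" case can be relied on without consuming the response stream, which matches the limitation discussed above.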

ITNOA

Are there any plans to implement specific extensions to support gRPC in Microsoft.Extensions.Resilience?

thanks

> Are there any plans to implement specific extensions to support gRPC in Microsoft.Extensions.Resilience?

That is what this issue is tracking. No committed timelines yet, so for now we just want to continue this discussion.