reactor / reactor-netty

TCP/HTTP/UDP/QUIC client/server with Reactor over Netty

Home Page: https://projectreactor.io

TlsMetricsHandler throws NPE when used together with SniHandler

AndreasKasparek opened this issue

We have a Spring Cloud Gateway application using WebFlux that we recently updated from org.springframework.cloud version 2022.0.4 to 2023.0.0.
In our bean configuration we set a NettyServerCustomizer that registers a no-op ChannelMetricsRecorder (all callback methods are empty). We are not interested in the connection metrics, only in the ByteBuf allocation metrics, which we could not enable any other way.

@Bean
public WebServerFactoryCustomizer<NettyReactiveWebServerFactory> customizeNettyServerFactory() {
   // noopRecorder is an empty ChannelMetricsRecorder
   return factory -> {
       factory.addServerCustomizers(server -> server.metrics(true, () -> noopRecorder));
   };
}

With the previous Spring Cloud Gateway version this code worked fine, presumably because no SniProvider was created, but now it leads to an error when trying to establish an SSL connection.

Expected Behavior

No error when opening an SSL connection to the application (as server).

Actual Behavior

It seems that since the Spring update the org.springframework.boot.web.embedded.netty.SslServerCustomizer (spring-boot:3.2.0) now always calls reactor.netty.tcp.SslProvider.Builder#setSniAsyncMappings. This has the effect that the reactor.netty.tcp.SslProvider class will create a reactor.netty.tcp.SniProvider when constructed via builder.

When channel metrics are enabled via the above-mentioned server customizer, Reactor Netty automatically registers a TlsMetricsHandler from within the AbstractChannelMetricsHandler. This TLS metrics handler throws a NullPointerException when an SSL connection to the server is being established.

Even though the change that triggered this is probably in the Spring code, I assume that the root cause is actually a bug in reactor-netty-core (see next section).

Steps to Reproduce

When an SniProvider exists, the SslProvider#addSslHandler method delegates the work to the SniProvider by calling SniProvider#addSniHandler. In that method a new SniHandler instance is added to the pipeline via:

pipeline.addFirst(NettyPipeline.SslHandler, newSniHandler());   // Please note that the first argument is a string constant.

The AbstractChannelMetricsHandler#channelRegistered method checks whether the pipeline contains an SSL handler by doing a lookup by name, and if so it registers a TLS metrics handler:

if (ctx.pipeline().get(NettyPipeline.SslHandler) != null) {  // Note that this is again using the string constant!
    ctx.pipeline()
       .addBefore(NettyPipeline.SslHandler,   // this too
                  NettyPipeline.TlsMetricsHandler, tlsMetricsHandler());
}

The TlsMetricsHandler#channelActive method, however, asks the pipeline for an SslHandler by type instead of using the NettyPipeline.SslHandler name. An SniHandler is not an instance of SslHandler, so the pipeline contains no matching handler and the lookup returns null. The chained handshakeFuture() call on that null result then throws the exception:

ctx.pipeline()
   .get(SslHandler.class)   // returns null
   .handshakeFuture()
   ...
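
The mismatch between the two lookups can be reproduced without Netty. The following is only a minimal sketch: FakePipeline, FakeSslHandler and FakeSniHandler are hypothetical stand-ins for the real Netty types, and the handler name "sslHandler" is an arbitrary placeholder rather than the actual value of NettyPipeline.SslHandler. It assumes only what is described above, namely that the SNI handler is registered under the SSL handler name but is not an SslHandler by type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of the lookup mismatch. These are NOT the Netty classes.
class PipelineLookupDemo {
    interface Handler {}

    // Stand-in for SslHandler
    static class FakeSslHandler implements Handler {}

    // Stand-in for SniHandler: a separate type, not a subclass of the SSL handler
    static class FakeSniHandler implements Handler {}

    // Minimal pipeline that supports lookup by name and lookup by type
    static class FakePipeline {
        private final Map<String, Handler> handlers = new LinkedHashMap<>();

        void add(String name, Handler handler) {
            handlers.put(name, handler);
        }

        // Lookup by name: returns whatever was registered under that name
        Handler get(String name) {
            return handlers.get(name);
        }

        // Lookup by type: returns the first handler of the given type, or null
        <T extends Handler> T get(Class<T> type) {
            for (Handler handler : handlers.values()) {
                if (type.isInstance(handler)) {
                    return type.cast(handler);
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        FakePipeline pipeline = new FakePipeline();
        // The SNI handler is registered under the SSL handler's name
        pipeline.add("sslHandler", new FakeSniHandler());

        // The name-based check succeeds, so a metrics handler would be registered
        System.out.println("found by name: " + (pipeline.get("sslHandler") != null));
        // The type-based lookup finds nothing, so a chained call would throw an NPE
        System.out.println("found by type: " + (pipeline.get(FakeSslHandler.class) != null));
    }
}
```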

Possible Solution

I don't know what the original intention was, but I see two options. Either the SniHandler has to share a common interface with the SslHandler that exposes the handshakeFuture() needed by the TlsMetricsHandler, or the metrics handler should be registered only if a handler of type SslHandler is part of the pipeline, and not for SniHandlers (using a lookup by type instead of by name).
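
The second option could look roughly like the sketch below. It is not the change made in reactor-netty; it only illustrates the idea of guarding the registration with a lookup by type. All names here (TlsMetricsGuardSketch, FakeSslHandler, FakeSniHandler, findByType, shouldRegisterTlsMetrics) are hypothetical, and the pipeline is modelled as a plain list.

```java
import java.util.List;
import java.util.Optional;

// Sketch of a type-based guard. These are NOT the Netty or Reactor Netty classes.
class TlsMetricsGuardSketch {
    interface Handler {}

    // Stand-in for SslHandler, the only handler that can provide a handshake future
    static class FakeSslHandler implements Handler {}

    // Stand-in for SniHandler, which has no handshake future of its own
    static class FakeSniHandler implements Handler {}

    // Lookup by type over a pipeline modelled as a list of handlers
    static <T extends Handler> Optional<T> findByType(List<Handler> pipeline, Class<T> type) {
        return pipeline.stream()
                       .filter(type::isInstance)
                       .map(type::cast)
                       .findFirst();
    }

    // Register TLS metrics only when a handler of the SSL handler type is present
    static boolean shouldRegisterTlsMetrics(List<Handler> pipeline) {
        return findByType(pipeline, FakeSslHandler.class).isPresent();
    }

    public static void main(String[] args) {
        List<Handler> withSni = List.of(new FakeSniHandler());
        List<Handler> withSsl = List.of(new FakeSslHandler());

        System.out.println("register for SNI pipeline: " + shouldRegisterTlsMetrics(withSni));
        System.out.println("register for SSL pipeline: " + shouldRegisterTlsMetrics(withSsl));
    }
}
```

With this guard the registration check and the later handler lookup use the same criterion, so the two can no longer disagree. The trade-off is that no TLS handshake metrics are recorded for connections that go through an SNI handler.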

  • Reactor version(s) used: reactor-core 3.6.0
  • Other relevant library versions (e.g. netty, ...): reactor-netty-core 1.1.13, spring-boot 3.2.0
  • JVM version (java -version): Corretto-17.0.7.7.1

@AndreasKasparek Thanks for the detailed explanation! This should be fixed with #3023

Thank you very much @violetagg, that was very fast!