reactor / reactor-netty

TCP/HTTP/UDP/QUIC client/server with Reactor over Netty

Home Page:https://projectreactor.io


Spring Cloud Gateway with Reactor Netty: thread switch takes too long

will-zdu opened this issue · comments

commented

Expected Behavior

Thread switches happen quickly.

Actual Behavior

Thread switches take too long.

Steps to Reproduce

Use Spring Cloud Gateway; the delay shows up in org.springframework.cloud.gateway.filter.NettyRoutingFilter#filter
Spring Cloud Gateway: 2.2.6.RELEASE
Reactor Netty: 0.9.10

@Override
	public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
		URI requestUrl = exchange.getRequiredAttribute(GATEWAY_REQUEST_URL_ATTR);

                
		logger.info("log1");
		return this.httpClient.request(method, url, req -> {
			final HttpClientRequest proxyRequest = req.options(NettyPipeline.SendOptions::flushOnEach)
					.headers(httpHeaders)
					.chunkedTransfer(chunkedTransfer)
					.failOnServerError(false)
					.failOnClientError(false);

			if (preserveHost) {
				String host = request.getHeaders().getFirst(HttpHeaders.HOST);
				proxyRequest.header(HttpHeaders.HOST, host);
			}

			logger.info("log2");
			return proxyRequest.sendHeaders() //I shouldn't need this
					.send(request.getBody().map(dataBuffer ->
							((NettyDataBuffer)dataBuffer).getNativeBuffer()));
		}).doOnNext(res -> {
			ServerHttpResponse response = exchange.getResponse();
			// put headers and status so filters can modify the response
			HttpHeaders headers = new HttpHeaders();

			res.responseHeaders().forEach(entry -> headers.add(entry.getKey(), entry.getValue()));

			exchange.getAttributes().put("original_response_content_type", headers.getContentType());

			HttpHeaders filteredResponseHeaders = HttpHeadersFilter.filter(
					this.headersFilters.getIfAvailable(), headers, exchange, Type.RESPONSE);
			
			response.getHeaders().putAll(filteredResponseHeaders);
			HttpStatus status = HttpStatus.resolve(res.status().code());
			if (status != null) {
				response.setStatusCode(status);
			} else if (response instanceof AbstractServerHttpResponse) {
				// https://jira.spring.io/browse/SPR-16748
				((AbstractServerHttpResponse) response).setStatusCodeValue(res.status().code());
			} else {
				throw new IllegalStateException("Unable to set status code on response: " +res.status().code()+", "+response.getClass());
			}

			// Defer committing the response until all route filters have run
			// Put client response as ServerWebExchange attribute and write response later NettyWriteResponseFilter
			exchange.getAttributes().put(CLIENT_RESPONSE_ATTR, res);
		}).then(chain.filter(exchange));
	}

The time between log1 and log2 is far too long: it actually takes 6 seconds.
I checked CPU, load average, GC, and safepoints, and all of them look normal. Other requests are still processed fine.
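One way to narrow down where the time between log1 and log2 goes is to record a timestamp on the calling thread and another on the thread that eventually runs the callback. The following is a minimal sketch using only the JDK, not Reactor Netty code: the class name `HandoffTimer`, the method `measureHandoffNanos`, and the single-threaded executor standing in for an event loop are all assumptions made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class HandoffTimer {

    // Returns the nanoseconds between submitting a task on the calling
    // thread and the task starting to run on the executor's thread.
    static long measureHandoffNanos(ExecutorService loop) throws Exception {
        long submittedAt = System.nanoTime();      // comparable to "log1"
        Callable<Long> probe = System::nanoTime;   // comparable to "log2"
        long startedAt = loop.submit(probe).get(10, TimeUnit.SECONDS);
        return startedAt - submittedAt;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for an event loop: one thread, tasks run in order.
        ExecutorService loop = Executors.newSingleThreadExecutor();
        try {
            long nanos = measureHandoffNanos(loop);
            System.out.println("caller thread: " + Thread.currentThread().getName());
            System.out.println("handoff took " + TimeUnit.NANOSECONDS.toMicros(nanos) + " us");
        } finally {
            loop.shutdown();
        }
    }
}
```

In the real filter, the equivalent would be logging `System.nanoTime()` and `Thread.currentThread().getName()` next to log1 and log2, which would show whether the 6 seconds are spent waiting for the target thread to pick up the work.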

Possible Solution

Your Environment

  • Reactor version(s) used: 0.9.10
  • Other relevant libraries versions (eg. netty, ...):
  • JVM version (java -version):
  • OS and version (eg. uname -a):

@will-zdu You are using an unsupported version. There is also a known regression in 0.9.10 that is fixed in 0.9.14. I would strongly encourage you to update your versions.

https://github.com/reactor/reactor-netty/releases/tag/v0.9.14.RELEASE

commented

@violetagg #1371
I have seen your modification, but I still don't understand the root cause of this problem. How does subscribe not executing on the event loop cause a race condition with request, which then leads to the long execution time? Could you please explain, share the key relevant information, or point me to the part of the documentation I need to read?
My understanding is that the cause may be in
reactor.netty.channel.FluxReceive#drainReceiver
In this method, if "receiver" is empty, the loop keeps checking until "receiver" is no longer empty, and then it works. But I still haven't found why most of the delays are about 6 seconds. Does that have something to do with some configuration?
If the request call is executed before subscribe, and subscribe does not run on the event loop, is the problem that receiver is empty while the event loop thread keeps looping, and, because receiver is not volatile, the event loop thread cannot see the new value (a cache visibility issue) even after receiver is no longer empty?
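The visibility question above rests on a general Java Memory Model rule: without a happens-before edge, a write to a plain field by one thread is not guaranteed to become visible to another thread that is polling it. The sketch below illustrates only that rule and is not Reactor Netty code. The class `VisibilitySketch` and its `receiver` field are hypothetical, and it makes no claim about how `FluxReceive` is actually implemented. Declaring the field `volatile` is what guarantees that the polling thread observes the write.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class VisibilitySketch {

    // Hypothetical subscriber slot. 'volatile' guarantees that a write by
    // one thread becomes visible to other threads reading the field.
    static volatile Object receiver;

    // Starts a polling thread, then publishes a receiver from the calling
    // thread. Returns true if the polling thread saw it within the timeout.
    static boolean loopSeesReceiver(long timeoutMillis) throws InterruptedException {
        receiver = null;
        CountDownLatch seen = new CountDownLatch(1);
        Thread loop = new Thread(() -> {
            while (receiver == null) {
                Thread.yield(); // keep polling until a receiver is published
            }
            seen.countDown();
        }, "polling-loop");
        loop.setDaemon(true);
        loop.start();

        receiver = new Object(); // "subscribe" happening on another thread
        return seen.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("loop observed receiver: " + loopSeesReceiver(5000));
    }
}
```

With a plain (non-volatile) field the same loop might still terminate in practice, but the language no longer guarantees it, which is why such a bug would tend to be intermittent and hard to reproduce.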

@will-zdu Did you update at least to the latest available 0.9.x release?

commented

@violetagg I have switched to 0.9.14.RELEASE but still see timeouts. However, after restarting and checking again, no timeouts were found.

@will-zdu If 0.9.14 doesn't solve the issue ... I can only recommend upgrading to a supported version and, if the problem still exists, providing a reproducible example.

commented

@violetagg The problem reproduced again with 0.9.14 in our production environment. We will now try the latest available 0.9.x release.

@will-zdu I'm closing this one. Please upgrade to a supported version and if the problem still exists, provide some reproducible example.