Spring Cloud Gateway with Reactor Netty: thread switch takes too long
will-zdu opened this issue · comments
Expected Behavior
Thread switches happen quickly.
Actual Behavior
Thread switches take too long.
Steps to Reproduce
Use Spring Cloud Gateway's org.springframework.cloud.gateway.filter.NettyRoutingFilter#filter
Spring Cloud Gateway: 2.2.6.RELEASE
Reactor Netty: 0.9.10
@Override
public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
    URI requestUrl = exchange.getRequiredAttribute(GATEWAY_REQUEST_URL_ATTR);
    logger.info("log1");
    return this.httpClient.request(method, url, req -> {
        final HttpClientRequest proxyRequest = req.options(NettyPipeline.SendOptions::flushOnEach)
                .headers(httpHeaders)
                .chunkedTransfer(chunkedTransfer)
                .failOnServerError(false)
                .failOnClientError(false);
        if (preserveHost) {
            String host = request.getHeaders().getFirst(HttpHeaders.HOST);
            proxyRequest.header(HttpHeaders.HOST, host);
        }
        logger.info("log2");
        return proxyRequest.sendHeaders() //I shouldn't need this
                .send(request.getBody().map(dataBuffer ->
                        ((NettyDataBuffer) dataBuffer).getNativeBuffer()));
    }).doOnNext(res -> {
        ServerHttpResponse response = exchange.getResponse();
        // put headers and status so filters can modify the response
        HttpHeaders headers = new HttpHeaders();
        res.responseHeaders().forEach(entry -> headers.add(entry.getKey(), entry.getValue()));
        exchange.getAttributes().put("original_response_content_type", headers.getContentType());
        HttpHeaders filteredResponseHeaders = HttpHeadersFilter.filter(
                this.headersFilters.getIfAvailable(), headers, exchange, Type.RESPONSE);
        response.getHeaders().putAll(filteredResponseHeaders);
        HttpStatus status = HttpStatus.resolve(res.status().code());
        if (status != null) {
            response.setStatusCode(status);
        } else if (response instanceof AbstractServerHttpResponse) {
            // https://jira.spring.io/browse/SPR-16748
            ((AbstractServerHttpResponse) response).setStatusCodeValue(res.status().code());
        } else {
            throw new IllegalStateException("Unable to set status code on response: "
                    + res.status().code() + ", " + response.getClass());
        }
        // Defer committing the response until all route filters have run
        // Put client response as ServerWebExchange attribute and write response later NettyWriteResponseFilter
        exchange.getAttributes().put(CLIENT_RESPONSE_ATTR, res);
    }).then(chain.filter(exchange));
}
The time between log1 and log2 is far too long: it actually takes about 6 seconds.
I checked CPU, load average, GC, and safepoints; they all look normal, and other requests are still handled normally.
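To narrow down where the time between log1 and log2 goes, it may help to record the thread name together with a timestamp at each log point, so that a slow hand-off between threads shows up directly. The following is only a minimal JDK-only sketch of that idea: HandoffTimer, Mark, and the worker-1 thread are hypothetical stand-ins and are not part of Spring Cloud Gateway or Reactor Netty.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class HandoffTimer {

    /** A timestamp plus the name of the thread that took it (hypothetical helper). */
    public static final class Mark {
        public final String label;
        public final String thread;
        public final long nanos;

        Mark(String label) {
            this.label = label;
            this.thread = Thread.currentThread().getName();
            this.nanos = System.nanoTime();
        }
    }

    /** Milliseconds elapsed between two marks. */
    public static long elapsedMillis(Mark from, Mark to) {
        return TimeUnit.NANOSECONDS.toMillis(to.nanos - from.nanos);
    }

    /**
     * Takes "log1" on the calling thread and "log2" on a separate worker thread
     * after an artificial delay, imitating the two log statements in the filter.
     */
    public static Mark[] measure(long delayMillis) {
        ExecutorService worker =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "worker-1"));
        try {
            Mark log1 = new Mark("log1");
            Mark log2 = CompletableFuture.supplyAsync(() -> {
                try {
                    Thread.sleep(delayMillis); // stands in for the unexplained delay
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return new Mark("log2");
            }, worker).join();
            return new Mark[] {log1, log2};
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        Mark[] marks = measure(50);
        System.out.println(marks[0].label + " on " + marks[0].thread);
        System.out.println(marks[1].label + " on " + marks[1].thread
                + " after " + elapsedMillis(marks[0], marks[1]) + " ms");
    }
}
```

In the real filter the same two pieces of information (thread name and elapsed time) could be added to the existing log1 and log2 statements.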
Possible Solution
Your Environment
- Reactor version(s) used: 0.9.10
- Other relevant libraries versions (eg. netty, ...):
- JVM version (java -version):
- OS and version (eg. uname -a):
@will-zdu You are using an unsupported version. There is also a known regression in 0.9.10 that is fixed in 0.9.14. I would strongly encourage you to update your versions.
https://github.com/reactor/reactor-netty/releases/tag/v0.9.14.RELEASE
@violetagg #1371
I have seen your change, but I still don't understand the root cause of this problem. How does subscribe not executing on the event loop cause a race condition with request, and how does that lead to the long execution time? Could you explain, point me to the key information, or tell me which part of the documentation I should read?
My understanding is that this may be the cause of the problem:
reactor.netty.channel.FluxReceive#drainReceiver
In this method, if "receiver" is empty, the loop keeps checking until "receiver" is no longer empty, and then it works. But I still haven't found why most of the delays are about 6 seconds. Does that have something to do with a configuration setting?
If the request call runs before subscribe, and subscribe does not run on the event loop, is the following what happens? receiver is empty, the event loop thread keeps looping, and because receiver is not volatile, the event loop thread never sees the updated value (a cache-consistency problem), even after receiver is no longer empty.
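To make the question concrete, here is a minimal sketch of the general "confine all state to one thread" pattern, in which subscribe and incoming data are both re-dispatched onto a single loop thread so that no volatile field or lock is needed. It is an illustration only, under stated assumptions: a plain single-thread executor stands in for the Netty event loop, and LoopConfinedReceiver and all of its methods are hypothetical names. It is not the Reactor Netty implementation and not the actual change from #1371.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class LoopConfinedReceiver {
    // Stand-in for an event loop: exactly one thread runs every task.
    private final ExecutorService loop;
    private volatile Thread loopThread;

    // Read and written only on the loop thread, so plain fields are safe.
    private final Queue<String> pending = new ArrayDeque<>();
    private Consumer<String> receiver;

    public LoopConfinedReceiver() {
        loop = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "fake-event-loop");
            loopThread = t;
            return t;
        });
    }

    private boolean inLoop() {
        return Thread.currentThread() == loopThread;
    }

    // Run immediately if already on the loop thread, otherwise hand off to it.
    private void runOnLoop(Runnable task) {
        if (inLoop()) {
            task.run();
        } else {
            loop.execute(task);
        }
    }

    /** May be called from any thread; the state change happens on the loop. */
    public void onData(String item) {
        runOnLoop(() -> {
            pending.add(item);
            drain();
        });
    }

    /** May be called from any thread; the state change happens on the loop. */
    public void subscribe(Consumer<String> newReceiver) {
        runOnLoop(() -> {
            receiver = newReceiver;
            drain();
        });
    }

    // Always runs on the loop thread.
    private void drain() {
        if (receiver == null) {
            return; // nothing to deliver to yet; data stays queued
        }
        String item;
        while ((item = pending.poll()) != null) {
            receiver.accept(item);
        }
    }

    public void shutdownAndWait() {
        loop.shutdown();
        try {
            loop.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Data arrives before and after subscribe, all from a non-loop thread. */
    public static List<String> demo() {
        LoopConfinedReceiver r = new LoopConfinedReceiver();
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        r.onData("a");
        r.subscribe(seen::add);
        r.onData("b");
        r.shutdownAndWait();
        return seen;
    }
}
```

Because every read and write of receiver happens on the same thread in this sketch, the visibility question about a non-volatile field does not arise; whether that matches what happens inside FluxReceive is exactly what I am asking.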
@will-zdu Did you update at least to the latest available 0.9.x release?
@violetagg I have switched to 0.9.14.RELEASE but still saw the timeout. After restarting and checking again, the timeout no longer appeared.
@will-zdu If 0.9.14 doesn't solve the issue ... I can only recommend to upgrade to a supported version and if the problem still exists, to provide some reproducible example.
@violetagg The problem reproduced again with 0.9.14 in our production environment. We will now try the latest available 0.9.x release.
@will-zdu I'm closing this one. Please upgrade to a supported version and if the problem still exists, provide some reproducible example.