Nike-Inc / riposte

Riposte is a Netty-based microservice framework for rapid development of production-ready HTTP APIs.


ProxyRouterEndpoint response data is not available

blling opened this issue

I want to use Riposte to proxy third-party endpoints and cache the response data, so that when a caller hits the same endpoint again I can return the cached data.
But ProxyRouterEndpoint does not expose the response data when overriding handleDownstreamResponseFirstChunk, so how can I proxy and cache the response data correctly using Riposte?

Many thanks!

ProxyRouterEndpoints intentionally do not expose request or response payloads. They are chunk-streaming for both the request and the response, meaning every chunk that comes into the Riposte server is immediately proxied to its destination and then discarded, allowing it to be immediately garbage collected. This keeps Riposte memory usage stable no matter the size of the payload, and keeps lag time to a minimum. For example, you can have a Riposte server act as a router with ProxyRouterEndpoints on a machine with only a few hundred megabytes of heap allocated to the Riposte server, and it will successfully proxy a gigabyte payload without any stress to the heap memory, and without incurring the lag time cost of waiting for the entire payload to be collected in memory and then re-sent to its destination.
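For reference, a bare pass-through proxy only needs to supply the first-chunk info for the downstream call; the payload chunks are piped through automatically. A minimal sketch looks roughly like this (class name, host, port, and path template are placeholders):

public class PassThroughProxyEndpoint extends ProxyRouterEndpoint {

    @Override
    public CompletableFuture<DownstreamRequestFirstChunkInfo> getDownstreamRequestFirstChunkInfo(
            RequestInfo<?> request, Executor longRunningTaskExecutor, ChannelHandlerContext ctx) {
        // Only the first chunk (request line + headers) is built here; body chunks
        // are streamed straight through to the downstream host and discarded.
        return CompletableFuture.completedFuture(
            new DownstreamRequestFirstChunkInfo(
                "some.downstream.host", 8080, false, // placeholder host/port, plain HTTP
                generateSimplePassthroughRequest(request, request.getPath(), request.getMethod(), ctx)
            )
        );
    }

    @Override
    public Matcher requestMatcher() {
        return Matcher.match("/proxy/{anything}");
    }
}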

Your use case is different - you need to pull the payload into memory in order to do something with it. So you will need to use a StandardEndpoint and then make the proxy call with an HTTP client so that you can get the response data and cache it. Riposte has an async non-blocking HTTP client you can use that automatically performs distributed tracing, so if you need or want either of those features then take a look at AsyncHttpClientHelper. The template project has an example of using it here. If you don't need either of those features then you can use the HTTP client of your choice.
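The shape of an AsyncHttpClientHelper call is roughly the following (URL and handler body are placeholders; the cache write is where your logic would go):

RequestBuilderWrapper reqWrapper =
    asyncHttpClientHelper.getRequestBuilder("http://some.downstream.host/widget", HttpMethod.GET);

return asyncHttpClientHelper.executeAsyncHttpRequest(reqWrapper, (asyncResponse) -> {
    byte[] body = asyncResponse.getResponseBodyAsBytes();
    // Cache `body` (and any relevant headers) here, then build the caller's response.
    return ResponseInfo.<byte[]>newBuilder(body)
            .withHttpStatusCode(asyncResponse.getStatusCode())
            .build();
}, ctx);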

StandardEndpoint<I, O> requires an O type, but I do not know the O type of the third-party endpoints. It might be a byte array, a file stream, and so on; this data cannot be represented by a single Java type (there may be many), and I cannot list the return types of all the endpoints.
So:
1. What is the best practice to proxy, cache, and return these stream payloads (e.g. image, file, video, or audio streams) to the caller?
2. Even if I could list all of the types, I would have to write a separate StandardEndpoint for each one, which seems like bad practice. How can I use one StandardEndpoint for all of my proxying?
I just want one StandardEndpoint to proxy all of the third-party endpoints: return the cached data if it exists; otherwise call the third-party endpoint, cache the response data, and return it.

Looking forward to your suggestions.

@VicBell Sorry for the delay.

I thought the javadocs explained the O type in StandardEndpoint<I,O>, but they don't, so sorry about that too. O can be a raw string if you don't want any object serialization performed, and if you don't even want any string encoding performed you can have O be a byte[]. If O is a byte[], then the caller will receive those exact bytes as the response payload with no adjustment at all.

So in your case it sounds like you should set O to be a byte[] which you get from the third party endpoints and cache for later. You'll also need to cache relevant headers from the third party endpoint responses, like content-type, and set those on the response when you're returning the cached byte[]; otherwise the caller won't know how to interpret the bytes.

That should let you have one StandardEndpoint<Foo, byte[]> that can handle any payload type (image, file, video, etc), and return the data in such a way that the caller gets the same info they would have gotten if they'd called the third party endpoint(s) directly.
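Sketched out, that looks something like this (the class name and path template are just illustrations - use whatever routing scheme fits your service):

public class CachingProxyEndpoint extends StandardEndpoint<Void, byte[]> {

    @Override
    public CompletableFuture<ResponseInfo<byte[]>> execute(
            RequestInfo<Void> request, Executor longRunningTaskExecutor, ChannelHandlerContext ctx) {
        // Cache lookup, proxy call via an HTTP client, and byte[] response go here.
        throw new UnsupportedOperationException("sketch only");
    }

    // One endpoint can front every downstream path if the matcher is broad enough.
    @Override
    public Matcher requestMatcher() {
        return Matcher.match("/proxy/{downstreamPath}");
    }
}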

Thanks very much for your suggestions.
I use StandardEndpoint<Void, byte[]> for all of the proxies, and it works fine.
But large files may be a problem, since they would take a lot of memory. I think #38 mentions the same problem.

@VicBell yes, you are correct that large files would be a problem. Even if issue #38 were resolved you wouldn't be able to cache the large files in memory - you'd have to store them on disk and stream them on-demand.

So until issue #38 is resolved you'll have to be careful with large files. You could keep track of how many bytes you're caching in total, and when you reach some predefined limit (based on the size of the machine it's running on) you could release entries from the cache until you're back under a safe amount - essentially an LRU cache. Any future calls for the data that got released would need to be re-proxied, but you'd keep your service stable.

Something like that would probably need to be done anyway even without large files - the large files just make the problem show up more quickly, and a fix for issue #38 just delays the problem.
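A sketch of that byte-budgeted LRU idea, using Guava's cache with a size weigher (assuming Guava is on your classpath; the 100MB cap is an arbitrary example to tune for your machine):

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.Weigher;

// Evicts (approximately) least-recently-used entries once the total number of
// cached bytes exceeds the cap, keeping heap usage bounded.
Cache<String, byte[]> responseCache = CacheBuilder.newBuilder()
        .maximumWeight(100L * 1024 * 1024) // total byte budget for cached payloads
        .weigher((Weigher<String, byte[]>) (signature, body) -> body.length)
        .build();

// responseCache.getIfPresent(signature) on lookup; responseCache.put(signature, bytes) after a proxy call.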

I do the proxying like this:

@Override
public CompletableFuture<ResponseInfo<byte[]>> execute(RequestInfo<Void> request, Executor longRunningTaskExecutor,
                                                       ChannelHandlerContext ctx) {
    logRequest(request);

    DownstreamNode dn = getDownstreamNode(request);

    ProxyModel proxyModel = proxyService.loadBySignature(signature(request, dn));

    // Return cached data.
    if (null != proxyModel) {
        logger.debug("Proxy hit, id:{}, sysId:{}, uri:{}, svcName:{}, signature:{}, priority:{}, headers:{}",
                     proxyModel.getId(),
                     proxyModel.getSysId(),
                     proxyModel.getUri(),
                     proxyModel.getSvcName(),
                     proxyModel.getSignature(),
                     proxyModel.getPriority(),
                     proxyModel.getResponseHeaders());

        return CompletableFuture.supplyAsync(() -> {
            List<HashMap<String, String>> responseHeaders = null;
            try {
                responseHeaders = objectMapper.readValue(
                    proxyModel.getResponseHeaders(),
                    new TypeReference<ArrayList<HashMap<String, String>>>() { }
                );
            } catch (IOException e) {
                logger.error("Header deserialize error", e);
            }
            DefaultHttpHeaders headers = new DefaultHttpHeaders();
            // Restore the cached response headers.
            if (null != responseHeaders) {
                responseHeaders.forEach((map) ->
                    map.entrySet().forEach((entry) -> headers.add(entry.getKey(), entry.getValue()))
                );
            }
            return ResponseInfo.<byte[]>newBuilder(proxyModel.getResponse())
                               .withHttpStatusCode(200)
                               .withHeaders(headers)
                               .build();
        }, longRunningTaskExecutor);
    }

    // No cached data? Then call the downstream endpoint.
    String url = downstreamFullUrl(dn, request);
    RequestBuilderWrapper reqWrapper = asyncHttpClientHelper.getRequestBuilder(url, request.getMethod());

    if (request.getRawContentLengthInBytes() > 0) {
        reqWrapper.requestBuilder.setBody(request.getRawContentBytes());
    }

    reqWrapper.requestBuilder.setHeaders(request.getHeaders());
    if (!dn.isProxyHeaderHost()) {
        reqWrapper.requestBuilder.setHeader(Names.HOST, dn.getHost());
    }

    if (null != dn.getProxyServer()) {
        reqWrapper.requestBuilder.setProxyServer(buildProxyServer(dn));
    }
    reqWrapper.requestBuilder.setRequestTimeout(dn.getRequestTimeout());

    ObjectHolder<Long> startTime = new ObjectHolder<>();
    startTime.heldObject = System.currentTimeMillis();

    return asyncHttpClientHelper.executeAsyncHttpRequest(reqWrapper, (asyncResponse) -> {
        logger.info("In async response handler. Total time spent millis: {}",
                    (System.currentTimeMillis() - startTime.heldObject));
        if (logger.isDebugEnabled()) {
            logger.debug("{} proxy response headers:", url);
            asyncResponse.getHeaders().forEach((entry) ->
                logger.debug("{}:{}", entry.getKey(), entry.getValue())
            );
            logger.debug("Proxy response data:{}", asyncResponse.getResponseBody());
        }
        // Persist the response to the cache off the response path.
        CompletableFuture.runAsync(() -> saveResonse(request, asyncResponse, dn), longRunningTaskExecutor);

        return ResponseInfo.<byte[]>newBuilder(asyncResponse.getResponseBodyAsBytes())
                           .withHttpStatusCode(asyncResponse.getStatusCode())
                           .withHeaders(asyncResponse.getHeaders())
                           .build();
    }, ctx);
}

In fact, I must read all of the bytes into memory to respond to the caller, either via ResponseInfo.<byte[]>newBuilder(asyncResponse.getResponseBodyAsBytes()) (when there is no cached data and a real call is made) or via ResponseInfo.<byte[]>newBuilder(proxyModel.getResponse()) (when cached data exists and is returned to the caller directly).

1. Do you mean I should not cache in memory any endpoint data that reaches some predefined limit (with all of the proxy data stored on disk)?
2. How can I know the data size before making a real call with AsyncHttpClientHelper? (When I make a real call with AsyncHttpClientHelper, it reads all of the data into memory, which can cause an OutOfMemoryError.)
3. How can I use AsyncHttpClientHelper to write response data chunk by chunk to a cache file instead of reading all of the data into memory?

Looking forward to the implementation of #38. ^_^

Unfortunately with Riposte and AsyncHttpClientHelper as it is today there is no way to stream chunks directly to a file. There are plans to create a replacement for AsyncHttpClientHelper that allows optional chunk-streaming for the request and/or the response, but it's not here yet and probably won't be for a few months (at least).

So in the meantime you could search the internet for a different HTTP client that does support chunk-streaming the response, and use that instead of AsyncHttpClientHelper. (If you didn't have the caching requirement you could use ProxyRouterEndpoint for a straight pass-through proxy and since it does chunk-streaming you wouldn't need to worry about memory, but with the caching requirement that won't work.)
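As a stop-gap, even the plain JDK can stream a response body straight to disk without buffering the whole payload, as long as you run it on a worker thread (e.g. your longRunningTaskExecutor) rather than a Netty I/O thread. A blocking sketch:

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class StreamToDiskDownloader {
    /** Copies the response body to cacheFile in small buffers, so heap usage stays flat. */
    public static long downloadToFile(String url, Path cacheFile) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = conn.getInputStream()) {
            return Files.copy(in, cacheFile, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            conn.disconnect();
        }
    }
}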

In my head I was imagining that you could cache small payloads in an in-memory cache with a predefined total-bytes limit and an LRU policy for when that limit is reached, and big payloads would be cached in a file. What constitutes "big" vs. "small" would be a tradeoff decision you'd have to make, as would the total-bytes limit for the in-memory cache. And it would only work if you left yourself enough memory headroom that you could pull the biggest payload into memory without hitting an out-of-memory error. And that requires that you know for sure the size of the biggest payload you could ever receive. If there's no way for you to know that, then the whole idea goes down the drain.
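The small-vs-big split would then reduce to something like the snippet below (SMALL_PAYLOAD_LIMIT, signature, and cacheFileFor are hypothetical placeholders; responseCache is the byte-budgeted LRU cache sketched earlier). Note it still assumes responseBytes fit in memory at least once, which is the headroom caveat above:

if (responseBytes.length <= SMALL_PAYLOAD_LIMIT) {
    responseCache.put(signature, responseBytes);          // small: in-memory, LRU-evicted
} else {
    Files.write(cacheFileFor(signature), responseBytes);  // big: spill to disk
}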

There are a lot of "if"s in that idea which is generally a bad sign. So it sounds like your best bet for right now is to find an HTTP client that supports chunk-streaming the response so you can stream directly to disk. Sorry we don't have a full solution for your use case yet!

Thanks very much for all of your suggestions !