masterzen / nginx-upload-progress-module

Nginx module implementing an upload progress system that monitors RFC1867 POST uploads as they are transmitted to upstream servers.

Home Page: http://wiki.codemongers.com/NginxHttpUploadProgressModule

Using the module in a cluster architecture (many upload nodes behind an LB)

seletskiy opened this issue

I'm using the upload-progress module in a cluster architecture, i.e. I have many upload servers behind a load balancer (operating at the TCP level). The problem is that when I make an upload request, I only know the load balancer's address, not the upload server's address, so there is no way to get the upload progress from the server that is handling the current upload.

Example: 1 LB, 3 upload servers (A, B, C). When I make an upload request, it is handled by, say, server A (chosen by the LB). But when I make an upload progress request, it is handled by server B, the next request by server C, then by A (round-robin). Obviously, I get two error messages (upload ID not found) and only one correct answer.

So I want a mechanism to get the upload progress reliably. There is a native way to do this in nginx: the proxy_next_upstream directive, which can pass a request to another server when the current server answers with a bad HTTP code.

Here is my suggestion: add a way for the upload-progress module to return an HTTP error code, so that the answer can be handled with the proxy_next_upstream directive.

Here are my patch and an example configuration: seletskiy/nginx-upload-progress-module@42b6b60d

What I usually recommend is to configure the LB to stick to a back-end based on the X-Progress-ID.
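A minimal sketch of that idea, assuming the balancer is an HTTP-level nginx (not the pure TCP balancer used in this thread), an nginx recent enough to have the upstream hash directive (1.7.2+), and clients that send the ID in the X-Progress-ID header; the node names are placeholders:

upstream upload_nodes {
    # pin every X-Progress-ID to the same back-end
    hash $http_x_progress_id consistent;
    server node_a;
    server node_b;
    server node_c;
}

server {
    listen 80;

    location / {
        proxy_pass http://upload_nodes;
    }
}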

I had a look at your configuration example (I'm not familiar with proxy_next_upstream) and I don't understand how it can work. Is the configuration the one on the LB (assuming it is nginx and not something else), or is it the configuration of one of your upstream servers?
If it's the config of the LB, I fail to see how it can try more than 2 upstreams.
Can you elaborate?
Have you tried this patch on a real infrastructure?

Thanks for the answer.

I couldn't configure the LB the way you propose, because it balances packets purely at the TCP layer, without any knowledge of the protocol.

The configuration example file I suggest is placed on all upload nodes behind the LB, and this configuration is the same for all nodes.

Each upload node has an upstream that contains all the other upload nodes except itself.

My scheme works this way:

  1. An upload progress request comes to the first upload server;
  2. The upload progress module checks the upload and, if it is not found, returns a 404 Not Found HTTP code;
  3. That error code is handled by the error_page directive, so the request is passed to the @fallback location;
  4. Next, in the @fallback location, the proxy_pass directive takes control: it passes the current request (the upload progress request) to the specified upstream;
  5. If a server in the upstream answers with a 'good' HTTP code (200 OK), the answer is returned to the client;
  6. If a server in the upstream answers with 404 Not Found, then proxy_next_upstream transfers the request to the next server in the upstream, and the answer is checked against rules 5-6 again.

If the proxy_next_upstream handler sees a configured HTTP code in the answer (http_404 in my example), it passes the request to the next server in the upstream in a round-robin manner.
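To make the scheme concrete, here is a minimal sketch of how the relevant part of a node's configuration could look (node A shown); the upstream name, peer addresses and /progress/ location are illustrative, and it assumes the patched module answers 404 when the upload ID is unknown locally:

upstream other_upload_nodes {
    # every upload node except this one (node A)
    server node_b;
    server node_c;
}

location /progress/ {
    report_uploads uploads-zone;
    # an unknown ID makes the patched module answer 404, which error_page
    # turns into an internal redirect to the @fallback location
    error_page 404 = @fallback;
}

location @fallback {
    # probe the other nodes; a peer that also answers 404 makes
    # proxy_next_upstream move on to the next server in the upstream
    proxy_pass http://other_upload_nodes;
    proxy_next_upstream error timeout http_404;
}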

I tested the patch on a test bench of several machines, not on a real, highly loaded project, but I intend to.

Any thoughts?

Your solution is interesting.
So if I understood correctly, every upload server has the other upload servers as potential upstreams, to which it can forward a probe request.
Let's imagine you have 10 upload servers; in the worst case, with very bad luck, a single probe will be proxied to every upload server. This can create artificial load and worsen the latency of the answer.
I don't really have any generic solution for this problem, but I have the feeling that trying to solve this at the upload progress level is not the right way.
I'll think about that and see what we can do.

Yep, you're absolutely right.

The proposed solution is not the right way to solve the problem. But I don't see an easy, graceful solution for now, so I'll try to use this one in production.

BTW, there are some more complicated ways:

  1. Add another server between the LB and the US (upload servers) that acts as central storage for X-Progress-IDs, so that all progress requests are proxied through this server to the correct node;
  2. Give each node knowledge of the uploads started on the other nodes, so that if a node has no information about the requested ID, it proxies the request to the upstream that is handling the corresponding upload.

I think I found a solution (a somewhat obvious one). I will implement it soon and send a patch for your review.

I think there should be a directive called, for example, upload_host_id $hostid. When specified, the module should return a field host_id = $hostid in the answer to the progress request.
So, the next time the client requests the upload status, it can provide the received $hostid in its request, and the load balancer can use this information to route the request to the specific upload server.

Example configuration:

Upload Node A:

location /progress/ {
    report_uploads uploads-zone;
    upload_host_id node_a;
}

Upload Node B:

location /progress/ {
    report_uploads uploads-zone;
    upload_host_id node_b;
}

Load Balancer:

location /progress/node_a/ {
    proxy_pass http://node_a;
}

location /progress/node_b/ {
    proxy_pass http://node_b;
}
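For completeness, a sketch of the pieces the example leaves implicit, assuming node_a and node_b are upstreams defined on the load balancer (the addresses are placeholders):

upstream node_a {
    server 10.0.0.1;    # upload node A
}

upstream node_b {
    server 10.0.0.2;    # upload node B
}

The client would read the host_id field from the progress answer and prefix its following progress requests with /progress/node_a/ (or /progress/node_b/), so the load balancer can route them to the node that actually holds the upload state.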

Did any idea finally solve the problem?

Closing this issue since it's awfully outdated.