nginxinc / nginx-s3-gateway

NGINX S3 Caching Gateway

[Help] Wanted to know the volume of data nginx-s3-gateway can handle

akhilputhiry opened this issue · comments

We have buckets with several terabytes of data.
Wanted to know whether anyone has done any benchmarking.

Hello,
We haven't done any formal benchmarking on this project as far as I know (my co-maintainer may have earlier in the project and can correct me when they get back from vacation).

There will be a small amount of overhead from using the library on each request for a non-cached file. However, once the authorization steps are taken care of, the request is simply routed to your S3 bucket using NGINX's proxy_pass directive, which has a pretty well understood performance profile.

Here's a link to an open source benchmark that uses the directive as part of its tests. I have not researched the quality of that particular benchmark but it should give you some idea.

After that, standard S3 access characteristics should apply. If you configure caching in NGINX, requests for cached files should be much faster.

Unless you're attempting to list out all your files using the index page feature, the size of the bucket should not matter from the perspective of this project.

Configuring NGINX for caching and performance will be specific to your workload and hardware. Here's a basic guide we published a while back that I found helpful in understanding NGINX tuning.

Hi there,

The data size of the bucket should not present a problem for performance. The biggest consumer of resources will be the number of times an AWS HTTP signature is recalculated. This means that a system with many small objects, where the same objects are rarely requested more than once, would see the biggest performance impact. However, even then, I do not imagine it being a big problem for most use cases. Moreover, you can scale out by running multiple instances of NGINX.
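
To give a feel for what "recalculating a signature" costs, here is a minimal Python sketch of the AWS Signature Version 4 key derivation and final signing step. It assumes Signature Version 4 is in use and is only an illustration of the work involved, not the gateway's actual code. The function names are made up for this example, and building the string to sign (canonical request, scope, timestamp) is left out.

```python
import hashlib
import hmac
from functools import lru_cache


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step, returning the raw 32-byte digest."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


@lru_cache(maxsize=8)
def derive_signing_key(secret_key: str, date_stamp: str, region: str,
                       service: str = "s3") -> bytes:
    """Derive the SigV4 signing key (hypothetical helper name).

    The key depends only on the secret, the date (YYYYMMDD), the region and
    the service, so the result can be reused for every request made on the
    same day. lru_cache stands in for that reuse here.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Per-request step: one more HMAC over the string to sign, hex encoded."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```

The per-request cost is a handful of SHA-256 and HMAC operations over short strings, which is why the overhead only becomes visible when nearly every request is a cache miss.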

If you wanted to run a proper performance benchmark, I would suggest that you use COSBench for the benchmark and write a custom adapter that extends the S3 adapter. You should be able to reuse the write portion of the S3 adapter, but you will need to rewrite the portion that accesses the object such that it can point to the NGINX S3 Gateway.
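
If you just want a rough first number before setting up COSBench, something like the Python sketch below can time a list of requests against the gateway. This is not a substitute for a proper benchmark: it is sequential and single-threaded, and the function names are made up for this example. Any URL you pass in (for instance a gateway running on localhost) is your own assumption about your deployment.

```python
import statistics
import time
import urllib.request
from typing import Callable, Iterable


def http_fetch(url: str) -> int:
    """Download one object and return the number of bytes received."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return len(resp.read())


def measure(urls: Iterable[str],
            fetch: Callable[[str], int] = http_fetch) -> dict:
    """Fetch each URL once, in order, and summarize the wall-clock latencies."""
    latencies = []
    total_bytes = 0
    for url in urls:
        start = time.perf_counter()
        total_bytes += fetch(url)
        latencies.append(time.perf_counter() - start)
    if not latencies:
        raise ValueError("no URLs were given")
    return {
        "requests": len(latencies),
        "bytes": total_bytes,
        "mean_s": statistics.mean(latencies),
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
    }
```

Requesting the same object twice in a row, with caching enabled in NGINX, gives a quick comparison of a cold request against a cached one. Running the same list directly against S3 gives a baseline for the gateway's overhead.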