Azure / azure-storage-net

Microsoft Azure Storage Libraries for .NET

Does this library use HttpClient or HttpClientFactory internally to prevent socket exhaustion?

penguinawesome opened this issue · comments

Does this Azure Storage SDK C# library use HttpClient or HttpClientFactory internally to prevent socket exhaustion?

Good question. I'm not sure if this is related, but after moving my API from v9 to v12 I had to revert because the server was crashing with out-of-memory errors. I'm collecting some info to file a proper bug report.

@bragma let me know, I'm interested in your bug as well! We are also encountering some performance issues with the v11 library. I want to know if this library uses HttpClientFactory to prevent socket exhaustion.

One thing I can share is my app memory footprint over time:

image

The part on the left is after upgrading to SDK v12, then on the right reverting back to v9.

On a larger scale in the next chart, you can see we had the app running with v9 (just a few days shown but it's been the same over months), then moved to v12 (middle of chart), then after a few days back to v9:

image

Same app, same number of requests, etc. We had to go back due to app crashes. Of course, we may have introduced a bug while moving to the new SDK, though I think that is unlikely.

In any case, an open question is how to handle the new SDK with dependency injection, in particular what lifetime should be used for the clients returned by the various "GetXXXClient" methods. To play it safe, we started by registering them all as "scoped" (i.e. one per web request). This may be the cause of the growing memory footprint, but I didn't see anything helpful in the documentation.

Can "BlobServiceClient" be a singleton? Should it be? What about multi threading?

@bragma Could you please share if possible:

  • exact V9 and exact V12 version?
  • which .NET runtime is that?
  • some approximate data about workload - requests/second, average blob size
  • which API(s) from V9 and V12 are you calling? (best if you could provide code snippet that somewhat illustrate what your app is doing)

Thanks @kasobol-msft. I am using WindowsAzure v9.3.3 and tried to move to Azure.Storage.Blobs v12.7. I am using .NET Core 3.1 in an ASP.NET project, runtime 3.1.9.
I use blobs for storing uploaded/downloaded images (sizes from 50KB to 2MB), tables for telemetry, and queues for sending emails to a separate function app. My API gets about 10 requests/second, but only a small fraction of them actively use Azure Storage: about 200 requests per day for blobs, 400 per day for queues, and 200 per day for tables (table writes are batched, up to 500 rows per request).
To provide a code snippet I'd have to disclose a lot of my source code. I use a lot of APIs for creating/deleting blobs, uploading/downloading images from streams to blobs, async copy from blob to blob, working with metadata, etc. There are too many to count, and I've not been able to reduce the problem to a small snippet.

But first things first, I need to understand how to handle the lifetime of the various clients with dependency injection. I have a "Storage" class which is passed to my controllers' constructors (so it is created on every request to my API). In v9 this just involves parsing a connection string and getting a CloudStorageAccount object, which does not seem to have any adverse effect. The v12 API is different, and the direct replacement is to create a BlobServiceClient object from the connection string. I'm not sure if this is a problem, but in any case I expect the object to simply be garbage collected, in particular if the service client is not used in a specific controller endpoint.

@bragma It indeed sounds like lifecycle of storage clients is an issue here. Please refer to this article https://devblogs.microsoft.com/azure-sdk/lifetime-management-and-thread-safety-guarantees-of-azure-sdk-net-clients/ and scope storage clients as singletons.
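
For illustration, here is a minimal sketch of what singleton registration could look like in an ASP.NET Core 3.1 `Startup` class. It is not taken from the linked article; the configuration key `Storage:ConnectionString` is a hypothetical name, and an app with several storage accounts would need its own way of telling the instances apart.

```csharp
using Azure.Storage.Blobs;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    private readonly IConfiguration _configuration;

    public Startup(IConfiguration configuration) => _configuration = configuration;

    public void ConfigureServices(IServiceCollection services)
    {
        // "Storage:ConnectionString" is a hypothetical key used only for illustration.
        string connectionString = _configuration["Storage:ConnectionString"];

        // Register one BlobServiceClient instance for the whole application
        // lifetime instead of creating a new one per web request.
        services.AddSingleton(new BlobServiceClient(connectionString));

        services.AddControllers();
    }
}
```

Controllers or services can then take `BlobServiceClient` as a constructor parameter and receive the same instance on every request.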

@kasobol-msft thanks for pointing me to the article, it is very interesting. Shouldn't this info be in the official documentation instead of a blog article?
I am still a bit uncertain about how to use the various clients provided by the SDK. In my code I create some BlobServiceClient instances from different connection strings, then use those to create BlobContainerClients (using serviceClient.GetBlobContainerClient), and then create lots of BlobClients from the BlobContainerClients (via containerClient.GetBlobClient), passing the clients around as parameters of my functions. This allows me to pass validated "references" to blobs, instead of strings containing blob names.

Is this a viable way of using clients? Are clients built from the various GetXXXClient methods safe to use in this way?

The clients returned by GetXXXClient are "lightweight", i.e. they inherit most of their internal resources from the parent client (HTTP pipeline and its policies, auth, and so on). The amount of garbage generated by obtaining clients this way should be minimal.
So having BlobServiceClient as a singleton (or with any long lifespan that makes sense for the app) and deriving clients from it is the recommended approach.
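
A rough sketch of that pattern, assuming a long-lived `BlobServiceClient` is supplied from outside (for example by dependency injection). The `ImageStore` class and the `images` container name are hypothetical and only serve as an example; the SDK calls used are `GetBlobContainerClient`, `GetBlobClient` and `UploadAsync`.

```csharp
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs;

public class ImageStore
{
    private readonly BlobContainerClient _container;

    // The BlobServiceClient passed in is expected to be the shared,
    // long-lived instance rather than one created per request.
    public ImageStore(BlobServiceClient serviceClient)
    {
        // "images" is a hypothetical container name.
        _container = serviceClient.GetBlobContainerClient("images");
    }

    // Returns a lightweight BlobClient derived from the container client.
    // It can be passed around as a reference to the blob instead of a name string.
    public BlobClient GetImage(string blobName) => _container.GetBlobClient(blobName);

    public async Task UploadAsync(string blobName, Stream content)
    {
        BlobClient blob = GetImage(blobName);
        await blob.UploadAsync(content, overwrite: true);
    }
}
```

Only the service client needs a long lifetime here; the derived container and blob clients can be created on demand.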

@kasobol-msft does this also apply to v11 of the Azure Storage SDK? Is it recommended to treat the client as a singleton and reuse it for the entire lifecycle of the web app?

This is the code we are using with v11.1.7:

// Retrieve storage account from connection string.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse("connection string");

// Create the blob client.
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

Then reuse the blobClient for the entire lifecycle?

@firephantomassasin For the V11 and lower please see here and here.