aio-libs / aiobotocore

asyncio support for botocore library using aiohttp

Home Page: https://aiobotocore.rtfd.io

Reusable Initialization of aiobotocore S3 Client

all4one-max opened this issue

Discussed in #1110

Originally posted by all4one-max April 21, 2024
I'm in the process of migrating our Python package, used by various applications within our organization, from boto3 to aiobotocore. In our existing implementation with boto3, we ensure singleton initialization of the S3 client to optimize resource usage and avoid redundant client creation.

Existing Implementation with boto3:

  from typing import Any, Dict

  import boto3


  class AWSMeta(type):
      _instances: Dict[Any, Any] = {}

      def __call__(cls) -> Any:
          # Create at most one instance per class
          if cls not in cls._instances:
              instance = super().__call__()
              cls._instances[cls] = instance
          return cls._instances[cls]


  class S3ClientSingleton(metaclass=AWSMeta):
      _s3_client = None

      @classmethod
      def get_s3_client(cls) -> Any:
          if not cls._s3_client:
              # Lazy initialization of the S3 client
              cls._s3_client = boto3.client(
                  "s3",
                  aws_access_key_id=AWS_ACCESS_KEY,
                  aws_secret_access_key=AWS_ACCESS_KEY_SECRET,
                  region_name="us-east-1",
              )
          return cls._s3_client

However, transitioning to aiobotocore poses a challenge as it requires using context managers for client creation, and the client doesn't exist outside the context. I aim to encapsulate this singleton logic within our package itself, rather than relying on configuration in the application lifecycle.
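One pattern I have seen for holding a context-managed resource open beyond a single `async with` block is `contextlib.AsyncExitStack`: enter the context on the stack, hand out the resource, and exit the stack explicitly on shutdown. A minimal sketch, with a hypothetical `dummy_client` standing in for `session.client("s3")`:

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager


@asynccontextmanager
async def dummy_client():
    # Stand-in for the real session.client("s3") context manager
    client = {"open": True}
    try:
        yield client
    finally:
        client["open"] = False  # cleanup that __aexit__ would do


class ClientHolder:
    """Keeps one context-managed client open until aclose() is called."""

    def __init__(self) -> None:
        self._stack = AsyncExitStack()
        self._client = None

    async def get(self):
        if self._client is None:
            # Enter the context manager and keep it open on the stack
            self._client = await self._stack.enter_async_context(dummy_client())
        return self._client

    async def aclose(self) -> None:
        # Exits every context entered on the stack (i.e. closes the client)
        await self._stack.aclose()
        self._client = None


async def main():
    holder = ClientHolder()
    c1 = await holder.get()
    c2 = await holder.get()
    print(c1 is c2)     # True: same client reused
    print(c1["open"])   # True: still open
    await holder.aclose()
    print(c1["open"])   # False: cleaned up


asyncio.run(main())
```

This only works if something in the application (or the package's teardown hook) eventually calls `aclose()` on the same event loop the client was created on, which is exactly the lifecycle question this issue is about.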

I've explored several sources, including the aiobotocore documentation and relevant discussions on GitHub and Stack Overflow, but haven't found a satisfactory solution yet.

Sources referred:

  1. https://github.com/aio-libs/aiobotocore
  2. #928
  3. https://stackoverflow.com/questions/77095898/reuse-create-client-in-aiobotocore-for-better-performences (unanswered, but more or less the same use case; in my case, though, this has to live inside the package logic itself)

I tried the implementation below, but it did not work: I got a RuntimeError, "Event loop is closed". I am executing operations such as download and upload in my Celery tasks.

from typing import Any

import aioboto3


class AsyncS3ClientSingleton(metaclass=AWSMeta):
    _async_s3_client = None

    @classmethod
    async def get_s3_client(cls) -> Any:
        if not cls._async_s3_client:
            # Lazy initialization of the S3 client: enter the context
            # manager manually (note: __aexit__ is never called)
            session = aioboto3.Session()
            ctx = session.client(
                "s3",
                aws_access_key_id=AWS_ACCESS_KEY,
                aws_secret_access_key=AWS_ACCESS_KEY_SECRET,
                region_name="us-east-1",
            )
            cls._async_s3_client = await ctx.__aenter__()
        return cls._async_s3_client

But when I create the session and client for every operation, as shown below, it works fine, though it defeats the purpose of reusing the client.

      session = aioboto3.Session()
      async with session.client(
          "s3",
          aws_access_key_id=AWS_ACCESS_KEY,
          aws_secret_access_key=AWS_ACCESS_KEY_SECRET,
          region_name="us-east-1",
      ) as async_s3_client:
          await async_s3_client.upload_file(file_name, bucket, key)

Please help me fix this. I suspect it has something to do with the session being tied to an event loop: every time a new Celery task is initiated, a new event loop is created, and the cached client is still bound to the old, closed loop. That is only speculation, though.
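That speculation can be checked without AWS at all. The sketch below (my assumption of what is happening, not aiobotocore code) caches one "client" per running event loop instead of one per process; two `asyncio.run()` calls play the role of two separate Celery task invocations, each with its own loop:

```python
import asyncio

# Cache one "client" per event loop. A client created on a loop that has
# since closed must not be reused on a new loop. (A real implementation
# would also store and later exit the client's context manager, and
# evict entries for closed loops to avoid leaking them.)
_clients: dict = {}


async def make_client():
    # Stand-in for entering session.client("s3")
    return object()


async def get_client():
    loop = asyncio.get_running_loop()
    if loop not in _clients:
        _clients[loop] = await make_client()
    return _clients[loop]


async def demo():
    a = await get_client()
    b = await get_client()
    return a, b


# Two asyncio.run() calls (like two Celery task invocations) create two
# different event loops, hence two different clients
a1, b1 = asyncio.run(demo())
a2, b2 = asyncio.run(demo())
print(a1 is b1)  # True: reused within one loop
print(a1 is a2)  # False: a new loop gets a fresh client
```

If a process-wide singleton is cached across loops instead, the second loop would try to use an aiohttp connector bound to the first, closed loop, which matches the "Event loop is closed" error.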