aio-libs / aiobotocore

asyncio support for botocore library using aiohttp

Home Page: https://aiobotocore.rtfd.io

aiobotocore blocks the event loop with I/O in several locations

ekzhang opened this issue · comments

Describe the bug

Hi, we've been using aiobotocore through aioboto3, a thin wrapper, in a production environment to access S3 resources:

session = aioboto3.Session()
async with session.client("s3") as s3:
    pass

This code calls directly into aiobotocore. Through runtime profiling on a production server, I noticed that aiobotocore blocks the main thread, preventing any other tasks from running in the asyncio event loop. Here is an example stack trace (aiobotocore v2.4.2) pointing to a line of code that loads SSL certificate locations from the file system, using blocking I/O on the main thread:

Thread 1 (active): "MainThread"
    __init__ (aiobotocore/httpsession.py:109)
    __init__ (aiobotocore/utils.py:38)
    __init__ (aiobotocore/utils.py:92)
    create_credential_resolver (aiobotocore/credentials.py:91)
    _create_credential_resolver (aiobotocore/session.py:51)
    get_component (botocore/session.py:1081)
    get_credentials (aiobotocore/session.py:80)
    _create_client (aiobotocore/session.py:169)
    __aenter__ (aiobotocore/session.py:26)

ca_certs = get_cert_path(verify)
if ca_certs:
    ssl_context.load_verify_locations(ca_certs, None, None)
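The obvious workaround for this particular call is to push the blocking read into a thread. A minimal sketch of that idea (the helper name and wiring are mine, not aiobotocore's; assumes Python 3.9+ for asyncio.to_thread):

```python
import asyncio
import ssl


async def load_verify_locations_async(ssl_context: ssl.SSLContext, ca_certs: str) -> None:
    # load_verify_locations() reads the CA bundle from disk with blocking
    # I/O; running it in a worker thread keeps the event loop responsive.
    await asyncio.to_thread(ssl_context.load_verify_locations, ca_certs)
```

The catch is that aiobotocore currently does this load inside a synchronous `__init__`, where there is no `await` point, which is part of why the fix is not a one-liner.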

From more runtime profiling of the S3 provider on aiobotocore v2.4.2, besides the SSL-context loading on line 109 of httpsession.py, I noticed stack traces for other locations in aiobotocore that run blocking I/O on the main thread, also inside _create_client:

  • Again inside _create_client, on line 170, it calls _get_internal_component on the session

    endpoint_resolver = self._get_internal_component('endpoint_resolver')
    exceptions_factory = self._get_internal_component('exceptions_factory')
    config_store = self.get_component('config_store')
    defaults_mode = self._resolve_defaults_mode(config, config_store)

    which goes into the botocore package and eventually ends up calling open(file).read() on the main thread:

    https://github.com/boto/botocore/blob/e965ce6f0d68ab8108b307ce10dbf198dd941f20/botocore/loaders.py#L171-L179

  • Inside client.py's create_client function, on line 47, it calls _load_service_model from botocore, which eventually calls into the same function from botocore as above that uses blocking I/O to read JSON configuration from the file system.

    service_name = first_non_none_response(responses, default=service_name)
    service_model = self._load_service_model(service_name, api_version)
    cls = await self._create_client_class(service_name, service_model)

The first blocking call, the SSL-context loading at aiobotocore/httpsession.py:109, appears in the most places, since it also runs downstream of functions like create_credential_resolver(), Session._create_client(), and AioRequestSigner.sign(), which is used in S3 methods involving presigned URLs.
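For anyone trying to confirm this kind of blocking in their own service, asyncio's debug mode logs any task step that holds the loop too long. A self-contained sketch (the 50 ms threshold is arbitrary, and time.sleep stands in for the blocking file/SSL I/O described above):

```python
import asyncio
import logging
import time

logging.basicConfig(level=logging.WARNING)


async def main():
    # In debug mode, asyncio warns about any callback or task step that
    # runs longer than slow_callback_duration without yielding.
    asyncio.get_running_loop().slow_callback_duration = 0.05
    time.sleep(0.2)  # stand-in for blocking I/O inside a coroutine


# debug=True enables the slow-callback warnings on the "asyncio" logger.
asyncio.run(main(), debug=True)
```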

Checklist

I have reproduced this in an environment where pip check passes without errors
  • I have provided pip freeze results
  • I have provided sample code or detailed way to reproduce
  • I have tried the same code in botocore to ensure this is an aiobotocore specific issue
I have tried similar code in aiohttp to ensure this is an aiobotocore specific issue
  • I have checked the latest and older versions of aiobotocore/aiohttp/python to see if this is a regression / injection

pip freeze results

aioboto3==10.4.0
aiobotocore==2.4.2
aiohttp==3.8.4
aioitertools==0.11.0
aiosignal==1.3.1
async-timeout==4.0.2
attrs==23.1.0
botocore==1.27.59
charset-normalizer==3.2.0
frozenlist==1.3.3
idna==3.4
jmespath==1.0.1
multidict==6.0.4
python-dateutil==2.8.2
six==1.16.0
typing-extensions==4.7.1
urllib3==1.26.16
wrapt==1.15.0
yarl==1.9.2

Environment:

  • Python Version: 3.9
  • OS name and version: Ubuntu 20.04

there isn't a core async file system concept yet in python, I don't think we're going to spawn threads to fix this

Makes sense, thanks for the reply. I understand that there are tradeoffs, and I appreciate your attention to simplicity in the project.

If it helps bring some color to others though, we ended up not using aiobotocore due to the event-loop-blocking issue, since it caused large tail latency spikes (>300 ms) in our server when other requests would be queued up behind each request that created a client. We saw our tail latencies drop more than 20x immediately after replacing it with plain boto3 operations on another thread.

those file costs are just once per session/client IIRC, you probably can fix them by keeping your session/client alive for the life of your application

note that clients are heavyweight as each client is associated with a connection pool

@ekzhang: On an off-topic note, can you please share how you are doing runtime profiling of your Python code?
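(For reference, and independent of whatever the original poster used: the trace format above looks like the output of a sampling profiler such as py-spy. A stdlib-only way to take a similar one-off snapshot is faulthandler, which dumps every thread's current Python stack. The dump_stacks helper below is my own wrapper, not part of any library.)

```python
import faulthandler
import tempfile


def dump_stacks() -> str:
    # faulthandler writes directly to a file descriptor, so give it a real
    # temporary file rather than an in-memory buffer.
    with tempfile.TemporaryFile(mode="w+") as f:
        faulthandler.dump_traceback(file=f, all_threads=True)
        f.seek(0)
        return f.read()


if __name__ == "__main__":
    # Calling this from a watchdog thread or signal handler shows what the
    # main thread is doing when the event loop appears stuck.
    print(dump_stacks())
```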

ok ya, your session and client should be long-lived, that's just a fact of life with botocore. You don't want to keep re-creating the connection pools or SSL contexts

For one more perspective, I just bumped into this same issue with a distributed task queue system with multiple workers. For architectural reasons I have to initialize aiobotocore clients per task (we don't know the credentials until the task runs). The code:

async with session.create_client("s3") as s3:
    ...

blocks the main event loop for ~5 to 10 seconds. I don't mind waiting the seconds on every task startup, but blocking the main event loop causes all sorts of heartbeat timeouts on the workers.

Running the blocking calls @ekzhang mentioned using asyncio.to_thread() or something would solve this for my use case. I'm trying to decide now if there's a way to hack around this (maybe some kind of session caching system per worker) or if I also need to switch to regular botocore in my own threads.

File I/O blocking aside, aiobotocore client init seems slower than botocore in general:

import time

import aiobotocore.session
import boto3


async def main():
    t1 = time.perf_counter()
    async with aiobotocore.session.get_session().create_client("s3") as s3:
        pass
    t2 = time.perf_counter()

    print("aiobotocore elapsed:", t2 - t1)

    t1 = time.perf_counter()
    s3 = boto3.session.Session().client("s3")
    t2 = time.perf_counter()

    print("boto3 session elapsed:", t2 - t1)

    t1 = time.perf_counter()
    s3 = boto3.client("s3")
    t2 = time.perf_counter()

    print("boto3 client elapsed:", t2 - t1)


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())

output:

$ py dev\aioboto_speeds.py
aiobotocore elapsed: 2.447070299880579
boto3 session elapsed: 1.494525299873203
boto3 client elapsed: 1.3479881000239402

In my real-world task worker usage I saw an even bigger difference, going from ~4 seconds with aiobotocore to ~0.5 seconds with asyncio.to_thread(boto3.client, "s3")
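Expanding on that pattern, here is a dependency-free sketch of moving client construction off the loop (make_client is a placeholder I'm using instead of boto3.client, to keep the example runnable without boto3):

```python
import asyncio


def make_client(service_name: str) -> str:
    # Placeholder for boto3.client(service_name), which does blocking file
    # and network I/O and therefore belongs in a worker thread.
    return f"{service_name}-client"


async def main() -> None:
    # Pass the callable and its arguments separately: writing
    # to_thread(make_client("s3")) would execute the blocking call on the
    # event loop before to_thread ever runs.
    s3 = await asyncio.to_thread(make_client, "s3")
    print(s3)  # -> s3-client


asyncio.run(main())
```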

I can open a new issue if you think this is worth looking into. That time really adds up in a concurrent task-queue situation (especially when the main event loop blocks too), but I understand if you feel that's acceptable for what should only be a once-per-process client session.