terricain / aioboto3

Wrapper to use boto3 resources with the aiobotocore async backend

S3 streaming API change

Nifury opened this issue · comments

commented
  • Async AWS SDK for Python version: 9.6.0
  • Python version: 3.8
  • Operating System: Windows/Linux

Description

After updating to 9.6.0, the S3 streaming download code no longer works.

What I Did

async with s3_ob["Body"] as stream:
    file_data = await stream.read(chunk_size)

raises TypeError: read() takes 1 positional argument but 2 were given.

s3_ob["Body"] returns ClientResponse in aiohttp, and its read function no longer takes any argument.
We might need to use file_data = await stream.content.read(chunk_size).

Can confirm the problem

async with result["Body"] as stream:
    async for chunk, _ in stream.iter_chunks():
        ...

throws an error since iter_chunks() does not exist.

Python version: 3.8.10
aioboto3 version: v9.6.0

A temporary fix is to pin aioboto3 to v9.5.0.

@Nifury
There is an easy workaround backed by the aiobotocore unit tests (check it out here: link). Compare test_get_object_stream_wrapper, which reads in chunks without a context manager, with test_get_object_stream_context, which uses a context manager.

Digging deeper, the aiohttp ClientResponse source (link) suggests that the main added value of async with is probably closing the stream. test_get_object_stream_wrapper in the aiobotocore tests also makes an explicit close() call.

So based on those 2 links I'd suggest the following options:

  • No chunks + with:
async with s3_ob["Body"] as stream:
    file_data = await stream.read()
  • Chunks + explicit close:
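stream = s3_ob["Body"]
try:
    # read until the stream is exhausted (read() returns b"" at the end)
    while True:
        file_data = await stream.read(chunk_size)
        if not file_data:
            break
        ...  # process file_data here
finally:
    # no context manager is used, so close the stream explicitly
    stream.close()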

(disclaimer: this is a simplified version, did not run that exact code snippet)
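
For anyone who wants to try the "chunks + explicit close" pattern without touching S3, here is a minimal sketch. It assumes only that the body object has an awaitable read(amt) and a plain close(), as in the snippet above. read_chunks and FakeBody are hypothetical names made up for illustration and are not part of aioboto3 or aiobotocore.

```python
import asyncio


async def read_chunks(body, chunk_size):
    """Collect the payload by calling body.read(chunk_size) until it returns b""."""
    chunks = []
    try:
        while True:
            chunk = await body.read(chunk_size)
            if not chunk:  # empty bytes signals end of stream
                break
            chunks.append(chunk)
    finally:
        body.close()  # explicit close, since no context manager is used
    return b"".join(chunks)


class FakeBody:
    """Hypothetical in-memory stand-in for s3_ob["Body"] (assumption, not the real class)."""

    def __init__(self, data):
        self._data = data
        self._pos = 0
        self.closed = False

    async def read(self, amt=None):
        end = len(self._data) if amt is None else self._pos + amt
        chunk = self._data[self._pos:end]
        self._pos += len(chunk)
        return chunk

    def close(self):
        self.closed = True
```

With a real response you would pass s3_ob["Body"] instead of FakeBody, and probably process each chunk as it arrives rather than joining everything in memory.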

Hope that helps

commented

Thanks! It seems that s3_ob["Body"] returns StreamingBody which implements read(amt), but its __aenter__ function returns the underlying ClientResponse.
Mystery solved!
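
To illustrate the explanation above, here is a rough sketch using stand-in classes only. FakeContent, FakeResponse and FakeStreamingBody are hypothetical and merely imitate the behaviour described in this thread (a wrapper whose read(amt) works, but whose __aenter__ returns the underlying response). They are not the real aiohttp or aiobotocore implementations.

```python
import asyncio


class FakeContent:
    """Stand-in for the response's `content` stream (hypothetical)."""

    def __init__(self, data):
        self._data = data

    async def read(self, n=-1):
        if n < 0:
            n = len(self._data)
        chunk, self._data = self._data[:n], self._data[n:]
        return chunk


class FakeResponse:
    """Stand-in for the raw response: read() accepts no size argument."""

    def __init__(self, data):
        self.content = FakeContent(data)

    async def read(self):
        return await self.content.read()

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc):
        return None


class FakeStreamingBody:
    """Stand-in for the wrapper: read(amt) works, __aenter__ returns the raw response."""

    def __init__(self, response):
        self._raw = response

    async def read(self, amt=None):
        return await self._raw.content.read(-1 if amt is None else amt)

    async def __aenter__(self):
        return await self._raw.__aenter__()

    async def __aexit__(self, *exc):
        return await self._raw.__aexit__(*exc)


async def demo():
    body = FakeStreamingBody(FakeResponse(b"hello world"))
    first = await body.read(5)  # the wrapper accepts a size
    async with body as stream:  # `stream` is the raw response, not the wrapper
        entered_type = type(stream).__name__
        try:
            await stream.read(5)  # the raw response's read() takes no size
            error = None
        except TypeError as exc:
            error = exc
        rest = await stream.content.read(6)  # sized reads go through .content
    return first, entered_type, error, rest
```

Running asyncio.run(demo()) shows the same shape of failure as the original report: the sized read works on the wrapper but fails on the object bound by async with.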

This issue has been addressed in release 10.0.0.

commented

This issue has been addressed in release 10.0.0.

I updated to 10.0.0 but the issue still exists.

Did you also update aiobotocore to 2.3.4? In my case, the fix in aiobotocore 2.3.4 resolved the issue I was experiencing. Admittedly, the problem I ran into is not identical to yours, but it produced similar errors that I believed to be related. Apologies for not clarifying that assumption.

The change utilized something similar to the possible fix you suggested when you opened the ticket.

It changes

async for chunk, _ in self._raw_stream.iter_chunks():

to

async for chunk, _ in self._raw_stream.content.iter_chunks():
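
As a small illustration of that one-line change, the sketch below iterates chunks through the .content attribute rather than on the raw stream itself. FakeContent and FakeRawStream are hypothetical stand-ins. The only assumption carried over from the diff is that iter_chunks() lives on .content and yields (chunk, flag) pairs.

```python
import asyncio


class FakeContent:
    """Hypothetical stand-in for the stream object that exposes iter_chunks()."""

    def __init__(self, chunks):
        self._chunks = chunks

    async def iter_chunks(self):
        for chunk in self._chunks:
            yield chunk, True  # (data, flag) pairs, as unpacked in the diff above


class FakeRawStream:
    """Hypothetical stand-in for self._raw_stream: it has .content but no iter_chunks()."""

    def __init__(self, chunks):
        self.content = FakeContent(chunks)


async def collect(raw_stream):
    """Gather every chunk by iterating over raw_stream.content, not raw_stream."""
    out = []
    async for chunk, _ in raw_stream.content.iter_chunks():
        out.append(chunk)
    return out
```

Calling raw_stream.iter_chunks() directly on such an object fails because the method does not exist there, which matches the error reported earlier in this thread.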

commented

Yes, the aiobotocore version is 2.3.4. Unfortunately the change doesn't fix this issue.

I don't think it's a big deal though; I was only suggesting updating the documentation. 😁