Streaming file upload is not working
myscfwork opened this issue · comments
from gql import gql, Client
from gql.transport.aiohttp import AIOHTTPTransport
import aiofiles

gql_query = gql('''
mutation create($input: ProductDocumentCreateMutationInput) {
  createProductDocument(input: $input) {
    productDocument {
      id
    }
  }
}
query getProductDoc($id: ID!) {
  productDocument(id: $id) {
    id
    file_category
    file
  }
}
''')

async def stream_file(filepath):
    async with aiofiles.open(filepath, "rb") as f:
        while True:
            chunk = await f.read(64 * 1024)
            if not chunk:
                break
            yield chunk

transport = AIOHTTPTransport(
    url='url',
    headers={"Authorization": f"Bearer {auth_token}"},
)

async with Client(transport=transport, fetch_schema_from_transport=True) as gql_session:
    filepath = 'path/to/product_doc.pdf'
    data = {
        "user": 'user_id',
        "file_category": 'product_doc',
        "file": stream_file(filepath),
    }
    result = await gql_session.execute(
        gql_query,
        operation_name="create",
        variable_values={"input": data},
        upload_files=True,
    )
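For reference, the read-until-empty chunking loop in stream_file can be checked in isolation; a minimal synchronous sketch of the same pattern, using io.BytesIO with a hypothetical payload (no gql or aiofiles needed):

```python
import io

def iter_chunks(fobj, chunk_size=64 * 1024):
    # Read fixed-size chunks until EOF, the same read-until-empty
    # pattern as the async stream_file generator above.
    while True:
        chunk = fobj.read(chunk_size)
        if not chunk:
            break
        yield chunk

# Hypothetical payload: two full 64 KiB chunks plus a 2 KiB remainder.
payload = b"x" * (130 * 1024)
chunks = list(iter_chunks(io.BytesIO(payload)))
```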
In the above I am trying to stream a file upload as shown in the gql documentation, but I get the following error:
[ERROR] Exception: Failed to upload {'message': 'Must provide query string.'}
Traceback (most recent call last):
File "file_upload.py", line 143, in main
result = await gql_session.execute(
File "/usr/local/lib/python3.10/site-packages/gql/client.py", line 1231, in execute
raise TransportQueryError(
gql.transport.exceptions.TransportQueryError: {'message': 'Must provide query string.'}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "file_upload.py", line 148, in main
raise Exception(f"Failed to upload {exp}")
Exception: Failed to upload {'message': 'Must provide query string.'}
System info:
- OS: Ubuntu 22.04
- Python version: 3.10
- gql version: 3.4
- graphql-core version: 3.2.3
- Does it work with a normal file upload without streaming?
- Could you try with only the create mutation inside the gql method call?
- Please post the relevant part of the schema (ProductDocumentation, productDocumentation, ...)
- Please post the debug logs
- Please try expanding the input argument in the mutation. Something like this:
mutation create($file: Upload!, $file_category: String, $user_id: ID) {
  productDocumentation(input: {file: $file, user: $user_id, file_category: $file_category}) {
    id
  }
}
I have enabled debug logs. Expanding the input argument does not work because the create mutation only accepts input as ProductDocumentCreateMutationInput; I have shared the schema below. Also, file upload works when only the create mutation is used inside the gql method without streaming, but it fails when multiple queries are used inside the gql call or when streaming is used.
So I tried the following 3 things:
- File upload works when a single query is used without streaming
gql_query = gql('''
mutation create($input: ProductDocumentCreateMutationInput) {
  createProductDocument(input: $input) {
    productDocument {
      id
    }
  }
}
''')

file = io.BytesIO(open(filepath, "rb").read())
file.name = attachment.name
data = {
    "user": 'user_id',
    "file_category": 'product_doc',
    "file": file,
}
result = await gql_session.execute(
    gql_query,
    operation_name="create",
    variable_values={"input": data},
    upload_files=True,
)
Log below:
operations {"query": "mutation create($input: ProductDocumentCreateMutationInput!) {\n createProductDocument(input: $input) {\n errors {\n field\n message\n }\n productDocument {\n id\n }\n }\n}", "operationName": "create", "variables": {"input": {"user": "UmRUZXN0VHlwZTplZTQ1ZjNhMC0zNWM5LTRjMWUtOTZjZS1kYjExNjRhMjIxN2U=", "file_category": "product_doc", "file": null}}}
04:57:56file_map {"0": ["variables.input.file"]}
04:57:56<<< {"data":{"createProductDocument":{"errors":[],"productDocument":{"id":"UmRUZXN0QXR0YWNobWVudFR5cGU6MTJmOTVjYTAtMjNkYS00NjYwLThhZDAtNGFhNGE4OWRkOTc5"}}}}
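The operations and file_map lines in the log above are the two form fields defined by the GraphQL multipart request convention: the file value is sent as null inside "operations" and its position is recorded in "map". A sketch of how those fields are built (query string abbreviated, values hypothetical):

```python
import json

# Sketch of the two multipart form fields seen in the debug log above.
operations = {
    "query": "mutation create($input: ProductDocumentCreateMutationInput!) { ... }",
    "operationName": "create",
    "variables": {
        # The file slot is null here; "map" points the uploaded part at it.
        "input": {"user": "user_id", "file_category": "product_doc", "file": None},
    },
}
file_map = {"0": ["variables.input.file"]}

operations_field = json.dumps(operations)
map_field = json.dumps(file_map)
```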
- When multiple queries are added and the same request is made as above, it gives an error
gql_query = gql('''
mutation create($input: ProductDocumentCreateMutationInput) {
  createProductDocument(input: $input) {
    productDocument {
      id
    }
  }
}
# added another query
query getProductDoc($id: ID!) {
  productDocument(id: $id) {
    id
    file_category
    file
  }
}
''')
Error Logs below:
operations {"query": "mutation create($input: ProductDocumentCreateMutationInput!) {\n createProductDocument(input: $input) {\n errors {\n field\n message\n }\n productDocument {\n id\n }\n }\n}\n\nquery productDocument($id: ID!) {\n productDocument(id: $id) {\n id\n }\n}", "operationName": "create", "variables": {"input": {"user": "UmRUZXN0VHlwZTplZTQ1ZjNhMC0zNWM5LTRjMWUtOTZjZS1kYjExNjRhMjIxN2U=", "file_category": "product_doc", "file": null}}}
file_map {"0": ["variables.input.file"]}
<<< {"errors":[{"message":"Must provide operation name if query contains multiple operations."}]}
Closing transport
[ERROR] Exception: Failed to upload {'message': 'Must provide operation name if query contains multiple operations.'}
Traceback (most recent call last):
File "file_upload.py", line 145, in main
result = await gql_session.execute(
File "/usr/local/lib/python3.10/site-packages/gql/client.py", line 1231, in execute
raise TransportQueryError(
gql.transport.exceptions.TransportQueryError: {'message': 'Must provide operation name if query contains multiple operations.'}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "file_upload.py", line 150, in main
raise Exception(f"Failed to upload {exp}")
Exception: Failed to upload {'message': 'Must provide operation name if query contains multiple operations.'}
- When only the create mutation and streaming file upload are used:
gql_query = gql('''
mutation create($input: ProductDocumentCreateMutationInput) {
  createProductDocument(input: $input) {
    productDocument {
      id
    }
  }
}
''')
Error Logs below:
operations {"query": "mutation create($input: ProductDocumentCreateMutationInput!) {\n createProductDocument(input: $input) {\n errors {\n field\n message\n }\n productDocument {\n id\n }\n }\n}", "operationName": "create", "variables": {"input": {"user": "UmRUZXN0VHlwZTplZTQ1ZjNhMC0zNWM5LTRjMWUtOTZjZS1kYjExNjRhMjIxN2U=", "file_category": "product_doc", "file": null}}}
file_map {"0": ["variables.input.file"]}
<<< {"errors":[{"message":"Must provide query string."}]}
Closing transport
[ERROR] Exception: Failed to upload {'message': 'Must provide query string.'}
Traceback (most recent call last):
File "file_upload.py", line 145, in main
result = await gql_session.execute(
File "/usr/local/lib/python3.10/site-packages/gql/client.py", line 1231, in execute
raise TransportQueryError(
gql.transport.exceptions.TransportQueryError: {'message': 'Must provide query string.'}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "file_upload.py", line 150, in main
raise Exception(f"Failed to upload {exp}")
Exception: Failed to upload {'message': 'Must provide query string.'}
Relevant Schema:
input ProductDocumentCreateMutationInput {
  clientMutationId: String
  file: MultiFileScalar
  user: ID!
  file_category: ProductDocumentTypeChoices
}
Expanding the input argument does not work because create mutation only accepts input as ProductDocumentCreateMutationInput.
It should always be possible to expand input types into GraphQL basic types and scalars.
I don't think it's going to solve your problem but you should be able to use a query like this:
mutation create($file: MultiFileScalar, $file_category: ProductDocumentTypeChoices, $user_id: ID!) {
  createProductDocument(input: {file: $file, user: $user_id, file_category: $file_category}) {
    productDocument {
      id
    }
  }
}
with variable_values now containing the variables directly instead of the single input variable.
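For illustration, the reshaping from the single input variable to top-level variables looks roughly like this (values hypothetical; the file entry is left out here since it would be an open handle):

```python
# The nested shape used so far: everything under one "input" key.
nested_input = {"user": "user_id", "file_category": "product_doc"}

# The expanded shape: each input field becomes its own top-level variable,
# matching the $user_id / $file_category variables of the expanded query.
variable_values = {
    "user_id": nested_input["user"],
    "file_category": nested_input["file_category"],
}
```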
That being said, your problem seems quite strange.
- which backend are you using? Is it publicly available?
- I noticed that the file scalar is named MultiFileScalar. What happens if you provide a list of files instead of a single file?
Could you share a working code for streaming file upload if you have?
aiohttp has the possibility to use FormData() and set a filename for the uploaded file:
https://docs.aiohttp.org/en/stable/client_quickstart.html#post-a-multipart-encoded-file

data = FormData()
data.add_field("user", 'user_id', content_type="multipart/form-data")
data.add_field("file", open("filepath", "rb"), filename="example.zip", content_type="multipart/form-data")
With gql, the following format works but the filename is set to the whole file path, i.e. /home/username/.... Is it possible to set the filename in the following request?

data = {
    "user": 'user_id',
    "file_category": 'product_doc',
    "file": open('filepath', 'rb'),
}
result = await gql_session.execute(
    gql_query,
    operation_name="create",
    variable_values={"input": data},
    upload_files=True,
)
You can find some examples in the tests/test_aiohttp.py file. Search for upload_files to find the relevant tests.
You can run specific tests by running a pytest command like this:
pytest tests/test_aiohttp.py::test_aiohttp_file_upload -s
Could you share a working code for streaming file upload if you have?
Check out the test_aiohttp_async_generator_upload test.
aiohttp has the possibility to use FormData() and set a filename for the uploaded file: https://docs.aiohttp.org/en/stable/client_quickstart.html#post-a-multipart-encoded-file
That is what we are doing. gql is open-source you know, you can check the code.
With gql, the following format works but the filename is set to the whole file path, i.e. /home/username/.... Is it possible to set the filename in the following request?
gql uses the name attribute of the provided file object, if present, for the filename parameter.
I thought you could do something like:
f = open('filepath', 'rb')
f.name = "your_name"
data = {
    "user": 'user_id',
    "file_category": 'product_doc',
    "file": f,
}
but in that case I got the error:
AttributeError: attribute 'name' of '_io.BufferedReader' objects is not writable
So one way to change the filename would be what you did above, even if it's a bit inefficient:
file = io.BytesIO(open(filepath, "rb").read())
file.name = "your_name"
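The BytesIO approach works because io.BytesIO instances accept arbitrary attributes, unlike the _io.BufferedReader returned by open() that raised the AttributeError above. A quick check with hypothetical file contents:

```python
import io

# BytesIO allows setting a name attribute, which _io.BufferedReader
# (the object returned by open(path, "rb")) does not.
file = io.BytesIO(b"%PDF-1.4 hypothetical contents")
file.name = "product_doc.pdf"
```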
In the case of the streaming uploads, we have the same kind of problem.
Doing something like this:
async_generator = file_sender(file_path)
async_generator.name = "your_name"
would generate the following error:
AttributeError: 'async_generator' object has no attribute 'name'
We can get around this by making a new class inheriting from AsyncGenerator, but it's kind of hackish:
import collections.abc

class NamedAsyncGenerator(collections.abc.AsyncGenerator):
    name = None
    inner_generator = None

    def __init__(self, inner_generator: collections.abc.AsyncGenerator, name=None):
        self.inner_generator = inner_generator
        self.name = name

    def asend(self, val):
        return self.inner_generator.asend(val)

    def athrow(self, typ, val):
        return self.inner_generator.athrow(typ, val)
that you would use like this:
async def file_sender(file_name):
    async with aiofiles.open(file_name, "rb") as f:
        chunk = await f.read(64 * 1024)
        while chunk:
            yield chunk
            chunk = await f.read(64 * 1024)

async_generator = file_sender(file_path)
named_async_generator = NamedAsyncGenerator(async_generator, "your_filename")
data = {
    "user": 'user_id',
    "file_category": 'product_doc',
    "file": named_async_generator,
}
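The wrapper idea above can be sanity-checked without aiofiles by driving it with a stand-in generator; a self-contained sketch (chunk contents hypothetical), showing that iteration is delegated and the name attribute survives:

```python
import asyncio
import collections.abc

class NamedAsyncGenerator(collections.abc.AsyncGenerator):
    # Same wrapper shape as above: implement the abstract asend/athrow
    # by delegating, and carry a name attribute for the filename.
    def __init__(self, inner_generator, name=None):
        self.inner_generator = inner_generator
        self.name = name

    def asend(self, val):
        return self.inner_generator.asend(val)

    def athrow(self, typ, val=None, tb=None):
        return self.inner_generator.athrow(typ, val, tb)

async def chunks():
    # Stand-in for file_sender: yields two hypothetical chunks.
    yield b"chunk-1"
    yield b"chunk-2"

async def main():
    wrapped = NamedAsyncGenerator(chunks(), "your_filename")
    # async for works via the AsyncGenerator ABC's __anext__ mixin,
    # which calls our delegated asend(None).
    received = [c async for c in wrapped]
    return wrapped.name, received

name, received = asyncio.run(main())
```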
I agree that it's not really clean and we should consider changing the interface to make this simpler.
@leszekhanusz FYI, your workaround gives the error gql.transport.exceptions.TransportQueryError: {'message': 'Must provide query string.'}.
Whenever I add the generator "file": named_async_generator in the input parameter, it gives this error, but it works when I use "file": io.BytesIO(open(filepath, "rb").read()). Unfortunately I upload large files and need to use the generator, which is not working.
That error message is coming from the backend.
If you still think there is a problem with gql, either provide a public backend showing the issue, or add another test to gql showing the issue.