Out of memory when using graphql with DataLoader incorrectly
yanghg-basefx opened this issue · comments
See this code:
```python
import graphene
from graphene import Schema
from promise import Promise
from promise.dataloader import DataLoader

class Type2Loader(DataLoader):
    def batch_load_fn(self, keys):
        # Now we have 20,000 keys ...
        return Promise.resolve(range(10000))  # ... but we only return 10,000 results

type2_loader = Type2Loader()

class Type2(graphene.ObjectType):
    id = graphene.ID()

    def resolve_id(self, info):
        return self

class Type1(graphene.ObjectType):
    id = graphene.ID()
    type2 = graphene.Field(Type2)

    def resolve_id(self, info):
        return self

    def resolve_type2(self, info):
        return type2_loader.load(self)  # Imagine type2_loader.load being called 20,000 times

class Query(graphene.ObjectType):
    type1 = graphene.List(Type1)

    def resolve_type1(self, info):
        return range(20000)  # We have 20,000 results

schema = Schema(query=Query)
```
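For contrast, a correct `batch_load_fn` must return exactly one value per key, in the same order as the keys. Here is a minimal sketch of that contract in plain Python (the `Promise` wrapper is left out for brevity, and `str(key)` is a hypothetical stand-in for the real per-key lookup):

```python
def batch_load_fn(keys):
    # DataLoader contract: return one value per key, in key order.
    # str(key) stands in for the real lookup (database query, API call, etc.).
    return [str(key) for key in keys]

keys = list(range(20000))
values = batch_load_fn(keys)
assert len(values) == len(keys)  # the invariant the broken loader above violates
```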
Now, if you run this:
```python
query = '''
{
    type1 {
        id
        type2 {
            id
        }
    }
}
'''
schema.execute(query).to_dict()
```
GraphQL will call format_error inside to_dict. Most of the time this is fine, but note that DataLoader embeds all of the keys and values in its error message, so with a very long list you get an error message like:
```
DataLoader must be constructed with a function which accepts Array<key> and returns Promise<Array<value>>, but the function did not return a Promise of an Array of the same length as the Array of keys.
Keys:
[0, 1, 2, 3, 4, 5, all the keys, 20000]
Values:
[(0, '0'), (1, '1'), all the values, [10000, '10000']]
```
Now imagine: to_dict formats this message 20,000 times to fill the errors list, which costs more than 10 GB of memory. That shouldn't happen, even though it starts from incorrect user code.
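A rough back-of-the-envelope sketch of why this blows up (the sizes here are estimates computed from string lengths, not measurements of the actual library):

```python
# One error message embeds the repr of all 20,000 keys and 10,000 values;
# graphql-core then formats a copy of that message for each failed resolver.
keys = list(range(20000))
values = [(i, str(i)) for i in range(10000)]

message = (
    "DataLoader must be constructed with a function which accepts Array<key> "
    "and returns Promise<Array<value>>, but the function did not return a "
    "Promise of an Array of the same length as the Array of keys.\n"
    "Keys:\n%r\nValues:\n%r" % (keys, values)
)

per_error = len(message)       # characters in ONE formatted message (~hundreds of KB)
total = per_error * len(keys)  # one copy per failed type2 resolver: gigabytes
print(per_error, total)
```

Even this crude character count lands in the gigabyte range before accounting for Python string overhead, which is consistent with the 10 GB observed.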