graphql-python / graphql-core-legacy

GraphQL base implementation for Python (legacy version – see graphql-core for the current one)

Out of memory when misusing graphql with DataLoader

yanghg-basefx opened this issue

See this code:

import graphene
from graphene import Schema
from promise import Promise
from promise.dataloader import DataLoader


class Type2Loader(DataLoader):
    def batch_load_fn(self, keys):
        # By now the loader has batched all 20,000 keys...
        return Promise.resolve(range(10000))  # ...but we return only 10,000 values


type2_loader = Type2Loader()


class Type2(graphene.ObjectType):
    id = graphene.ID()

    def resolve_id(self, info):
        return self


class Type1(graphene.ObjectType):
    id = graphene.ID()
    type2 = graphene.Field(Type2)

    def resolve_id(self, info):
        return self

    def resolve_type2(self, info):
        return type2_loader.load(self)  # called once per Type1, i.e. 20,000 times


class Query(graphene.ObjectType):
    type1 = graphene.List(Type1)

    def resolve_type1(self, info):
        return range(20000)  # 20,000 results

schema = Schema(query=Query)
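
For contrast, the DataLoader contract requires batch_load_fn to resolve to one value per key, in the same order. A minimal sketch of a conforming loader (the str(key) value is just a stand-in; the reproduction below keeps the broken version):

class FixedType2Loader(DataLoader):
    def batch_load_fn(self, keys):
        # One value per key, same order, so the Promise resolves
        # to a list exactly as long as `keys`.
        return Promise.resolve([str(key) for key in keys])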

Now, if you run this:

query = '''
{
  type1 {
    id
    type2 {
      id
    }
  }
}
'''
schema.execute(query).to_dict()
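
If you want to see the scale of the problem without triggering the full blow-up, you can skip to_dict and inspect the raw errors first; a quick sketch against the schema above:

result = schema.execute(query)
# One error per failed type2 field, each carrying the same huge
# DataLoader message with every key and value embedded in it.
print(len(result.errors))
print(len(str(result.errors[0])))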

GraphQL calls format_error for every error inside to_dict. Most of the time that is fine, but note that DataLoader embeds all of the keys and values in its error message, so with a very long list each error message looks like this:

DataLoader must be constructed with a function which accepts Array<key> and returns Promise<Array<value>>, but the function did not return a Promise of an Array of the same length as the Array of keys.

Keys:
[0, 1, 2, 3, 4, 5, all the keys, 20000]

Values:
[(0, '0'), (1, '1'), all the values, [10000, '10000']]

Now imagine to_dict formatting that message 20,000 times to fill the errors list: it can consume more than 10 GB of memory. That should not happen, even though the trigger is incorrect user code.
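
Until that is fixed, a possible workaround is to deduplicate the messages yourself instead of calling to_dict; a rough sketch, not an official API:

result = schema.execute(query)
if result.errors:
    # Every failed field repeats the same oversized message, so
    # collapse duplicates and truncate before reporting.
    for message in {str(error) for error in result.errors}:
        print(message[:200])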