graphql-python / graphql-core-legacy

GraphQL base implementation for Python (legacy version – see graphql-core for the current one)

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Option to not catch exceptions

reverie opened this issue · comments

Please add an option to not catch all exceptions. It makes things needlessly difficult to debug, and contravenes the Zen of Python.

It would also be sufficient to document that this is occuring, because workarounds exist:

graphql-python/graphql-core#41

But you did already mention that here:

https://github.com/graphql-python/graphql-core/issues/238

So I +1 you'd there since it seems like a better solution to me. I'm not exactly sure why the core is opinionated about catching all exceptions, but I imagine there must be a good reason even if it was a historical one that no longer applies.

Can you add some code to demonstrate the exact problem? I would like to check if the same problem exists in graphql-core/modern branch or graphql-core-next or if this is specific to graphql-core.

@Cito Ok, I made a demo project

https://github.com/reverie/graphq_core_demo

Screen Shot 2019-04-04 at 5 02 15 PM
Where is this error happening?

@reverie Thanks. But you are using Graphene and Graphene-Django in this demo. The error catching could also happen at these levels, not down at graphql-core. We need an example that uses only graphql-core.

Also, your example shows that the error itself is caught. What you seem to be missing is the traceback.

So I took your example down to the graphql-core level:

from graphql import (
  GraphQLObjectType, GraphQLField, GraphQLString, GraphQLSchema, graphql)


def a():
  return b()

def b():
  return c()

def c():
  return d()

def d():
  return a + 3


def make_foo(*_args):
  return a()


schema = GraphQLSchema(
  query= GraphQLObjectType(
    name='RootQueryType',
    fields={
      'foo': GraphQLField(
        GraphQLString,
        resolver=make_foo
      )
    }
  )
)

result = graphql(schema, '{foo}')

raise result.errors[0]

What I observe is that the error is printed even twice, with the correct traceback showing funcion d() as causing the error . The traceback is printed twice because the error is raised and graphql-core also logs it to the standard output. I also checked with graphql-core-next, which does not log but also returns the proper traceback attached to the error.

So actually graphql-core isn't suppressing any error information. The issue is probably on the level of graphene-django. I would not expect it to return the traceback in the response, revealing that much server side info to any client would be a security issue. Showing just the error message here is the right thing. But it should show the traceback in the server log.

Ok, now I really got to the bottom of it. The above code uses a synchronous resolver. Let's change this to return a promise, like this:

def make_foo(*_args):
  return Promise(lambda resolve, reject: resolve(a()))

Now suddenly the traceback is not printed any more. And this is exactly what happens in graphene-django, because the django debug middleware returns promises.

In the case of promises, the error is not raised directly and we need the stack attribute of the error to see the traceback, which is not set properly in GraphQLLocatedError. This line is the problem:

    stack = original_error and getattr(original_error, "stack", None)

Normal Python errors like the one raised by the resolver do not have a stack attribute. We must use the __traceback__ attribute instead:

    stack = original_error and (
        getattr(original_error, "stack", None) or
        getattr(original_error, '__traceback__', None))

After modifying GraphQLLocatedError like this, the error traceback is now shown properly in the Django server log.

There's just one caveat here: The __traceback__ attribute is only available in Python 3. It would be much more difficult if not impossible to implement in Python 2. That's sad because graphql-core is supposed to work with Python 2 as well, but we probably need to accept it as a known bug.

@reverie can you check if #240 solves your problem? After applying the patch, you should see the location of the error in the console where you start your Django server.

@Cito Thanks for all your work on this! I tried the patch in #240 and it's a big improvement. I think it would help a lot of people if it were merged. That change does not directly solve this issue (#237), though: I don't want graphql-core to catch the exception in the first place. (Why does it do that?)

I don't want graphql-core to catch the exception in the first place. (Why does it do that?)

Because we don't want to halt the whole GraphQL server just because one user sent a failing query.

I don't want graphql-core to catch the exception in the first place. (Why does it do that?)

Because we don't want to halt the whole GraphQL server just because one user sent a failing query.

That makes sense! From Django's perspective graphql-core is just a library, though, and it would be helpful to support that perspective as well.

The idea of catching the errors and returning them as a list is pretty much baked in GraphQL per the specs (see here for example). That's why it is implemented that way in GraphQL.js which graphql-core replicates.

Note that it is possible to re-raise the errors that are returned by GraphQL with the same stacktrace if you really want that.

How about two modes. During development let errors optionally break out of graphene. This will allow normal usage of the debugger. At the moment, I have to place a debug break point inside the graphene codebase in order to get a live stack on exceptions.

The other option would be to log errors via standard python logging module, before returning the error list to the user. This would work seamlessly with any framework like Django. And it also lets end users hook up custom logging like Sentry, without having to special case graphene.

The logging happens already through the logging module. What you see on the console are actually log messages that you can send anywhere you want, configurable via the Django configuration.

If you really want to re-raise errors, then you need to do this on the level of Graphene-Django, because anything that Graphene throws is caught by Graphene-Django anway and wrapped into a ExecutionResult in execute_graphql_request(). One idea would be to use a custom GraphQLView that has a format_error method that throws or does some extra logging or filtering of error messages. Every error message is seen by that method.

@Cito That makes sense. Thank you for your help with this!

Related: #41, #202, #209 (Just to keep track of issues since I've came to them inbound on google)

I'm still a bit overwhelmed by all the information.

It's difficult to narrow down the chain of responsibility (edit: In regards to exception handling, logging, promises, etc). It'd be nice to see documentation/recipes that are authoritative and up to date that we could cite as a best practice, which we could in turn take downstream to graphene -> graphene-django/flask-graphql, etc.

We're using graphene-django/flask-graphql and it can be tricky to figure out how to properly get errors to propagate (we'd like to see them 1. Sent to sentry if production, so we'd want the original exception, and 2. Some sort of traceback to stdout/stderr)

Edit: I mean this with all due respect to staying in scope (to this project), I think there's a here-and-now issue that it's really easy to blur / distinguish graphql-core / graphene / flask and django frontends. At least that's just me 😆

I don't want graphql-core to catch the exception in the first place. (Why does it do that?)

Because we don't want to halt the whole GraphQL server just because one user sent a failing query.

I don't agree that this is a valid need - an exception in a Django view doesn't halt the whole Django server. Instead the server replies with HTTP 500 and continues handling requests. Error handling is explicitly managed by Django.

In cases where graphql-core is being used with other frameworks, I would expect it to behave as other contributing libraries, handling only the graphql-related tasks and leaving things like error-handling to the package designed for it.

Having a contributing package like graphql-core take over all exception handling may be by design, but it is unexpected by developers and undesirable when working with popular frameworks.

I'd be delighted to see a way to switch error-handling in graphql-core off.

Instead the server replies with HTTP 500 and continues handling requests.

In GraphQL the server still sends a HTTP 200 with the error as part of the execution result in JSON format. This is only unexpected for developers who are accustomed to REST, but the expected behavior for GraphQL.

handling only the graphql-related tasks and leaving things like error-handling to the package designed for it

Again. error handling is a GraphQL-related task per the specs.

And the goal of graphql-core is to just implement the GraphQL specs, replicating what GraphQL.js does.

@Cito, all makes sense. I'm suggesting that a developer should be allowed to choose what the graphql-core library get's tasked with, especially what gets sent back to the user. In other python web frameworks, the convention is that no error information reaches the user - and developers rely on this to avoid leaking sensitive data.

I'm advocating for a configuration option. Using this library with Django/Flask/etc. breaks the expectations of those frameworks - I'd like to see flexibility in the other direction (breaking the expectations of graphql-core) as well.

Thanks for the quick response. Cheers.

@jayhal But you do have full flexibility already. GraphQL-core is just a library that lets you build schemas and run queries against the schema, returning results and an errors list according to the spec - what you do with these results and errors is totally up to you or the server component you're using. You can raise the errors, or send them to the client (in development mode), or log them on the server with unique ids and send only the ids (in production) or do whatevery you want.

With other words, what you are talking about is something to be handled on levels higher up, the transport layer, e.g. in graphql-server-core, or even in flask-graphql or graphql-django, but not in GraphQL-core.

Btw, it's a misconception that no error information should reach the user. E.g. when creating a new entry in a database, the user should not get a "something broke" error message, but a more helpful one like "system currently overloaded, try again later" or "duplicate name" or "missing referenced entry". And in order to produce such error messages, the server needs to send a certain amount of error information to the client. Of course, in production it should only send high-level error descriptions, not the internal details or tracebacks.

Just read over @Cito's work in https://github.com/graphql-python/graphql-core/pull/240 and am curious if there's anything the community can do to help see this merged into master and released?

@ProjectCheshire I see you reviewed that pr already, are there more helpful folks like yourself kicking around on the GraphQL-Core team who can help this thing move along?

I think Cito's patch is now returning too much info to the client. I don't want them getting details about my database ids and constraints.

It seems like the right solution for my use case would be to write a graphene middleware. Graphene middleware can look at the response before it goes out. What I'd like to do is check for errors on the response, and if they're present, email myself a stack trace, then return a neutered error response to the client. Has anyone done something similar?

@reverie GraphQL-Core is at the lowest level. Filtering and dispatching of errors and stack traces should happen at higher levels like Graphene-Django, and depending on whether you're running in production or development mode. Regarding the error messages, note that GraphQL-Core-Next uses a special inspect() method instead of repr() for error messages which reveals less internal information and is more robust and secure and using less memory. E.g. a list of 1000 entries would not be fully serialized in the error message, but abbreviated. This is one of the recent improvements in GraphQL.js which has been ported to GraphQL-Core-Next. Graphene 3 will use GraphQL-Core-Next and get these more robust error messages.

@reverie agreed. The behavior of all errors being serialized and passed to the client is an undesirable default.

It breaks other except statements in higher-level libraries as well and requires dedicated error handling since exceptions get added to the response as an attribute.

I'd really like to see a way to select what errors are handled in the on_rejected function, otherwise don't catch them at such a low level and let exception handling continue.

I think I've finally cracked this. The solution looks like the one in this blog post: https://medium.com/@martin.samami/make-graphene-django-and-sentry-exceptions-work-together-f796be60a901

Like Cito said, what you want is to create a custom GraphQLView. In my case, we're first logging the exception, then, for security reasons, removing all exception details from response to the client.

Closing this since current development happens in version 3 and this is the repository vor version 2.