18F / omb-eregs

A tool to find, read, and maintain White House Office of Management and Budget (OMB) policy requirements

Home Page: https://policy-beta.cio.gov/

PUT API error responses don't provide context about node/content context

toolness opened this issue

Our error responses now give enough detail to technically pinpoint the location of an error, but it still takes some work to figure out what the error actually is. For instance, leaving the href out of an external_link results in the following error message from the AKN XML editor as of #971:

In an element starting at line 4:
* href - This field is required.

On its own, this isn't very helpful because we lack the context of the type of element that lacks the href attribute. Hopefully, we can visit line 4 to figure out what that is--but what if there are multiple elements on that line?

It would make the error easier to understand if we provided a bit more context about where the error occurred, without forcing the client to consult their own document to figure it out. One easy way to accomplish this might be by providing a bit more metadata about the node_type or content_type that the error came from, which would allow the above error to mention that it's an <external_link> element.

There are two fairly straightforward ways I can think of to provide this information:

  • At the XML parsing level. Like _sourceline, we could populate the objects we pass to the deserializer with a _sourcetag property that contains the name of the tag that the information comes from. An advantage here is that if we ever deviate from our mapping of tag names to node/content type, this won't be a problem, because we'll always hang on to the actual tag name that the information came from. Also, we won't have to validate very much, since we know from lxml that the tag name is a non-empty string.

  • At the deserialization level. At the point that we populate an error dict with _sourceline, we could also populate it with e.g. _node_type/_content_type, depending on the type of data it is. The advantage is that this information would populate error responses for JSON payloads as well as XML payloads. It'd take a bit more validation, though, because we'd have to check that node_type and content_type are actually present and are strings.
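
To make the second option a bit more concrete, here is a minimal sketch. It assumes errors are plain dicts mapping field names to lists of messages, and that the incoming data carries `_sourceline` alongside `node_type`/`content_type`. The helper names `annotate_error` and `describe_error` are hypothetical and not taken from the codebase, and the message wording is only an approximation of what the editor shows.

```python
def annotate_error(errors, data):
    """Return a copy of `errors` with location/type hints taken from `data`."""
    annotated = dict(errors)
    if isinstance(data.get('_sourceline'), int):
        annotated['_sourceline'] = data['_sourceline']
    # node_type/content_type come from the client payload, so only echo
    # them back when they are present and are non-empty strings.
    for key in ('node_type', 'content_type'):
        value = data.get(key)
        if isinstance(value, str) and value:
            annotated['_' + key] = value
    return annotated


def describe_error(annotated):
    """Render an annotated error dict as a human-readable message."""
    kind = annotated.get('_content_type') or annotated.get('_node_type')
    subject = f'the <{kind}> element' if kind else 'an element'
    line = annotated.get('_sourceline')
    where = f' starting at line {line}' if line is not None else ''
    lines = [f'In {subject}{where}:']
    for field, messages in annotated.items():
        if field.startswith('_'):
            continue  # skip the metadata keys added above
        for message in messages:
            lines.append(f'* {field} - {message}')
    return '\n'.join(lines)


# Hypothetical payload for an external_link that is missing its href.
data = {'content_type': 'external_link', '_sourceline': 4}
errors = {'href': ['This field is required.']}
print(describe_error(annotate_error(errors, data)))
```

Because the hints are attached where the error dict is built, the same code path would serve JSON and XML payloads. The cost is the extra type checking shown in the loop.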

@cmc333333 do you have a preference which approach we use? Or is there a better approach I haven't considered?

I think most of this should wait on #907, the effort to implement document schema validation for nodes. There, we'll need to come up with a way to describe the location of validation errors in a general language that applies to both the XML and ProseMirror editors. I suspect that same method will apply to the tests around footnote uniqueness.

"Content" elements are a little more rigid due to their distinct database models, but that rigidity (in the form of separate serializers) might give us a path forward. Do we have access to the serializer when creating the error? If so, we can add its CONTENT_TYPE field to the error message.
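
If the serializer is reachable at the point where the error is created, attaching its CONTENT_TYPE might look roughly like the following. This is a framework-free sketch: `ExternalLinkSerializer`, its `validate` method, the `REQUIRED_FIELDS` tuple, and `ContentError` are stand-ins invented for illustration. Only the `CONTENT_TYPE` attribute name comes from the discussion above, and the real serializers may be structured quite differently.

```python
class ContentError(Exception):
    """Hypothetical validation error that carries a dict of details."""

    def __init__(self, detail):
        super().__init__(detail)
        self.detail = detail


class ExternalLinkSerializer:
    """Stand-in for a per-content-type serializer."""

    CONTENT_TYPE = 'external_link'
    REQUIRED_FIELDS = ('href',)  # assumption, for illustration only

    def validate(self, data):
        errors = {
            field: ['This field is required.']
            for field in self.REQUIRED_FIELDS
            if field not in data
        }
        if errors:
            # The serializer knows which content type it handles, so it
            # can label the error without inspecting the payload.
            errors['_content_type'] = self.CONTENT_TYPE
            raise ContentError(errors)
        return data


try:
    ExternalLinkSerializer().validate({'text': 'a link with no href'})
except ContentError as exc:
    print(exc.detail)
```

One appeal of this route is that the type label comes from the serializer class rather than from client-supplied data, so it needs no extra validation.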