opengeospatial / ogcapi-processes

Home Page:https://ogcapi.ogc.org/processes

Processing in OGC API

cportele opened this issue

Overview

Since June, progress on the draft of the WPS REST profile / OGC API Processes has slowed because of a discussion about how processing resources should best be published and described in a Web API. One option is represented in the current draft, which is a straightforward mapping of WPS 2.0 to a generic Web API. An alternative approach that has been proposed is to specify API building blocks for (geospatial) processing in a Web API. In general, both approaches could also co-exist.

At the WPS/Workflow meeting this week I offered to summarize the alternative approach in order to have something more concrete to discuss. This is what this issue is about.

The main ideas are:

  • There is no need for fixed path structures and keywords; processing resources can in principle be included at various locations in the path structure of the API.
  • There is no need for the resources 'process collection' (currently: /processes) and 'process description' (currently: /processes/{processId}); the 'jobs' (currently: /processes/{processId}/jobs) and 'job result' (currently: /processes/{processId}/jobs/{jobId}/results) resources are sufficient. Whether a separate 'job status' resource (currently: /processes/{processId}/jobs/{jobId}) is required is up for discussion.
  • We can distinguish the following patterns for processing resources in a Web API that is described by OpenAPI:
    • Asynchronous execution: The input is in the payload of a POST request. The result is a 201 response with a Location header pointing to the new result resource.
    • Asynchronous execution, callback option: As above, but with a callback to a webhook after completion of the processing. The URI of the webhook must be part of the POST request, either in the payload or in a query parameter.
    • Synchronous execution, POST method: The input is in the payload of the request. The result is a 200 response with the output in the payload of the response.
    • Synchronous execution, GET method: The input is in query parameters or in path parameters of sub-resources; i.e., this option is mainly suited for simpler input.
  • An example of the first three patterns is the /routes resource in the Open Routing API Pilot, which computes new routes. These patterns are non-exclusive and an API may support all of them (a GET request on the jobs resource will return the list of jobs). An example of the last pattern is the 'map' resources in the OGC API Maps draft, which create map images. The inputs are parameters like the bounding box of the map or the style to use.
  • In all patterns the processing resource may receive all input from the client request or the processing resource may implicitly operate on other resources of the API (e.g. a feature collection of a dataset or a complete feature dataset, an elevation model, a data cube, etc.).

The following sections use OpenAPI snippets to explain the building blocks used in each pattern. For simplicity, error responses have been omitted.

Pattern: Asynchronous execution

Here is an example from the Open Routing API Pilot. The main variables besides the path are the schema of the request body of the 'jobs' resource (here: /routes) and the schema of the response of the 'job result' resource (here: /routes/{routeId}). In this case, the schemas are the JSON schemas 'routeDefinition' and 'route', which are part of the OpenAPI definition. A schema could also be referenced by an absolute URI pointing to an external schema resource.

  '/routes':
    get:
      ... get a list of routes
    post:
      summary: compute a route
      requestBody:
        description: The definition of the route to compute.
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/routeDefinition'
      responses:
        '201':
          description: The route has been created and the route is being computed.
          headers:
            Location:
              schema:
                type: string
              description: URI of the new resource.
  '/routes/{routeId}':
    get:
      summary: fetch a route
      parameters:
        - $ref: '#/components/parameters/routeId'
      responses:
        '200':
          description: The route.
          content:
            application/geo+json:
              schema:
                $ref: '#/components/schemas/route'
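
For discussion, here is a minimal sketch of how a client might handle the response in this pattern. It is not taken from the pilot: the function name `result_url` and the example URLs are made up for illustration, and it only assumes what the snippet above states, namely a 201 response with a Location header pointing to the new result resource.

```python
from urllib.parse import urljoin


def result_url(request_url: str, status: int, headers: dict) -> str:
    """Return the absolute URI of the new result resource after an async POST.

    Assumes the server answers with 201 and a Location header, as in the
    asynchronous execution pattern. The function name is hypothetical.
    """
    if status != 201:
        raise ValueError(f"expected 201 Created, got {status}")
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    location = lowered.get("location")
    if location is None:
        raise ValueError("201 response without a Location header")
    # Location may be a relative reference; resolve it against the request URL.
    return urljoin(request_url, location)


# Hypothetical server and route identifier.
print(result_url("https://example.org/api/routes", 201, {"Location": "/api/routes/42"}))
```

The client would then send GET requests to the returned URI to fetch the route, i.e., the 'job result' resource.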

Pattern: Asynchronous execution, callback option

This pattern is identical to the previous pattern, except that it adds a callback to a webhook. In the example, the webhook URI is passed in a "subscriber" member of the request. The callback sends the GeoJSON route as specified in the JSON schema 'route' (the 'job result').

  '/routes':
    post:
      summary: compute a route
      requestBody:
        ... see above
      responses:
        ... see above
      callbacks:
        calculationCompleted:
          '{$request.body#/subscriber}':
            post:
              requestBody:
                content:
                  application/geo+json:
                    schema:
                      $ref: '#/components/schemas/route'
              responses:
                '202':
                  description: Route received successfully
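
To make the callback key more concrete, the following sketch shows how a server could resolve the runtime expression '{$request.body#/subscriber}' against the request payload. This is an illustration under assumptions, not how the pilot implemented it: the function name is hypothetical, only expressions of the form `{$request.body#/...}` are handled, and the webhook URL in the usage line is invented.

```python
import re

# Matches an OpenAPI runtime expression of the form {$request.body#/json/pointer}.
_BODY_EXPRESSION = re.compile(r"^\{\$request\.body#(?P<pointer>(/.*)?)\}$")


def resolve_callback_url(expression: str, body: dict) -> str:
    """Evaluate a '{$request.body#/...}' callback expression against a JSON body.

    Hypothetical helper; other kinds of runtime expressions are rejected.
    """
    match = _BODY_EXPRESSION.match(expression)
    if match is None:
        raise ValueError(f"unsupported callback expression: {expression!r}")
    value = body
    # Walk the JSON Pointer (RFC 6901) one reference token at a time.
    for token in match.group("pointer").split("/")[1:]:
        token = token.replace("~1", "/").replace("~0", "~")
        value = value[int(token)] if isinstance(value, list) else value[token]
    if not isinstance(value, str):
        raise ValueError("callback target is not a string")
    return value


# Hypothetical request payload with the "subscriber" member from the example.
payload = {"subscriber": "https://client.example/hooks/route-ready"}
print(resolve_callback_url("{$request.body#/subscriber}", payload))
```

After the route has been computed, the server would POST the GeoJSON route to the resolved URI.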

Pattern: Synchronous execution, POST method

In a synchronous execution, the result is returned directly in the response. No new resource is created on the server. Again, the main variables in the pattern are the schemas of the input and output.

  '/routes':
    post:
      summary: compute a route
      requestBody:
        description: The definition of the route to compute.
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/routeDefinition'
      responses:
        '200':
          description: The response is the route.
          content:
            application/geo+json:
              schema:
                $ref: '#/components/schemas/route'

Pattern: Synchronous execution, GET method

The following is a slightly modified/simplified example from the OGC API Maps draft. The input is provided in path and query parameters.

  '/collections/{collectionId}/map/{styleId}':
    get:
      summary: fetch a map from a collection
      parameters:
        - $ref: '#/components/parameters/collectionId'
        - $ref: '#/components/parameters/styleId'
        - $ref: '#/components/parameters/bbox'
        - $ref: '#/components/parameters/datetime'
        - $ref: '#/components/parameters/width'
        - $ref: '#/components/parameters/height'
        - $ref: '#/components/parameters/transparent'
        - $ref: '#/components/parameters/bgcolor'
      responses:
        '200':
          description: The map image.
          content:
            image/jpeg:
              schema:
                type: string
                format: binary
            image/png:
              schema:
                type: string
                format: binary
            image/svg+xml:
              schema:
                type: string

Describing the input to a process

The examples use JSON Schema to describe the inputs and outputs. With OpenAPI 3.1, the OpenAPI specification will support the latest JSON Schema draft 2019-09.

JSON Schema is by now also a fairly complex specification that supports a lot of edge cases. Requiring the capability to parse all kinds of JSON schemas would set a high bar for generic processing clients. This could be avoided by specifying a conformance class that defines a JSON Schema profile that is easier to parse. This is similar to the approach in levels 0 and 1 of the GML Simple Feature profile. The current process description schemas could be a starting point to determine the scope of such a conformance class.

That is, existing WPS implementations would be well suited as backends to such OGC API resources - enabling the rapid development of processing resources that follow the OGC API guidance.
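
To give an idea of what a simple JSON Schema profile could mean in practice, here is a sketch of a check that a schema only uses keywords from a restricted set. The keyword list is purely hypothetical and only meant to illustrate the idea; the actual scope of such a conformance class would have to be discussed.

```python
# Hypothetical profile: the only keywords a "simple" schema may use.
ALLOWED_KEYWORDS = {
    "type", "properties", "required", "items", "enum",
    "format", "description", "minimum", "maximum",
}


def profile_violations(schema: dict, path: str = "#") -> list:
    """Return the locations of all keywords that are outside the profile."""
    violations = []
    for keyword, value in schema.items():
        if keyword not in ALLOWED_KEYWORDS:
            violations.append(f"{path}/{keyword}")
            continue
        # Recurse into the sub-schemas that the profile allows.
        if keyword == "properties":
            for name, subschema in value.items():
                violations += profile_violations(subschema, f"{path}/properties/{name}")
        elif keyword == "items" and isinstance(value, dict):
            violations += profile_violations(value, f"{path}/items")
    return violations


# Invented input schema; 'oneOf' is not part of the hypothetical profile.
schema = {
    "type": "object",
    "properties": {
        "waypoints": {"type": "array", "items": {"type": "string"}},
        "preference": {"oneOf": [{"type": "string"}]},
    },
}
print(profile_violations(schema))
```

A generic client that only supports the profile could use such a check to decide whether it is able to render an input form for a given processing resource.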

Conformance declaration

To support clients that are aware of the processing concepts, we could extend the conformance declaration to explicitly declare the jobs resources supported by an API. Here is an example, for discussion, of what this could look like:

{
  "conformsTo": [
    "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/core",
    "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/input-json-schema-simple",
    "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/oas31",
    "..."
  ],
  "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/core": {
    "jobs": [
        {
            "path": "/routes",
            "patterns": [ "sync-post", "async-post-webhook" ]
        },
        {
            "path": "/collections/{collectionId}/map/{styleId}",
            "patterns": [ "sync-get" ]
        }
    ]
  }
}
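
A client that is aware of the processing concepts could use such a declaration to find the resources that support a given pattern. The following sketch assumes the structure proposed above, which is itself only up for discussion; the function name is hypothetical.

```python
CORE = "http://www.opengis.net/spec/ogcapi-processes-1/1.0/conf/core"


def paths_supporting(declaration: dict, pattern: str) -> list:
    """List the paths of the jobs resources that declare the given pattern.

    Assumes the extended conformance declaration sketched in this issue.
    """
    # Without the core conformance class the 'jobs' member has no defined meaning.
    if CORE not in declaration.get("conformsTo", []):
        return []
    jobs = declaration.get(CORE, {}).get("jobs", [])
    return [job["path"] for job in jobs if pattern in job.get("patterns", [])]


# Reduced version of the declaration above.
declaration = {
    "conformsTo": [CORE],
    CORE: {
        "jobs": [
            {"path": "/routes", "patterns": ["sync-post", "async-post-webhook"]},
        ]
    },
}
print(paths_supporting(declaration, "sync-post"))
```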

We could also add other aspects like the input/output schemas, but for now this has not been done as it duplicates information from the API definition.

Known open issues

  • The description above assumes that the status of an asynchronous processing job is reflected in the result resource. This is how it was done in the Open Routing API Pilot, but this may not be practical in other cases. One option could be to support a status resource with information about the progress, e.g., at /routes/{routeId}/status; accessing an unfinished or unsuccessful job could then redirect to that resource.
  • To support more complex notification mechanisms, something other than webhooks will be required. For example, using AsyncAPI or maybe MQTT directly.
  • Another aspect that is currently outside of the scope of standard Web APIs described by OpenAPI is the use of WebSockets or HTTP/2.

Re: Known open issues ... we could use a hypermedia control rel="monitor" to monitor the status of a resource. This could be in the header or in a response body.

A lot has happened since this issue was created. We will close it. Please check the current spec and feel free to create new specific issues.