How to protect an individual process hosted with OGC API Processes?

securedimensions opened this issue a month ago

Overview

An individual process hosted in an OGC API Processes deployment shall be protected with an OpenAPI security scheme, for example an API key. The client must send the HTTP header X-API-Key, which is then processed by the individual process.

Note

The requirement is not to protect the entire OGC API Processes deployment.

Question

How would I describe the requirement of an API-Key header in the process description? At the moment, there seems to be no option in the current Standard to describe the requirement for a particular HTTP header such as X-API-Key.

Tom Kralidis · Answer 1 · Thu Jun 20 2024 00:25:48 GMT+0800 (China Standard Time)

cc @pvretano @fmigneault

It depends on whether this information should be in the OpenAPI document, the process description (i.e. /processes/{processId}), or both.

OpenAPI security objects can be defined at multiple levels. In the use case above, an OpenAPI document can define a security object. The example below requires an API key only for process execution via POST (see the security object). In OpenAPI 3.0 the operation-level security field is a list of requirements that refer to a scheme by name, so the my_api_key scheme itself (type apiKey, name api_key, in header) is defined once under components/securitySchemes of the same document:

    "/processes/pygeometa-metadata-validate/execution": {
      "post": {
        "security": [
          {
            "my_api_key": []
          }
        ],
        "description": "Validate metadata from a pygeometa metadata control file (MCF)",
        "operationId": "executePygeometa-metadata-validateJob",
        "requestBody": {
          "content": {
            "application/json": {
              "example": {
                "inputs": {
                  "mcf": {
                    "mcf": {
                      "version": "1.0"
                    }
                  }
                }
              },
              "schema": {
                "$ref": "https://schemas.opengis.net/ogcapi/processes/part1/1.0/openapi/schemas/execute.yaml"
              }
            }
          },
          "description": "Mandatory execute request JSON",
          "required": true
        },
        "responses": {
          "200": {
            "$ref": "#/components/responses/200"
          },
          "201": {
            "$ref": "https://schemas.opengis.net/ogcapi/processes/part1/1.0/openapi/responses/ExecuteAsync.yaml"
          },
          "404": {
            "$ref": "https://schemas.opengis.net/ogcapi/processes/part1/1.0/openapi/responses/NotFound.yaml"
          },
          "500": {
            "$ref": "https://schemas.opengis.net/ogcapi/processes/part1/1.0/openapi/responses/ServerError.yaml"
          },
          "default": {
            "$ref": "#/components/responses/default"
          }
        },
        "summary": "Process pygeometa metadata control file (MCF) validation execution",
        "tags": [
          "pygeometa-metadata-validate"
        ]
      }
    },

@securedimensions would this work? Or are you thinking to describe security in /processes/{processId} itself?

Given this is natively supported in OpenAPI by design, at multiple levels (from server wide to path/operation specific), I would prefer delegating this to OpenAPI.
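
To make the "multiple levels" point concrete, here is a minimal Python sketch, not a definitive implementation. It models an OpenAPI 3.0 document as a dict, defines the scheme once under components/securitySchemes, and resolves which request headers one operation demands. The header name X-API-Key is taken from the question; the /processes/public-echo path and the helper names effective_security and required_headers are assumptions made for illustration.

```python
from typing import Any

# Illustrative OpenAPI 3.0 document; "public-echo" is a made-up process.
openapi_doc: dict[str, Any] = {
    "openapi": "3.0.3",
    "info": {"title": "Example deployment", "version": "0.1"},
    "components": {
        "securitySchemes": {
            # Defined once; operations refer to the scheme by name.
            "my_api_key": {"type": "apiKey", "name": "X-API-Key", "in": "header"}
        }
    },
    "paths": {
        "/processes/pygeometa-metadata-validate/execution": {
            "post": {
                "security": [{"my_api_key": []}],
                "responses": {"200": {"description": "OK"}},
            }
        },
        "/processes/public-echo/execution": {
            "post": {"responses": {"200": {"description": "OK"}}}
        },
    },
}


def effective_security(doc: dict[str, Any], path: str, method: str) -> list:
    """Return the security requirements that apply to one operation.

    An operation-level "security" list overrides the document-level one.
    """
    operation = doc["paths"][path][method.lower()]
    if "security" in operation:
        return operation["security"]
    return doc.get("security", [])


def required_headers(doc: dict[str, Any], path: str, method: str) -> list[str]:
    """List the header names demanded by apiKey schemes for an operation."""
    schemes = doc.get("components", {}).get("securitySchemes", {})
    names = []
    for requirement in effective_security(doc, path, method):
        for scheme_name in requirement:
            scheme = schemes[scheme_name]
            if scheme["type"] == "apiKey" and scheme["in"] == "header":
                names.append(scheme["name"])
    return names
```

With this document, only the protected execution path reports a required header; the other path reports none, which is the per-process granularity asked for in the question.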

Jerome St-Louis · Answer 2 · Thu Jun 20 2024 01:40:09 GMT+0800 (China Standard Time)

Ideally, it should be possible to see in the process description that these request headers are needed, so that OGC API - Processes clients that do not parse the OpenAPI definition can easily find the information.

Also, if the API definition uses a single {processId} parameter rather than a distinct path per process, it is not possible to describe the need for different request headers for each process.

Note that there's also a proposal to define an OpenAPI process description, which would fit very well with this (but does not remove the need to support this in a process description).

For Part 3, there is also the need to pass these authorization headers in execution requests across workflows to nested processes as @gfenoy highlighted before.

I would suggest that we consider this authentication header capability in the process description for 1.1/2.0 or a future extension.

Francis Charette-Migneault · Answer 3 · Thu Jun 20 2024 01:40:44 GMT+0800 (China Standard Time)

I believe that when an HTTP request is sent to /processes/{processId} (or any other endpoint) without the required HTTP authentication, the response should include a WWW-Authenticate header with the relevant challenges for accessing the process.

The JSON body of the process description does not need to include anything. Using security in OpenAPI implies that this header is used, as described by https://swagger.io/docs/specification/authentication/, https://www.iana.org/assignments/http-authschemes/http-authschemes.xhtml and all underlying RFCs.
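
As a rough server-side sketch of that behaviour, the function below answers a request for a protected process with 401 and a WWW-Authenticate header when the key is missing or wrong. Everything specific here is an assumption: the PROTECTED_PROCESSES table, the "teapot" identifier, the key value, and in particular the challenge string, since the thread does not say which challenge value a server should send for API keys.

```python
from typing import Mapping

# Hypothetical table of protected processes and their accepted keys.
PROTECTED_PROCESSES: dict[str, set[str]] = {"teapot": {"secret-key-1"}}


def authorize(process_id: str, request_headers: Mapping[str, str]) -> tuple[int, dict[str, str]]:
    """Return (status, response_headers) for a request to one process."""
    accepted = PROTECTED_PROCESSES.get(process_id)
    if accepted is None:
        # Unprotected process: nothing to check.
        return 200, {}
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    if lowered.get("x-api-key") in accepted:
        return 200, {}
    # Placeholder challenge value (assumption, not taken from any standard).
    return 401, {"WWW-Authenticate": 'ApiKey realm="teapot"'}
```

A production service would also compare keys in constant time and would more likely delegate this check to a separate component; the sketch only shows the shape of the 401 response.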

Andreas Matheus · Answer 4 · Thu Jun 20 2024 05:33:04 GMT+0800 (China Standard Time)

@tomkralidis Any deployed process is described via an HTML page or a JSON response. Here is an example of a process that requires an API-Key to be executed: Trusted Tea Pot Process. In that response, I can select "describe process as JSON". In that JSON response, I would expect a definition stating that this particular service needs an X-API-Key header, but I could not find any option in the Standard to do that. I also cannot figure out (maybe due to lack of knowledge) how to actually fetch the OpenAPI description for this one service, which could include the security constraint.

Hope you understand.

What am I missing?

Tom Kralidis · Answer 5 · Thu Jun 20 2024 05:52:33 GMT+0800 (China Standard Time)

@securedimensions this then becomes an addition to the process description response. OpenAPI already has a mechanism to express this. I guess we could consider re-using the OpenAPI security object at the process description level as an option.

Francis Charette-Migneault · Answer 6 · Thu Jun 20 2024 07:47:04 GMT+0800 (China Standard Time)

@securedimensions
When you send the request to obtain that process description, if the operation is unauthorized, your response should include the WWW-Authenticate header as described here (see under section "401 Response"):

https://swagger.io/docs/specification/authentication/api-keys/

That example also shows how the corresponding token header is specified on the OpenAPI interface.

Requesting the process description is in itself the way to find out which tokens apply to that endpoint.

Gérald Fenoy · Answer 7 · Fri Jun 21 2024 01:30:33 GMT+0800 (China Standard Time)

If I understand correctly, the goal is to secure access to the execution endpoint for a given set of processes. I agree with using the OpenAPI security section directly in such a situation.

If we don't consider the Part 2 draft specification, this solution is an option. If we do, we will have a dynamic /api endpoint that changes over time so that the mutable processes can be considered (deployed, removed).

However, it seems important to address one question: how do the other endpoints function?

Let's consider the endpoint /jobs/{jobId}, where {jobId} represents a job created using a process that requires the X-API-Key. Does this endpoint also require the key to be accessed? My understanding is that it does. The same applies to /jobs/{jobId}/results and also to /jobs: why should others even be informed of a given jobId's existence if they did not create it and cannot access its result?

Also, how can the client application specify, at deployment time, whether a process should be secured? If we add the security information to the process description, as @securedimensions asked and as proposed by @jerstlouis and @tomkralidis, then this part is solved by embedding the information in the process description of the OGC Application Package. So, I am in favor of adding this as well.

In the described use case, only the execute endpoint should be secured, but would it be possible to specify which paths should be secured for every deployed process? Indeed, we may expect /processes/{processId} to be protected the same way (using the X-API-Key), as well as /processes itself (which would then list accessible processes only).

Finally, imagine a use case where we support the Part 2 draft specification. We still want a secured execution endpoint for some processes (e.g., /processes/bob/execution) and a non-secured /api endpoint. This means that /api should include all the endpoints associated with every process (potentially stored in a dedicated namespace associated with the authenticated user or their group) and expose them in a single place (the OpenAPI exposed by the Server Instance). Maybe having a dedicated OpenAPI per namespace would make sense, meaning a secured /api endpoint listing only accessible endpoints. But that leads again to a dynamic OpenAPI definition. So, the idea of an OpenAPI exposed per process is back and should be investigated, but again, should this /api endpoint be secured?

In the following prototype Server Instance, the goal was to give each user a dedicated namespace with access only to the resources they created. We handled security by securing the following paths (meaning all endpoints defined on them, whatever the request method):

/processes
/processes/{processId}
/processes/{processId}/execution
/jobs
/jobs/{jobId}
/jobs/{jobId}/results

As you can deploy processes (using POST on /processes), we decided security is required at this endpoint to ensure you cannot deploy processes if you are not allowed to do so. Since new processes can be created from this endpoint, we thought it would also make sense to secure access to the /processes/{processId} and /processes/{processId}/execution paths. As this last endpoint (POST method) is secured and creates new jobs, we applied the same security scheme to the other /jobs* endpoints, ensuring that only jobs created by a given user are accessible to that user.
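
The path list and the per-user rule described above could be sketched as follows. This is an illustration under assumptions rather than the prototype's actual code: the helper names is_protected and visible_jobs, and the "owner" field on job records, are hypothetical.

```python
import re

# The secured path templates listed above.
PROTECTED_TEMPLATES = [
    "/processes",
    "/processes/{processId}",
    "/processes/{processId}/execution",
    "/jobs",
    "/jobs/{jobId}",
    "/jobs/{jobId}/results",
]


def template_to_regex(template: str) -> re.Pattern:
    """Turn a path template into a regex; each {name} matches one segment."""
    literal_parts = re.split(r"\{[^/{}]+\}", template)
    return re.compile("[^/]+".join(re.escape(part) for part in literal_parts))


PROTECTED_PATTERNS = [template_to_regex(t) for t in PROTECTED_TEMPLATES]


def is_protected(path: str) -> bool:
    """True if the request path falls under one of the secured templates."""
    return any(pattern.fullmatch(path) for pattern in PROTECTED_PATTERNS)


def visible_jobs(jobs: list[dict], user: str) -> list[dict]:
    """Keep only the jobs created by the given user (hypothetical 'owner' field)."""
    return [job for job in jobs if job["owner"] == user]
```

Here /api stays outside the protected set, while /jobs lists only the caller's own jobs, which matches the namespace-per-user behaviour described in the text.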

The current security implementation limits the visibility of processes to other users. If we add this security definition to the process description (GET on /processes/{processId}), it would help us overcome this limitation.

Francis Charette-Migneault · Answer 8 · Fri Jun 21 2024 08:36:35 GMT+0800 (China Standard Time)

I think a process deployment should NOT include anything to do with authentication, since it might not even be the role of the OGC API - Processes service to be the policy enforcement point. It is in fact much more common to have a separate service (IAM on AWS, for example, or Keycloak as seen in previous OGC efforts) handle all authentication and authorization. Adding or modifying roles and permissions for an API operation or resource should be done through entirely separate requests to those services dedicated to the complex auth operations. This is also how we handle this problem on CRIM's servers, where every endpoint can be controlled for each user/group/path combination without any impact on the OGC API - Processes implementation (or any other service, for that matter). Every dynamically created process and job is also protected this way on a per-user basis on our servers, by dynamically creating the auth roles for the dynamically created resources.

I strongly believe that everything is already in place in HTTP to indicate whether an authentication detail must be provided. Instead of requesting the OpenAPI document, parsing it all, and trying to detect a specific auth scheme on a given endpoint, one can simply send an HTTP HEAD/GET request to the desired endpoint and receive the 401 status and WWW-Authenticate header information relevant to that endpoint. Doing a GET /processes/{processId} makes sense, as one can retrieve both the JSON process description and the HTTP auth details at the same time, with a single request (even better than separately requesting the OpenAPI as well). HEAD can be used if you only want to obtain the authentication details without the contents.
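
A client-side sketch of this probing approach, using only the Python standard library, might look like the following. The function names interpret and probe are assumptions made for illustration, and the challenge value is returned as an opaque string rather than parsed.

```python
import urllib.error
import urllib.request
from typing import Mapping, Optional


def interpret(status: int, headers: Mapping[str, str]) -> dict:
    """Summarize what a response says about authentication."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    challenge: Optional[str] = lowered.get("www-authenticate")
    if status == 401:
        return {"auth_required": True, "challenge": challenge}
    return {"auth_required": False, "challenge": None}


def probe(url: str, method: str = "HEAD") -> dict:
    """Send an unauthenticated request and report the auth challenge, if any."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return interpret(response.status, response.headers)
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the error object still carries the headers.
        return interpret(error.code, error.headers)
```

Calling probe on a process description URL with method="GET" instead of "HEAD" would return the same summary; a fuller client would also keep the response body in that case.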

There are already over a dozen separate standards for handling various auth schemes (see https://www.iana.org/assignments/http-authschemes/http-authschemes.xhtml). We should definitely avoid replicating that logic through custom definitions in OGC API - Processes. Note also that "patching" authentication into processes would require the same kind of patching across all OGC APIs to make them work together. Using native HTTP methodology, nothing needs to be adjusted.