opengeospatial / ogcapi-processes

Home Page: https://ogcapi.ogc.org/processes


Central job submission endpoint (also: `POST /processes/{processID}/execution` is not RESTful and is confusing for workflows)

m-mohr opened this issue

The standard says:

> The standard specifies a processing interface to communicate over a RESTful protocol

In REST, everything should be centered around resources. The endpoint POST /processes/{processID}/execution is not a resource, and POSTing to it should create e.g. /processes/{processID}/execution/{executionId}, but it doesn't. Instead, it may for example create /jobs/{jobId} (in async mode) or return a result directly (in sync mode).

To create a job, asynchronous processing requests should be sent to POST /jobs. This would also remove the weirdness that, for workflows, you need to pick a "main process" to work underneath.

Single processes could also be sent there as a workflow that consists of just one processing node.
If only async requests are sent to this endpoint, the issues with the Prefer header would also be solved: #413
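As a rough sketch of what that could look like (the body schema is not settled in this issue; the process URI, inputs, and job identifier below are invented for illustration):

```http
POST /jobs HTTP/1.1
Content-Type: application/json

{
  "process": "https://example.com/processes/echo",
  "inputs": { "message": "hello" }
}

HTTP/1.1 201 Created
Location: /jobs/81cca5e9
Content-Type: application/json

{ "jobID": "81cca5e9", "status": "accepted", "type": "process" }
```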

Synchronous processing for a single process could still be sent to POST /processes/{processID}/execution, but it would be weird to send a workflow to that endpoint too. So maybe it should be a separate top-level endpoint?

PS: This came up in some discussions with @fmigneault and @aljacob recently so posting it here for discussion.

I will repeat my answer from the Testbed meeting, just for the sake of sharing it openly with everyone.

POST /processes/{processID}/execution was introduced (to my understanding) because an OAP implementation is allowed to omit the creation of a job. This is relevant, notably, if the server decides to support only the sync execution mode, where a job resource is not necessary since the results are obtained directly (though one could still be created for later reference if desired).

Given that no job would be created in this case (which is technically the default/minimal requirement of OAP), the inverse non-RESTful argument arises if POST /processes/{processID}/jobs were used, since no job entity is created and a 200 is returned. The usual way to avoid this ambiguity in REST is to replace the resource term with an action/verb, hence execution (arguably, a better choice could have been execute?), to indicate that an operation rather than a resource is "created".
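For context, a minimal Part 1 synchronous exchange (process and values invented; the exact response shape depends on the requested outputs and response format) returns the result directly, with no job resource implied:

```http
POST /processes/echo/execution HTTP/1.1
Content-Type: application/json

{ "inputs": { "message": "hello" } }

HTTP/1.1 200 OK
Content-Type: application/json

"hello"
```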

Note that I agree /processes/{processID}/jobs would be better, since my implementation supports both sync/async and creates a job for reference in both cases anyway, but I understand the reasoning behind the execution counterpart. Since it is not much overhead, and for the sake of backward compatibility, my server handles both paths interchangeably.

I think POST /jobs makes sense as well (especially for alignment with openEO and potentially for submitting an ad-hoc workflow). It makes sense to add a POST definition for this path since the path is already available, and it would (in the async case at least) create the corresponding resource. However, I don't think this resolves the RESTful-convention issue for sync, which would still not require a job resource to be created.

I think sync/async and job creation imply opposite ways of thinking about this, and neither side will be fully satisfied with either approach. My preference is to reduce the number of endpoints doing the same thing, even if that might feel odd for one approach or the other. That being said, I would most probably support either way regardless, for historical/compatibility reasons.

If a particular "root" process really does not make sense for some workflow definition (although there are workarounds for that, like a simple process gathering outputs from multiple processes, whether as separate outputs, as an array of outputs, or as an actual combining operation like merging multiple bands into a single GeoTIFF), then we could probably agree on some other end-point at which to execute a workflow in an ad-hoc manner. For pre-deployment of workflows, Part 2 should still be used (POST a new workflow definition at /processes to create a new process).
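To make that workaround concrete, here is a hypothetical chain in the Part 3 nested-process style, where an invented merge-bands root process gathers the outputs of two nested processes (all process URIs and inputs are made up for illustration):

```http
POST /processes/merge-bands/execution HTTP/1.1
Content-Type: application/json

{
  "inputs": {
    "band1": {
      "process": "https://example.com/processes/ndvi",
      "inputs": { "collection": "sentinel-2" }
    },
    "band2": {
      "process": "https://example.com/processes/evi",
      "inputs": { "collection": "sentinel-2" }
    }
  }
}
```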

Whether using /jobs for this purpose makes openEO integration easier or harder probably depends on the #420 discussion, in terms of whether it conflicts with existing capabilities or ends up working exactly the same as the current functionality.

During the SWG meeting on 2024-07-08, I introduced the idea of defining a conformance class in OGC API - Processes - Part 3: Workflows to add POST on /jobs with an execute request that would define a "workflow". When I say "workflow" here, I mean a JSON object that conforms to the execute-workflows.yaml schema, i.e. a process chain (an execute request with a root process).

With Part 1, it stays the same:

POST /processes/{processId}/execution
execute request conforming to execute.yaml

The response contains a JSON object conforming to statusInfo.yaml and a header with the location of the created job (/jobs/{jobId}).
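Concretely (identifiers invented), the Part 1 asynchronous exchange looks like this, mirroring the shape proposed above for POST /jobs but anchored under the process:

```http
POST /processes/echo/execution HTTP/1.1
Prefer: respond-async
Content-Type: application/json

{ "inputs": { "message": "hello" } }

HTTP/1.1 201 Created
Location: /jobs/81cca5e9
Content-Type: application/json

{ "jobID": "81cca5e9", "status": "accepted", "type": "process" }
```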

With Part 3, you would also be able to use the following:

POST /jobs
execute request conforming to execute.yaml (adding a "process" attribute pointing to the process to execute)

Here, there are options for what happens.

  1. The behavior can be the same as with the execute endpoint (POST /processes/{processId}/execution), and the execution starts right away. There is no real interest in adding such an end-point if it offers no capability other than the one defined in Part 1.
  2. Another option would be to return a JSON conforming to statusInfo.yaml containing a new jobid and status=accepted. A Location header can also be included to point to the created /jobs/{jobId}. But no execution occurs at that time; you only ask for a job instantiation.

Using the second option, we may then imagine using POST on /jobs/{jobId}/execution to effectively start the execution of the "prepared job" (I initially wanted to reuse the existing /jobs/{jobId}/results path, but it conflicts with the currently available end-point). From there, the behavior remains the same as for a standard execution.
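Under that second option, the two-step flow could look like this (identifiers and status values are illustrative only; nothing here is final):

```http
POST /jobs HTTP/1.1
Content-Type: application/json

{ "process": "https://example.com/processes/echo", "inputs": { "message": "hi" } }

HTTP/1.1 201 Created
Location: /jobs/81cca5e9

{ "jobID": "81cca5e9", "status": "accepted", "type": "process" }

POST /jobs/81cca5e9/execution HTTP/1.1

HTTP/1.1 200 OK

{ "jobID": "81cca5e9", "status": "running", "type": "process" }
```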

I think adding this modification to the Part 3 draft specification would help align OGC API - Processes with openEO.

If there is interest in adding this to the Part 3 draft, I volunteer to start the writing and to work on a PR for this addition for further discussion.

@gfenoy

> With Part 3, you would also be able to use the following:
> POST /jobs

Ideally, this should be handled in Part 1 as well.
There is no reason to limit this capability to workflows only.

> [...] the execution starts right away. There is no real interest in adding such an end-point if it offers no capability other than the one defined in Part 1.

Even if the execution starts right away, there is an advantage: it allows the definition of a graph (rather than a chain) that does not have a single root process, which POST /processes/{processId}/execution requires through its processId reference.

The workaround is to define a process corresponding to that graph and invoke it as the root process, but this requires Part 2 support to deploy it.

> Another option would be to return a JSON conforming to statusInfo.yaml containing a new jobid and status=accepted. A Location header can also be included to point to the created /jobs/{jobId}. But no execution occurs at that time; you only ask for a job instantiation.

Note that status=accepted will still not behave like openEO. Although status=accepted is available, OAP does not require POSTing again to "start" the job. Accepted only means that the request was received. By the time the next GET status request is made, the job could already be started, completed, or still queued, depending on server availability. In openEO's case, it remains pending until the "start" request is sent.

I believe that a POST /jobs returning statuses similar to what POST /processes/{processId}/execution returns, but with a different meaning regarding how the job is actually started, will be confusing and cause inconsistent implementations. Maybe a distinct status=pending (or created, ...) should be considered to allow the POST /jobs/{jobId}/execution strategy.

If this is the strategy taken, I would also like a way (query or body parameter?) to indicate whether the job should wait (status=pending) or is allowed to run immediately (status=accepted), to avoid having to do two requests each time we want to submit the execution right away.
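Purely as an illustration of that idea (the parameter name start is invented here and does not exist in any specification; a body member would work equally well):

```http
# hypothetical: create the job but leave it waiting (status=pending)
POST /jobs?start=false HTTP/1.1

# hypothetical: create the job and let it run immediately (status=accepted)
POST /jobs?start=true HTTP/1.1
```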

> Ideally, this should be handled in Part 1 as well. There is no reason to limit this capability to workflows only.

I agree with that point, but the schema that defines the process key is currently only available in the Part 3 draft specification. If we move it to the core, I don't see any objection.


I support the new pending (prepared or created) and queued status ideas, and the additional parameter letting the client application choose between the execution modes (wait or run). I think openEO uses different endpoints for the two cases: /results for executing immediately, and /jobs/{jobId}/results to execute asynchronously.

Since we already define GET on /jobs/{jobId}/results to access the execution results, I don't have any objection to adding support for POST on this same path for executing prepared jobs, and to adding support for POST on /results with the same request body for synchronous execution: we don't have anything defined for /results yet, and before the Part 3 execute-workflow.yaml schema was available there was no way to send a process chain. Still, I don't understand why the word "results" is used in this path when what we are discussing here is execution.

I thought returning a statusInfo makes sense, as it corresponds to how we get information regarding a job, and we POST on /jobs to create it. In this statusInfo, one link with rel=http://www.opengis.net/def/rel/ogc/1.0/execute would point to /jobs/{jobId}/execution (in openEO, it is /jobs/{jobId}/results), and using POST on this path would create an entity like /jobs/{jobId}/execution/{runId} (which can be reduced to /jobs/{jobId}). I don't think this runId is required: a job is traditionally a mutable entity, so I don't see any issue with reusing the same jobId to follow the execution progress.
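A statusInfo body following that idea might look like this (jobID, title, and paths are invented; the link target uses the /jobs/{jobId}/execution variant named above):

```http
HTTP/1.1 201 Created
Location: /jobs/81cca5e9
Content-Type: application/json

{
  "jobID": "81cca5e9",
  "status": "created",
  "type": "process",
  "links": [
    {
      "rel": "http://www.opengis.net/def/rel/ogc/1.0/execute",
      "href": "/jobs/81cca5e9/execution",
      "title": "Start the execution of this prepared job"
    }
  ]
}
```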

To summarize, this would result in the addition of the following endpoints (an end-to-end sketch follows the list):

  • POST on /results => synchronous execution of a process chain (body: execute request conforming to execute-workflow.yaml)
  • POST on /jobs => statusInfo with status=created and Location header set to /jobs/{jobId} (same body as before)
  • GET on /jobs/{jobId} => statusInfo (up-to-date)
  • POST on /jobs/{jobId}/results => statusInfo with status=queued or status=running (empty body request, I think)
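Putting those pieces together, a hypothetical asynchronous round trip under this summary (all identifiers and status values invented) could read:

```http
POST /jobs HTTP/1.1
Content-Type: application/json

{ "process": "https://example.com/processes/echo", "inputs": { "message": "hi" } }

HTTP/1.1 201 Created
Location: /jobs/81cca5e9

{ "jobID": "81cca5e9", "status": "created", "type": "process" }

POST /jobs/81cca5e9/results HTTP/1.1

HTTP/1.1 200 OK

{ "jobID": "81cca5e9", "status": "queued", "type": "process" }

GET /jobs/81cca5e9 HTTP/1.1

HTTP/1.1 200 OK

{ "jobID": "81cca5e9", "status": "successful", "type": "process" }
```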

I would prefer to change from /results to /execution.

If there is interest in moving in that direction, I would be happy to volunteer to start a PR with the required updates in Part 1 or Part 3, depending on what we decide about the execute-workflow.yaml schema (moving it to core).

@gfenoy @fmigneault STOP! We were almost ready to submit Part 1 and Part 2 to the OAB for review and then RFC. Now we are proposing to change or add a completely NEW way to execute processes. Sorry, but this is not something we can just throw into Part 1 and Part 3 without (a) a lot of discussion in the SWG and (b) at least some testing in a code sprint or two.

My preference would be to leave Part 1 as-is except for bug fixes / clarifications of ambiguity, maintaining what so far was a mostly backward compatible path from 1.0.

We can have the discussion about new end-points and/or new methods at existing end-points for Part 3: Workflows, since it's still at a relatively early stage compared to the Part 1 revision.

I second the sentiment of @pvretano and @jerstlouis. We should continue with Part 1 as-is and address these suggestions in Part 3 or as a future work item. I'm happy to have a conversation about it at the next SWG meeting - but I will need a lot of convincing at this point.

@pvretano
I want to highlight that I am NOT proposing to integrate those changes into Part 1 and Part 2 before their current state is accepted on its own. I want a prior Part 1/2 release as well!

I would even be fine with defining an entirely separate "Part 4" for POST /jobs/{jobId} on its own, which could be reused by Part 1 Core Processes, Part 2 Deployed Processes, and Part 3 Workflow Processes. I see this as a new, alternate capability that can build on top of existing items without breaking existing Part 1/2/3. It SHOULD NOT replace the Part 1 job execution strategy. Users should be allowed to opt in to this alternate execution method or not.

That being said, I think it is worthwhile to have these discussions early on, since TB20 is working on this. It does not mean it has to be integrated yet, or at all. However, not discussing these issues now will lead to bifurcating implementations, and to more problems down the road when we try to "realign" them.

@pvretano, @sptillma, @jerstlouis, @fmigneault, I agree with all of you. I'm as eager as you are to see the new releases of Part 1 and Part 2.

I want to clarify that initially, I only mentioned the integration within the Part 3 draft.

I am also open to @fmigneault's proposal to start a new Part 4 for OGC API - Processes.