opengeospatial / ogcapi-processes

Home Page:https://ogcapi.ogc.org/processes

outputs seem to be ambiguous

m-mohr opened this issue · comments

I'm writing a web UI for OGC API - Processes and I'm struggling with the outputs of processes.

Some assumptions

  • An input for me is something where a user has a choice or can provide something to a process that usually changes the output in some way. What the user can input is described in the schema.
  • An output for me is what a process gives back and what is returned is described by the schema.

I compare this with math where f(x) = x*2 and f(2) returns 4. Are these assumptions true for OGC API - Processes?

Looking at some examples in the wild:

This is confusing about them:

  1. In the RenderMap process there is an output and it defines that various media types can be returned. There's no parameter though that can influence the media type. The spec says content negotiation can be used, but the relation between content negotiation and the return type is sometimes there and sometimes isn't (see echo).
  2. In the echo process it's not clear whether the process returns 3 values or whether it's one output that can be any of the given output schemas. I'd assume it's one output with a choice of 3 types in RenderMap and 3 outputs, each with one type, in echo. But then I'm also wondering: if I execute echo synchronously, how do I get the 3 types back in one response?

Overall, it looks like for RenderMap I need to render an additional select field for the file format in the UI, but for the echo process I don't. How can I distinguish that?

In the web UI I currently render an input for each output.
That works for Render Map:
(screenshot)

But not very well for echo (which partially might also be a limitation in the UI):
(screenshot)

I compare this with math where f(x) = x*2 and f(2) returns 4. Are these assumptions true for OGC API - Processes?

Yes they are. Inputs and outputs are very much like the arguments and return value of a function, respectively.

The spec says content negotiation can be used, but the relation between content negotiation and the return type is sometimes there and sometimes isn't (see echo).

In version 1.0, there is no header-based content negotiation. That is the main topic of #217, which triggered the discussion of a 1.1 version to address this. The output format must be selected when executing the process, using the "format" key in the "outputs" section of the execution request. It is possible for an output to support only a single format. Is that what you mean?

In the echo process it's not clear whether the process returns 3 values or whether it's one output that can be any of the given output schemas.

This is related to issue #60. As I understand the EchoProcess (we have our own Echo process, which is also certified as passing the conformance test), the ETS should expect that all outputs implemented by the server are always returned, regardless of which inputs are specified. The ETS currently does not work this way, as I documented in opengeospatial/ets-ogcapi-processes10#60. (This assumes that the ETS requires an EchoProcess; I would much prefer if it were able to test useful processes by putting together an execution request based on the process description and/or an example request.) The inputs are NOT a way to select which outputs the client would like to get back. That is the purpose of the "outputs" section of the execution request, but there is some confusion about this in the ETS, which seems to expect that it will only get an output for each input it specifies, and this expectation might also apply to the GeoLabs implementation. Is that what you are referring to?

The way I usually handle this is: if the process output has only one supported format, there is no need for any output format selection when submitting the execution request (though it could still be specified explicitly). If there is more than one supported format, the desired one should be provided with this property of the output:

format:
  $ref: "format.yaml"

If no format is specified in the execution request, I use the one that is marked as the default in the process description schema of the output. If no default is indicated in the process description, and none were provided in the execution request, the execution fails with a "missing required output format" just like if a required input was missing.
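
The fallback order described in this comment can be sketched roughly as follows. This is only an illustration of the approach as described, not code from any implementation: `resolve_output_format`, `MissingOutputFormat`, and the representation of formats as a list of media type strings are hypothetical choices made for the example.

```python
from typing import Optional, Sequence


class MissingOutputFormat(ValueError):
    """Raised when no output format can be determined (hypothetical name)."""


def resolve_output_format(
    supported: Sequence[str],
    requested: Optional[str] = None,
    default: Optional[str] = None,
) -> str:
    """Pick the media type for one output, in the order described above."""
    if requested is not None:
        # An explicit choice in the execution request always wins.
        if requested not in supported:
            raise ValueError(f"unsupported output format: {requested}")
        return requested
    if len(supported) == 1:
        # Only one format exists, so no selection is needed.
        return supported[0]
    if default is not None:
        # Fall back to the default given in the process description.
        return default
    # Nothing requested and no default: treat it like a missing required input.
    raise MissingOutputFormat("missing required output format")
```

A web form could call a function like this before submitting, so that the "missing required output format" case is caught on the client side as well.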

Note that if you are using the 1.0 specification, the outputs can also indicate transmissionMode = value | reference (https://schemas.opengis.net/ogcapi/processes/part1/1.0/openapi/schemas/output.yaml) when submitting the execution request. This is what explicitly controlled whether echo's output:b is returned as a plain string {"value": "<data>"} or as a reference link {"href": "<url>"}. That could be another selector in the form for outputs.
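
As a rough sketch of what a form could submit under 1.0, the snippet below assembles the "outputs" section of an execution request with a "format" and a "transmissionMode" per output. The helper name `output_spec` and the input/output identifiers (`stringInput`, `stringOutput`, `imageOutput`) are made up for the example; only the keys "format", "mediaType" and "transmissionMode" and the values "value" and "reference" come from the 1.0 schemas.

```python
import json
from typing import Any, Dict, Optional

# The two transmission modes defined by the 1.0 schema.
TRANSMISSION_MODES = ("value", "reference")


def output_spec(
    media_type: Optional[str] = None,
    transmission_mode: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the entry for one output in the "outputs" section (hypothetical helper)."""
    spec: Dict[str, Any] = {}
    if media_type is not None:
        spec["format"] = {"mediaType": media_type}
    if transmission_mode is not None:
        if transmission_mode not in TRANSMISSION_MODES:
            raise ValueError(f"unknown transmission mode: {transmission_mode}")
        spec["transmissionMode"] = transmission_mode
    return spec


# Identifiers below are illustrative only, not taken from a real process.
request_body = {
    "inputs": {"stringInput": "hello"},
    "outputs": {
        "stringOutput": output_spec(transmission_mode="value"),
        "imageOutput": output_spec("image/png", "reference"),
    },
}

print(json.dumps(request_body, indent=2))
```

In a form, each output row would then carry two optional selectors: one for the format and one for value versus reference.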

With 1.1/2.0, however, this value/reference choice applies to all outputs at once, and is instead controlled using the Prefer: return=minimal | representation header. IMO, this newer specification uses a much less convenient approach than 1.0 did, because one cannot provide different return preferences simultaneously for distinct outputs at execution time. They need to be requested one by one after execution using the /jobs/{job-id}/results/{outputID} endpoint. Therefore, your web form would only be able to request everything by-value or everything by-reference, but not both simultaneously. Also, the server is technically allowed to ignore Prefer (as per that HTTP header specification) and return by value/reference as it deems more appropriate, so that is something else to watch for...

@fmigneault

If no default is indicated in the process description, and none were provided in the execution request, the execution fails with a "missing required output format" just like if a required input was missing.

That is not correct. Requirement 51 C states:

The first sub-schema in the oneOf array SHALL be considered the default format.
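
For a client, this rule could be applied along the following lines. This is a sketch under the assumption that the supported formats of an output appear as a top-level oneOf in its schema, with a schema that has no oneOf treated as a single format; the function names and the example schema are hypothetical.

```python
from typing import Any, Dict, List


def format_alternatives(schema: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the format alternatives of an output schema, default first."""
    alternatives = schema.get("oneOf")
    if isinstance(alternatives, list) and alternatives:
        return alternatives
    # No oneOf: the schema itself is the only (and thus default) format.
    return [schema]


def default_format(schema: Dict[str, Any]) -> Dict[str, Any]:
    """The first sub-schema in the oneOf array is considered the default."""
    return format_alternatives(schema)[0]


def needs_format_selector(schema: Dict[str, Any]) -> bool:
    """A UI only needs a format selector when there is more than one alternative."""
    return len(format_alternatives(schema)) > 1


# Illustrative output schema, not copied from a real process description.
image_output_schema = {
    "oneOf": [
        {"type": "string", "contentEncoding": "binary", "contentMediaType": "image/png"},
        {"type": "string", "contentEncoding": "binary", "contentMediaType": "image/jpeg"},
    ]
}

print(default_format(image_output_schema)["contentMediaType"])
print(needs_format_selector({"type": "string"}))
```

With this reading, a form can preselect the first alternative and never has to fail for a missing output format.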

Regarding:

Therefore, your web form would only be able to request everything by-value or everything by-reference, but not both simultaneously.

1.1/2.0 introduces the separate /results/{resultId} endpoints, which make the by-reference / by-value distinction obsolete anyway.
Clients can directly access the single output they're interested in, one at a time, avoiding the whole reference vs. value issue altogether, since a single output is always returned by value.

If clients still want to retrieve all results together from /results, negotiating application/json, the Prefer: return header (or omitting it) lets the server decide what is best. The server likely has a much better idea than the client about the content that is going to be returned, so it is actually in a better position to make that decision.

The 1.1/2.0 preference allows one of the following (I have not verified whether this is stated correctly in the latest draft):

  • (no return preference) let the server decide, with a recommendation that large outputs (anything that would require base64-encoding i.e., all binary files, as well as all large JSON objects like feature collections or features with large geometry) are returned by reference, everything else is by value
  • representation: return everything by value, so that the response is self-contained
  • minimal: return everything by reference except simple types (e.g., strings, numbers, booleans, arrays of strings/numbers/booleans with 0-10 elements, single-level (no object or array member) objects with 0-4 elements)
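
To make the "simple types" idea for the minimal preference concrete, here is a small sketch of how such a check might look. The thresholds are taken from the examples in the list above, which are themselves unverified against the draft, so both the limits and the function name `is_simple_value` should be read as assumptions.

```python
from typing import Any

# Limits taken from the examples above; they are not confirmed by the draft.
MAX_ARRAY_ITEMS = 10
MAX_OBJECT_MEMBERS = 4


def _is_scalar(value: Any) -> bool:
    """Strings, numbers and booleans count as scalars in this sketch."""
    return isinstance(value, (str, int, float, bool))


def is_simple_value(value: Any) -> bool:
    """Guess whether a value would still be returned by value under "minimal"."""
    if _is_scalar(value):
        return True
    if isinstance(value, list):
        # Short arrays of scalars only.
        return len(value) <= MAX_ARRAY_ITEMS and all(_is_scalar(v) for v in value)
    if isinstance(value, dict):
        # Small single-level objects: no nested object or array members.
        return len(value) <= MAX_OBJECT_MEMBERS and all(
            _is_scalar(v) for v in value.values()
        )
    # Anything else (including null) is not treated as simple here.
    return False
```

Since the server makes the final decision, a client would use a check like this only to anticipate the response, not to rely on it.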

In any case, clients need to be prepared to receive each output either by value or by reference if using the /results application/json with 1.1/2.0 (as per how the Prefer: return header works). If they don't want to handle that, they can simply always use /results/{resultId} which will always be by value.
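
A client-side sketch of "being prepared for both" might look like this. It assumes the shapes mentioned earlier in the thread: a link object with an "href" member for by-reference outputs, and either a {"value": ...} wrapper or the bare value for by-value outputs. The function name and the sample document are hypothetical, and a JSON object output that itself has an "href" member would be misclassified by this simple check.

```python
from typing import Any, Dict, Tuple


def classify_result(entry: Any) -> Tuple[str, Any]:
    """Return ("reference", url) or ("value", data) for one entry of the results document."""
    if isinstance(entry, dict):
        if "href" in entry:
            # By reference: the data has to be fetched from the link.
            return ("reference", entry["href"])
        if "value" in entry:
            # By value, wrapped together with optional format information.
            return ("value", entry["value"])
    # By value, given inline without a wrapper.
    return ("value", entry)


# Illustrative results document with made-up output identifiers and URL.
results: Dict[str, Any] = {
    "stringOutput": "hello",
    "wrappedOutput": {"value": "aGVsbG8=", "mediaType": "text/plain"},
    "imageOutput": {"href": "https://example.org/jobs/123/results/imageOutput"},
}

for output_id, entry in results.items():
    kind, payload = classify_result(entry)
    print(output_id, kind, payload)
```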

The "reference" approach would normally link to the new /results/{resultId} end-point when linking to outputs by reference.

cc. @pvretano is that in agreement with the latest?
I would also suggest changing the term "mixed type input". It is misleading. Perhaps "alternative" or "multiple supported types" instead?

multiple supported formats or multiple supported types would make sense.
It would align with the terminology of the predecessor, WPS, which used:

<Supported>
  <Format>
    <MimeType>...</MimeType>
  </Format>
</Supported>

13-NOV-2023: @pvretano to review the issue and prepare a PR. Need to emphasize again that the schema in the input and output descriptions is the schema for a single instance of the input/output. In Table 11 and Table 12, for N outputs, need to put a star by the application/json media type and mention that, if the server supports it, the client can negotiate packaged output formats (e.g. ZIP and multipart MIME). This needs to be added for backward compatibility.