azavea / franklin

A STAC/OGC API Features Web Service

Home Page:https://azavea.github.io/franklin/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Several problems with STAC API Item Search (/search) datetime parameter validation

philvarner opened this issue · comments

Describe the bug

There are several RFC3339 datetime types that are not accepted by Franklin, and a few that are accepted but should not be.

  1. datetimes with non-Z timezone offsets, e.g., 1996-12-19T16:39:57-08:00
  2. datetime intervals with a blank start or end time, e.g. /1985-04-12T23:20:50.52Z or 1985-04-12T23:20:50.52Z/ (either empty string `` or .. is acceptable for the open ended start or end time)
  3. datetime with a period after the seconds but no fractional seconds is accepted e.g. 1985-04-12T23:20:50.Z

All results:

Search with datetime=1996-12-19T16:39:57-00:00 returned status code 400
Search with datetime=1996-12-19T16:39:57+00:00 returned status code 400
Search with datetime=1996-12-19T16:39:57-08:00 returned status code 400
Search with datetime=1996-12-19T16:39:57+08:00 returned status code 400
Search with datetime=/1985-04-12T23:20:50.52Z returned status code 400
Search with datetime=1985-04-12T23:20:50.52Z/ returned status code 400
Search with datetime=1985-04-12T23:20:50.52+01:00/1986-04-12T23:20:50.52Z+01:00 returned status code 400
Search with datetime=1985-04-12T23:20:50.52-01:00/1986-04-12T23:20:50.52Z-01:00 returned status code 400
Search with datetime=1937-01-01T12:00:27.87+01:00 returned status code 400
Search with datetime=1937-01-01T12:00:27.8710+01:00 returned status code 400
Search with datetime=1937-01-01T12:00:27.8+01:00 returned status code 400
Search with datetime=2020-07-23T00:00:00.000+03:00 returned status code 400
Search with datetime=2020-07-23T00:00:00+03:00 returned status code 400
Search with datetime=1985-04-12T23:20:50.Z returned status code 200 instead of 400

Expected behavior

Valid RFC3339 datetimes are accepted, and invalid ones error with a 400 status code.

Additional context
Add any other context about the problem here.

Related: #729 😟

At a minimum, this should be consistent with stac4s. stac4s defines an RFC3339 formatter for parsing datetimes. Franklin doesn't use it.

Instead, there's a custom parser that relies on Instant.parse.

I don't know that that will fix all the problems, but the examples above should be a sufficient unit test.

Removed the comma-separated fractional seconds example, as that's not a valid RFC3339 string (only allowed in ISO8601)

@philvarner is it correct that every 400 above should have been a 200?

Some of these values I think should fail to parse. For example, what should a time parser make of 1986-04-12T23:20:50.52Z+01:00, which includes both a Z for UTC and an offset?

Some of these values I think should fail to parse. For example, what should a time parser make of 1986-04-12T23:20:50.52Z+01:00, which includes both a Z for UTC and an offset?

You're right, that was a typo in my test.

@philvarner is it correct that every 400 above should have been a 200?

The only one that was accepted that shouldn't have been was

Search with datetime=1985-04-12T23:20:50.Z returned status code 200 instead of 400

Instant.parse from java.time likes that one, and so does the rfc3339 formatter we wrote in stac4s 🤔 I'm not really sure how to navigate https://datatracker.ietf.org/doc/html/rfc3339 to figure out what excludes that format, do you have a link?

time-secfrac is optional, but if it does exist, it has to be a . and 1 or more digits.

 time-secfrac    = "." 1*DIGIT
   partial-time    = time-hour ":" time-minute ":" time-second
                     [time-secfrac]

from section 5.6.

I'm not sure we can reject those -- our formatter in stac4s isn't doing anything too weird, just taking the built in ISO local formatter and making offsets a bit more flexible (and to be clear Instant.parse also likes the weird string).

// A more flexible alternative to DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS[xxx][xx][X]")
// https://tools.ietf.org/html/rfc3339
// Warning: This formatter is good only for parsing
val RFC3339formatter =
  new DateTimeFormatterBuilder()
    .append(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
    .optionalStart()
    .appendOffset("+HH:MM", "+00:00")
    .optionalEnd()
    .optionalStart()
    .appendOffset("+HHMM", "+0000")
    .optionalEnd()
    .optionalStart()
    .appendOffset("+HH", "Z")
    .optionalEnd()
    .toFormatter()

Some comparisons --

  • dateutil.parser.parse in python hates it:
In [3]: parse("2021-01-01T00:00:00.Z")
---------------------------------------------------------------------------
ParserError                               Traceback (most recent call last)
<ipython-input-3-88fed15e4415> in <module>
----> 1 parse("2021-01-01T00:00:00.Z")
...
  • java.time.Instant thinks it's cool:
@ Instant.parse("2021-01-01T00:00:00.Z") 
res1: Instant = 2021-01-01T00:00:00Z
  • javascript's Date hates it (node 14.17 for me):
> new Date("2021-01-01T00:00:00.Z")
Invalid Date
  • R (4.x for me) doesn't seem to have a bare parsing function that doesn't require a format (I tried strptime in base and parse_date_time in lubridate)
  • I don't know how to test dotnet / anything else that people use for STAC stuff

The good news is that this means other clients are really unlikely to produce dates in this format, since everyone but Java agrees it's no good. I'm inclined to leave this since parsing all the valid strings is more important for client interop than rejecting a format no one seems to want to create.

I'm inclined to make the acceptance of datetimes that are not RFC 3339 a warning rather than an error. We have other cases too, like cmr-stac (I think?) that accept dates and treats them like a range for an entire day.