Go: `gDay` cannot be parsed as `time.Time`
BatmanAoD opened this issue · comments
Description
The XSD gDay
type is not a timestamp but a recurring calendar day. Valid values given as examples in RELAX NG by Eric van der Vlist are:
---01
, ---01Z
, ---01+02:00
, ---01-04:00
, ---15
, and ---31
To reproduce the issue:
Generate schema for xsd
file with gDay
type (such as test/xsd/base64.xsd
)
Describe the results you received:
The Go type used in the generated schema to deserialize gDay
elements is time.Time
Describe the results you expected:
The Go type used can be deserialized from a valid gDay
string, such as the examples above
I've been running into a related issue today. I have a generated struct with a pretty common
LastUpdated time.Time
But the problem is that the official doc for xsd:dateTime describes the format as inspired by iso 8601 while go unmarshals time.Time to rfc3339. Since the two formats overlap but aren't the same, I ran into an issue where I was not lucky enough to have the time parsing work. In my case, the parsing failed with an unsurprising:
parsing time "2021-09-24T12:04:09.69" as "2006-01-02T15:04:05Z07:00": cannot parse "" as "Z07:00"
One solution would be to define a Time
type alias and have that type define the xml marshaling / unmarshaling interfaces while supporting the various formats that satisfy the xsd:dateTime specification. This wouldn't work with the other datetime related types like gDay
, however. Here's an attempt without the full list of formats that should be included. I also included a quicky SetPreferredFormat
to the generated code to allow me to specify the format to use when marshaling but after getting this to work, I ended up with something that I'm not fully confident would work in 100% of the cases but worked well enough for me.
Here's what I have which could be generated statically when an xsd file have structs with Time fields:
var preferredTimeFormat = time.RFC3339
// SetPreferredXMLTimeFormat sets the preferred time format (see https://pkg.go.dev/time) for parsing time for this
// package's XML structs. This is not a great API but it's meant as a workaround for the inability to specify
// time format through the encoding/xml API. This function is meant to be invoked during initialization depending on an
// application's needs. Note that the preferred format is the first one tried when unmarshaling and the only one used
// when marshaling. If not set, the default is time.RFC3339.
func SetPreferredXMLTimeFormat(format string) {
preferredTimeFormat = format
}
// Time is a type for parsing time using the xsd date time format
type Time time.Time
func (t *Time) UnmarshalXML(d *xml.Decoder, start xml.StartElement) (err error) {
var value string
err = d.DecodeElement(&value, &start)
if err != nil {
return err
}
return t.parseXMLTime(value)
}
func (t *Time) UnmarshalXMLAttr(attr xml.Attr) (err error) {
return t.parseXMLTime(attr.Value)
}
func (t *Time) parseXMLTime(value string) (err error) {
formats := []string{preferredTimeFormat, time.RFC3339Nano, time.RFC3339, "2006-01-02T15:04:05.00", "2006-01-02T15:04:05.999999999"}
for _, f := range formats {
parsed, perr := time.Parse(f, value)
if perr == nil {
tValue := Time(parsed)
*t = tValue
return nil
} else {
err = perr
}
}
return err
}
func (t *Time) MarshalXML(e *xml.Encoder, start xml.StartElement) (err error) {
tm := time.Time(*t)
formatted := tm.Format(preferredTimeFormat)
return e.EncodeElement(formatted, start)
}
func (t *Time) MarshalXMLAttr(name xml.Name) (attr xml.Attr, err error) {
tm := time.Time(*t)
formatted := tm.Format(preferredTimeFormat)
return xml.Attr{Name: name, Value: formatted}, nil
}
// ExampleStruct is an example of a type with a datetime field. This would use the generated type alias Time
// which can then be converted to time.Time by the user
type ExampleStruct struct {
LastUpdateAttr Time `xml:"LastUpdate,attr,omitempty"`
}
All that said, my feeling is that it would be preferable to map datetime
, date
and the other time-related standard types to string
instead of time.Time
unless we feel confident we can implement all those types in a way that doesn't fail during xml unmarshaling. It would be nice to have something like ☝️ that we'd feel confident complies with the spec but it would require more testing than what I've done with the code above.
So my question to you @BatmanAoD and @xuri would be: should I create a PR that replaces all mappings to the golang type time.Time
with a string or should I have a PR that attempts to support datetime
with the implementation ☝️ but still replaces the time.Time
mappings for the other types to string
?
I think the only way to properly unmarshal these types to anything other than string
would be to define Go types implementing the Unmarshaler
interface. That sounds like a proliferation of nonstandard type aliases that probably don't belong in a library like this, so I think string
is preferable.
I think the only way to properly unmarshal these types to anything other than
string
would be to define Go types implementing theUnmarshaler
interface. That sounds like a proliferation of nonstandard type aliases that probably don't belong in a library like this, so I thinkstring
is preferable.
That's correct. Implementing the Marshaler
/ Unmarshaler
interfaces is what I had done in the code I included above and it's hard to have full confidence that this wouldn't result in annoying bugs (missing formats or a format resulting in valid parsing but that could lose precision in the parsed value). Having string fields seems like the better tradeoff even if this means users would have to do the time.Time
parsing as a step following their XML parsing.
Thanks for the input, @BatmanAoD . I'm preparing that PR now to switch to string
.
And here's the PR.