sirupsen / airrecord

Ruby wrapper for Airtable, your personal database

type_cast regex too aggressive

kriskelly opened this issue · comments

I had a weird issue that I couldn't track down, where Airrecord was converting certain fields into DateTime strings. Eventually I was able to work around it by calling record.fields['My field name'] instead of record['My field name']. Now that I'm looking through the Airrecord::Table code, I see that there's a type_cast method whose regex matches anything containing a date string.

return Time.parse(value + " UTC") if value =~ /\d{4}-\d{2}-\d{2}/

I'm pretty sure this is what was causing the bug, as the text of that particular field on that particular record contained a date string. Posting this here in case someone else runs into this. I'm not entirely sure how to work around this, as any given text field could contain a string matching any regex you might wind up using here.
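
A minimal sketch of why the quoted line misfires. Only the regex and the `Time.parse(value + " UTC")` call come from the line above; the constant and the `loosely_matches_date?` helper are hypothetical names for illustration and are not the actual Airrecord::Table code.

```ruby
require "time"

# The pattern from the quoted line. It has no anchors, so it matches a
# date anywhere inside a longer string.
UNANCHORED_DATE = /\d{4}-\d{2}-\d{2}/

# Hypothetical helper that isolates the matching step.
def loosely_matches_date?(value)
  value.is_a?(String) && value.match?(UNANCHORED_DATE)
end

description = "once upon a time, on 2018-11-08, something happened"

matches_free_text  = loosely_matches_date?(description)      # true
matches_plain_date = loosely_matches_date?("2018-11-08")     # true
matches_other_text = loosely_matches_date?("no dates here")  # false

# For a plain date field, the cast behaves as intended.
cast_plain_date = Time.parse("2018-11-08" + " UTC")
```

Because the free-text description passes the same check as a real date field, it is handed to `Time.parse` as well, which is how a text field can come back as a time.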

Thanks for raising this. Excellent.

The problem is the way Airtable returns dates in the API. For example, I might have a table with a Learned On column:

image

In the API, that looks like this:

image

Unfortunate, to say the least. It'd be much nicer if the Airtable API returned it as a proper timestamp. However, I think the current casting behaviour is somewhat counterintuitive. As we're in the process of preparing for 1.0.0, I'd propose the following for time-handling:

  1. Pass timeZone=UTC to all API requests. This didn't use to be something we could specify in the API, but we need to do that now. We already assume UTC.
  2. Disable parsing timestamps from strings that aren't in the 2018-03-04T17:06:28.000Z format by default. This effectively means tightening the regex for type_cast dramatically, or using Time.parse with whatever ISO that format is (this is e.g. what createdTime uses).
  3. Add createdTime by default to all records. This has been exposed in the API recently and is currently not included by default. Make it accessible through ["createdTime"] and .created_at, making it symmetric with .id.
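
Point 2 above could be sketched roughly as follows. The pattern and the `strict_cast` helper are assumptions for illustration, not the change that was actually made; the sketch only casts strings that are entirely in the 2018-03-04T17:06:28.000Z shape and returns everything else untouched.

```ruby
require "time"

# Assumed pattern for the 2018-03-04T17:06:28.000Z shape. \A and \z anchor
# it to the whole string, so a timestamp embedded in prose does not match.
STRICT_TIMESTAMP = /\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z\z/

# Hypothetical helper, not part of Airrecord's API.
def strict_cast(value)
  return value unless value.is_a?(String) && value.match?(STRICT_TIMESTAMP)
  Time.iso8601(value)
rescue ArgumentError
  # Shape matched but the values were out of range; keep the raw string.
  value
end

full_timestamp = strict_cast("2018-03-04T17:06:28.000Z")
plain_date     = strict_cast("2018-11-08")
free_text      = strict_cast("once upon a time, on 2018-11-08, something happened")
```

Here `full_timestamp` is a `Time`, while `plain_date` and `free_text` stay as strings, so a date mentioned inside a description is no longer converted.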

As a follow-up in 1.1, we can add a type "Column Name", Object declaration which allows casting into an object. Preferably, we'd use a model that's compatible with ActiveModel::Type (but without dragging in the dependency). Since Airtable doesn't seem to expose the schema, we should allow manual typing. Later, when (or if) they expose the schema through the API, we can automatically import it. This'll ease the transition later to a SQL-like database if the user prefers.
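
One way manual typing could look, as a rough sketch only. `DateType` and `TypedRecord` are hypothetical classes invented for this example; the only thing borrowed from ActiveModel::Type is the convention that a type object responds to `cast`.

```ruby
require "date"

# Hypothetical type object exposing `cast`, the method name used by
# ActiveModel::Type, without depending on ActiveModel.
class DateType
  def cast(value)
    value.nil? ? nil : Date.iso8601(value)
  end
end

# Hypothetical record wrapper, not Airrecord's API. Fields are cast only
# when a type was declared for them explicitly.
class TypedRecord
  def initialize(fields, types = {})
    @fields = fields
    @types = types
  end

  def [](name)
    raw = @fields[name]
    type = @types[name]
    type ? type.cast(raw) : raw
  end
end

record = TypedRecord.new(
  { "Learned On" => "2018-11-08", "Notes" => "on 2018-11-08, something happened" },
  { "Learned On" => DateType.new }
)

learned_on = record["Learned On"]  # Date, because a type was declared
notes      = record["Notes"]       # raw String, no type declared
```

The design choice in this sketch is that nothing is guessed from the value itself: an untyped field always returns the raw data.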

@kriskelly are you interested in contributing 1-3 to Airrecord as part of the 1.0 release? ❤️

@chrisfrank @Meekohi WDYT about this?

I think I'm hearing two different issues here:


(i) @kriskelly, I think you're saying that you had a record like this...

{
  id: "someRecordID", 
  fields: {
    description: "once upon a time, on 2018-11-08, something happened"
  }
}

...and that calling record['description'] unexpectedly returned <Date: 2018-11-08>, not the description string. Is that right?


(ii) What I'm hearing from @sirupsen is that Airtable's plain date fields don't have a full iso8601 timestamp, so we don't know how to parse them in the correct time zone. Is that right?


I think we can solve the first problem with a very minor tweak to the casting Regex. Solving the second problem is more complicated, as @sirupsen outlined, so I have an alternate proposal that could solve both problems:

What if we stop trying to cast dates altogether, except maybe on the special createdTime field? Would we lose any features?

I often want to sort by timestamps, but one of the nice things about iso8601 timestamps is that no matter whether they're instances of String or Time, their sorting behavior is identical.
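
A small illustration of the sorting point, with made-up sample values. It assumes every timestamp uses the same format and the same zone (UTC with a trailing Z); mixed offsets or mixed precision would break the equivalence.

```ruby
require "time"

# Sample values invented for illustration, all in the same UTC format.
stamps = [
  "2018-11-08T09:30:00.000Z",
  "2017-01-02T23:59:59.000Z",
  "2018-03-04T17:06:28.000Z",
]

sorted_as_strings = stamps.sort
sorted_as_times   = stamps.sort_by { |s| Time.iso8601(s) }

same_order = (sorted_as_strings == sorted_as_times)
```

Both orderings agree because the fields run from most to least significant and are zero-padded, so lexical order and chronological order coincide.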

I guess we'd lose the ability to manipulate dates in views, e.g. record["Arbitrary Date Field"].strftime("%D") would no longer just work. But it seems okay, to me at least, to expect users to define date helpers as instance methods or elsewhere, e.g.

def arbitrary_date
  Date.iso8601(self['Arbitrary Date Field'])
end

@chrisfrank ah yes, I missed that there's no \A...\z on that regex. 🤦‍♂️

👍 to your proposal. In 1.1/1.2 we could consider supporting typed fields (through e.g. ActiveModel::Type) to make this easier.

@chrisfrank You're right, having a date in the middle of an arbitrary string was causing my problem. In this case, I'd cast my vote for not casting dates at all. Slightly less convenient for users but I think I'd rather have the raw data in pretty much any scenario.

I agree with @kriskelly -- automatically converting types only makes sense if it's correct 100% of the time or easily reversible, and it seems like this could potentially result in lost data. If the dev knows for sure it is a date, they can always convert it themselves without too much hassle. Explicitly setting up a schema yourself (or maybe just a few fields for explicit conversion) seems like a reasonable option that could be a future feature.

Fixed in #40 which will be released as part of 1.0 soon