seekinginfiniteloop / fedcal

A feature-rich Python calendar that enables time series analyses of changes in federal workforce schedules and shifts in executive department funding status.

Overhaul Front-End to fully use Pandas extensions API accessors and ExtensionArray

seekinginfiniteloop opened this issue

As it stands, I used a pretty slick (in my opinion) metaclass to automagically delegate functionality from our DatetimeIndex-like and Timestamp-like classes (FedIndex and FedStamp) to the pandas objects they hold as attributes. This gives us pretty seamless functionality, and we will likely still need it for point queries on FedStamp/Timestamp. But in line with our goal of integrating neatly into pandas, it's better to align with the extension API wherever possible. Doing so will also let us plug into Series and DataFrame, not just DatetimeIndex. The plan looks like this:

  • Build a mixin to handle fedcal attributes for pandas objects. Use the Series, Index, and DataFrame accessor APIs to feed this mixin into the pandas ecosystem (this will presumably require some object-specific customization on top of the mixin). Pirate and reverse-engineer (that's a long 'arrrr' in reverse) pandas' internal functions as needed to make it as pandas-like as possible.

  • Further integrate fedcal into Timestamp using Timestamp's internal mechanics to the extent possible without subclassing. (We tried that... it didn't play as nicely as we needed it to, probably because of the heavy Cython backend behind Timestamp and pydatetime. Maybe one day we can do a Cython implementation to make it clean.) Most likely we'll continue to use a refined version of the MagicDelegator metaclass for this, which I also plan to spin off into its own library at some point because it's really handy.

  • Figure out how best to serve up the appropriations status data and integrate its functionality. I'm leaning towards a custom ExtensionArray and dtype(s) built on custom department and status objects for rich functionality. I haven't figured out what this looks like yet, so please send suggestions. In the meantime, fetch_index in _status_factory.py can deliver a functional MultiIndex with the data.
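To make the first item concrete, here is a minimal sketch of a shared mixin fed into pandas through the accessor registration functions in `pandas.api.extensions`. It is not fedcal's code: the accessor name `fedcal_demo`, the class names, and the placeholder `is_weekday` attribute are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import (
    register_dataframe_accessor,
    register_index_accessor,
    register_series_accessor,
)


class FedcalDemoMixin:
    """Hypothetical shared logic; each accessor only supplies `_dates()`."""

    def _dates(self) -> pd.DatetimeIndex:
        raise NotImplementedError

    @property
    def is_weekday(self) -> np.ndarray:
        # Placeholder attribute: True for Monday (0) through Friday (4).
        return np.asarray(self._dates().dayofweek) < 5


@register_index_accessor("fedcal_demo")  # assumed name, not fedcal's
class FedcalDemoIndexAccessor(FedcalDemoMixin):
    def __init__(self, pandas_obj):
        # Raising AttributeError makes the accessor unavailable on other indexes.
        if not isinstance(pandas_obj, pd.DatetimeIndex):
            raise AttributeError("fedcal_demo requires a DatetimeIndex")
        self._obj = pandas_obj

    def _dates(self):
        return self._obj


@register_series_accessor("fedcal_demo")
class FedcalDemoSeriesAccessor(FedcalDemoMixin):
    def __init__(self, pandas_obj):
        if not pd.api.types.is_datetime64_any_dtype(pandas_obj):
            raise AttributeError("fedcal_demo requires datetime64 values")
        self._obj = pandas_obj

    def _dates(self):
        # Object-specific customization: a Series exposes dates via its values.
        return pd.DatetimeIndex(self._obj)


@register_dataframe_accessor("fedcal_demo")
class FedcalDemoFrameAccessor(FedcalDemoMixin):
    def __init__(self, pandas_obj):
        if not isinstance(pandas_obj.index, pd.DatetimeIndex):
            raise AttributeError("fedcal_demo requires a DatetimeIndex index")
        self._obj = pandas_obj

    def _dates(self):
        # Object-specific customization: a DataFrame exposes dates via its index.
        return self._obj.index


idx = pd.date_range("2024-01-05", periods=3)  # Friday, Saturday, Sunday
print(idx.fedcal_demo.is_weekday)
print(pd.Series(idx).fedcal_demo.is_weekday)
print(pd.DataFrame({"x": [1, 2, 3]}, index=idx).fedcal_demo.is_weekday)
```

The only per-object code is the validation in `__init__` and the `_dates()` hook, which is roughly the "object-specific customization on top of the mixin" the plan anticipates.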
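For the Timestamp item, the general delegation-by-metaclass idea can be sketched as follows. This is not the actual MagicDelegator: `DelegatingMeta`, the `_delegate_to` class attribute, and `DemoStamp` are hypothetical names, and the real implementation may work quite differently.

```python
import pandas as pd


class DelegatingMeta(type):
    """Hypothetical metaclass: forwards unknown attributes to a wrapped object."""

    def __new__(mcls, name, bases, namespace):
        # Name of the instance attribute holding the wrapped pandas object.
        target = namespace.get("_delegate_to", "_wrapped")

        def __getattr__(self, item):
            # Only invoked when normal attribute lookup fails.
            if item == target:
                # Guard against infinite recursion before __init__ has run.
                raise AttributeError(item)
            return getattr(getattr(self, target), item)

        namespace.setdefault("__getattr__", __getattr__)
        return super().__new__(mcls, name, bases, namespace)


class DemoStamp(metaclass=DelegatingMeta):
    _delegate_to = "ts"

    def __init__(self, value):
        self.ts = pd.Timestamp(value)

    @property
    def is_weekday(self) -> bool:
        # Placeholder for a fedcal-specific attribute.
        return self.ts.dayofweek < 5


stamp = DemoStamp("2024-01-05")
print(stamp.year, stamp.day_name(), stamp.is_weekday)
```

One limitation of this simple version: Python looks up special methods such as `__add__` or `__eq__` on the type rather than the instance, so operators are not forwarded by `__getattr__` and would need to be installed on the class explicitly.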

Once all that is done, we'll have our baseline core functionality, and it'll be time to develop robust tests, run a beta test, and then join the pydata extensions community.