Provide a way to order attributes in to_header (and other header-writing methods)
jrising opened this issue · comments
The headers are meant to be read by both humans and computers, and while the order of attributes doesn't matter to a computer, it's super important to humans. I'm producing files with to_header that look like this:
dependencies:
- labor_global_interaction_2factor_BEST_14feb.nc4
contact: jrising@berkeley.edu
oneline: Yearly covariates by region and year
version: labor-texanus
author: James R.
But that's a lot less comprehensible than:
oneline: Yearly covariates by region and year
version: labor-texanus
author: James R.
contact: jrising@berkeley.edu
dependencies:
- labor_global_interaction_2factor_BEST_14feb.nc4
(it would also be nice to include comments and spaces in there, but ordering is a great start.)
Comments and spaces would be nice, but short of supporting writer subclassing, I can't think of a good way to do this. I'm totally up for supporting writer subclassing, so if you want to implement that, feel free!
As for custom ordering - that's built in! Attrs, coords, and variables are backed by OrderedDicts, so if you provide your arguments as ordered dicts (or if you assign them one attr at a time) they'll stay ordered:
>>> import metacsv
>>> import pandas as pd, numpy as np
>>> from collections import OrderedDict
>>>
>>> metacsv.DataFrame(
...     np.random.random((5,4)),
...     columns=list('ABCD'),
...     attrs=OrderedDict([
...         ('name', 'Michael'),
...         ('alias', 'Mike'),
...         ('email', 'mdelgado@email.com'),
...         ('slack', 'mdelgado')])).to_csv('test.csv')
which produces:
---
name: Michael
alias: Mike
email: mdelgado@email.com
slack: mdelgado
...
,A,B,C,D
0,0.411039406175,0.214660699223,0.53100718563,0.356277186707
1,0.236111591008,0.987161424041,0.0122932717729,0.128662543434
2,0.481492991801,0.208113768508,0.71741322843,0.891704845376
3,0.55659016783,0.2560815563,0.587799422363,0.33716795811
4,0.564749039558,0.309904170341,0.616554499041,0.304823683105
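If the attributes already exist in some other order, as in the header from the original report, one option is to rebuild the mapping in the order you want before handing it over as attrs. Here's a minimal sketch using only the standard library. Note that reorder_attrs and PREFERRED_ORDER are hypothetical names made up for this example, not part of metacsv:

```python
from collections import OrderedDict

# Hypothetical preferred key order, taken from the header in the original report.
PREFERRED_ORDER = ['oneline', 'version', 'author', 'contact', 'dependencies']


def reorder_attrs(attrs, preferred=PREFERRED_ORDER):
    """Return an OrderedDict with the preferred keys first.

    Keys not listed in `preferred` are appended afterwards in the order
    in which `attrs` yields them. (Hypothetical helper, not a metacsv API.)
    """
    ordered = OrderedDict()
    # First pass: copy the preferred keys that are actually present.
    for key in preferred:
        if key in attrs:
            ordered[key] = attrs[key]
    # Second pass: append whatever is left.
    for key, value in attrs.items():
        if key not in ordered:
            ordered[key] = value
    return ordered


attrs = OrderedDict([
    ('dependencies', ['labor_global_interaction_2factor_BEST_14feb.nc4']),
    ('contact', 'jrising@berkeley.edu'),
    ('oneline', 'Yearly covariates by region and year'),
    ('version', 'labor-texanus'),
    ('author', 'James R.'),
])

ordered = reorder_attrs(attrs)
print(list(ordered))
# ['oneline', 'version', 'author', 'contact', 'dependencies']
```

The resulting OrderedDict could then be passed as attrs= in the same way as in the example above. Since the ordering lives in a plain list, it's easy to keep one list per project and reuse it for every file.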