ligfx / cps-csv

1. Download jan13dd.rtf from http://www.nber.org/data/cps_basic.html
2. RTF is a nasty format! Mangle it into plaintext with rtf2txt.sh
3. Edit this plaintext by hand to separate the categories and remove the header / footer
4. Parse into JSON with jsonify.py
5. Download may13pub.dat from http://www.nber.org/data/cps_basic.html
6. Convert to CSV with dat2csv.py
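
As a rough illustration of step 6, here is a minimal sketch of turning fixed-width records into CSV. It is not the actual dat2csv.py: the function name `dat_to_csv`, the `(name, start, end)` layout tuples, and the sample record are all assumptions made up for this example. In practice the layout would come from the JSON produced in step 4.

```python
import csv
import io


def dat_to_csv(dat_lines, fields, out):
    """Write fixed-width records to `out` as CSV.

    `fields` is a list of (name, start, end) tuples, where start and end
    are 1-indexed, inclusive column positions (an assumed layout format).
    """
    writer = csv.writer(out)
    writer.writerow([name for name, _, _ in fields])
    for line in dat_lines:
        line = line.rstrip("\n")
        # Convert 1-indexed inclusive positions to a Python slice.
        writer.writerow([line[start - 1:end].strip() for _, start, end in fields])


# Made-up layout and record, for illustration only.
layout = [("ID", 1, 6), ("MONTH", 7, 8)]
out = io.StringIO()
dat_to_csv(["00012305\n"], layout, out)
print(out.getvalue())
```

Every value is kept as a string, so leading zeros in identifiers survive the conversion.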

More ideas:
http://www.nber.org/data/progs/cps-basic/cpsbjan13.do defines labels for the factors, and with them the range of acceptable values. It's written in a regular language (a Stata .do file), so it should be relatively easy to parse.
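
A minimal sketch of what that parsing could look like, assuming the file contains statements of the form `label define NAME VALUE "text" VALUE "text" ...`, possibly spread over several lines. The exact layout of cpsbjan13.do has not been checked here, and the function name `parse_label_defines` and the sample input are made up for illustration.

```python
import re

# One `label define` statement: a label name followed by value/"text" pairs.
LABEL_DEFINE = re.compile(r'label\s+define\s+(\w+)\s+((?:-?\d+\s+"[^"]*"\s*)+)')
# A single value/"text" pair inside a statement.
PAIR = re.compile(r'(-?\d+)\s+"([^"]*)"')


def parse_label_defines(text):
    """Return {label_name: {value: text}} for each `label define` found."""
    labels = {}
    for match in LABEL_DEFINE.finditer(text):
        name, body = match.group(1), match.group(2)
        labels.setdefault(name, {}).update(
            (int(value), label) for value, label in PAIR.findall(body)
        )
    return labels


# Made-up input, for illustration only.
sample = 'label define sex 1 "Male" 2 "Female";\nlabel define yn\n-1 "Blank"\n1 "Yes"\n2 "No"\n;'
print(parse_label_defines(sample))
```

The keys of each inner dictionary give the acceptable values for that factor, which could be used to validate the CSV produced in step 6.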
