This is a PostgreSQL schema loader for the data provided by the Texas Ethics Commission (TEC).
We provide a utility to
- extract the schema from a PDSERF/Plus format export by parsing the ReadMe files to determine the schema, types, keys, and constraints
- create the tables needed to load the 1295 certs; these are hand-written from the PDF documentation provided by TEC
- load the data from CSV format into the tables we create
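As an illustration of the CSV loading step, here is a minimal Python sketch that builds a PostgreSQL `COPY ... FROM STDIN` statement for one table. It is not taken from this project's loader: the `copy_statement` helper is hypothetical, and whether the TEC CSV files carry a header row is an assumption exposed as a parameter.

```python
import re

# Assumption: table names are plain lowercase identifiers, optionally
# schema-qualified (for example tec.c_filerdata), so no quoting is needed.
_TABLE_RE = re.compile(r"^[a-z_][a-z0-9_]*(\.[a-z_][a-z0-9_]*)?$")


def copy_statement(table: str, header: bool = True) -> str:
    """Build a COPY statement that reads CSV rows from standard input."""
    if not _TABLE_RE.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    header_opt = "true" if header else "false"
    return f"COPY {table} FROM STDIN WITH (FORMAT csv, HEADER {header_opt});"


if __name__ == "__main__":
    print(copy_statement("tec.c_filerdata"))
```

The resulting statement could be passed to `psql -d mydb -c "..."` with the CSV file redirected to standard input.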
Internally, each line from a PDSERF readme is one of:
- Table Description rows
- Column Description-cotd rows
- Column rows
- Start-rows for table (Start with "Record #:")
- End-rows for table (Containing just a "-")
Column lines are either
- Indented as part of a group (array) replicated a certain number of times
- Derived from a "single line"
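The line kinds above could be told apart with a small classifier. The following is a minimal Python sketch, not the project's actual parser. It assumes that start rows begin with "Record #:", that end rows consist of a single "-", and that group (array) columns are the lines with leading whitespace. The `classify_line` name and the returned labels are hypothetical, and description and column rows are lumped together as "other" because their exact layout is not described here.

```python
def classify_line(line: str) -> str:
    """Return a coarse, hypothetical label for one line of a PDSERF readme."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    if stripped.startswith("Record #:"):
        return "table_start"
    # Assumption: an end row holds nothing but a single dash.
    if stripped == "-":
        return "table_end"
    # Assumption: leading whitespace marks a column inside a repeated group.
    if line[0] in " \t":
        return "group_column"
    # Description rows and single-line columns are not distinguished here.
    return "other"


if __name__ == "__main__":
    for sample in ["Record #: 1", "    someColumn", "someColumn", "-"]:
        print(repr(sample), "->", classify_line(sample))
```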
All data is loaded up into PostgreSQL, including the Descriptions, which we pull down as COMMENTS.
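To show how a description can end up as a PostgreSQL comment, here is a minimal Python sketch that renders `COMMENT ON TABLE` and `COMMENT ON COLUMN` statements. The `quote_literal` and `comment_sql` helpers are hypothetical and are not this project's code. The sketch assumes plain identifiers that need no quoting and the default `standard_conforming_strings = on`, so only single quotes are escaped.

```python
from typing import Optional


def quote_literal(text: str) -> str:
    """Quote text as a SQL string literal by doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def comment_sql(table: str, description: str, column: Optional[str] = None) -> str:
    """Render a COMMENT ON statement for a table, or for one of its columns."""
    # Assumption: table and column are simple identifiers that need no quoting.
    if column:
        target = f"COLUMN {table}.{column}"
    else:
        target = f"TABLE {table}"
    return f"COMMENT ON {target} IS {quote_literal(description)};"


if __name__ == "__main__":
    print(comment_sql("tec.c_filerdata", "Filer's record"))
```

Once loaded, comments like these are visible in psql with `\d+` on the table.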
You can find the readmes from the Texas Ethics Commission added in this project here:
This module has full coverage of the metadata and data of the TEC:
- Lobby Reports (tables l_*): tec.l_awardmementodata, tec.l_foodbeveragedata, tec.l_coversheetladata, tec.l_giftdata, tec.l_docketdata, tec.l_individualreportingdata, tec.l_entertainmentdata, tec.l_subjectmatterdata, tec.l_eventdata, tec.l_transportationdata
- Campaign Finance Reports (tables c_*): tec.c_assetdata, tec.c_creditdata, tec.c_finaldata, tec.c_candidatedata, tec.c_debtdata, tec.c_loandata, tec.c_contributiondata, tec.c_expendcategory, tec.c_pledgedata, tec.c_coversheet1data, tec.c_expenddata, tec.c_spacdata, tec.c_coversheet2data, tec.c_expendrepayment, tec.c_traveldata, tec.c_coversheet3data, tec.c_filerdata
- 1295 Certs: tec.form1295_box123, tec.form1295_interested_party
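The table prefixes listed above make it easy to tell which report family a table belongs to. The following Python sketch is only an illustration of that naming convention: the `report_family` helper and the `FAMILIES` mapping are hypothetical names, not part of this project.

```python
# Prefix-to-family mapping, taken from the table lists in this README.
FAMILIES = {
    "l_": "Lobby Reports",
    "c_": "Campaign Finance Reports",
    "form1295_": "1295 Certs",
}


def report_family(table: str) -> str:
    """Return the report family for a table name such as tec.l_giftdata."""
    # Drop an optional schema qualifier before checking the prefix.
    name = table.split(".", 1)[-1]
    for prefix, family in FAMILIES.items():
        if name.startswith(prefix):
            return family
    raise ValueError(f"unknown table prefix: {table!r}")


if __name__ == "__main__":
    print(report_family("tec.l_giftdata"))
```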
Requirements: PostgreSQL, git, curl
Repo download and database setup (example in bash):
$ git clone https://github.com/EvanCarroll/db-Texas-Ethics-Commission.git
$ cd ./db-Texas-Ethics-Commission
$ make
$ createdb mydb
$ psql -d mydb -f ./runme.sql 2>&1 | tee out.log
$ make clean
Created at Houston Hackathon 2018 as the sole work of Evan Carroll.
If you use this, open source all (100%) of your stuff, or I'll litigate. The GPL is not the AGPL. Please read, and be advised:
GNU Affero General Public License v3, see included LICENSE.md
Contact Evan Carroll 281.901.0011 for a quote on development.