EvanCarroll / db-Texas-Ethics-Commission

A schema loader for the Texas Ethics Commission

Home Page: https://www.ethics.state.tx.us/

Texas Ethics Commission - Schema Loader

This is a PostgreSQL schema loader for the data provided by the Texas Ethics Commission (TEC).

We provide a utility to:

  • extract the schema from a PDSERF/Plus format export by reading the ReadMe files and parsing them to determine the schema, types, keys, and constraints
  • create the tables needed to load the 1295 certs; these are handwritten from the PDF documentation provided by TEC
  • load the data from CSV format into the tables we create
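
As a rough illustration of the CSV loading step, the sketch below builds a psql `\copy` meta-command for one table. It is not taken from this project's loader: the helper name `copy_command`, the file name `filers.csv`, and the assumption that the CSV has a header row are all hypothetical.

```python
def copy_command(table: str, csv_path: str, header: bool = True) -> str:
    """Build a psql \\copy meta-command that loads one CSV file into one table.

    Assumptions (not confirmed by the project): the table already exists,
    the path contains no single quotes, and the file may have a header row.
    """
    options = "FORMAT csv"
    if header:
        options += ", HEADER true"
    return f"\\copy {table} FROM '{csv_path}' WITH ({options})"


# tec.c_filerdata is one of the tables listed under Coverage;
# filers.csv is a hypothetical file name.
print(copy_command("tec.c_filerdata", "filers.csv"))
```

The resulting line could be passed to `psql -d mydb -c '...'` or placed in a SQL script run with `psql -f`.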

Internally, each line from a PDSERF readme is one of:

  • Table Description rows
  • Column Description-cont'd rows
  • Column rows
  • Start-rows for a table (starting with "Record #:")
  • End-rows for a table (containing just a -)
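
The start and end markers above are enough to split a readme into one block of lines per table. The following is a minimal sketch of that idea, not the project's Perl parser: the function name `split_tables` and the sample lines are hypothetical, and it assumes a start row may carry leading whitespace before "Record #:".

```python
def split_tables(lines):
    """Group readme lines into one block per table.

    A block opens at a line starting with "Record #:" and closes at a line
    containing just "-". Lines outside any block are ignored.
    """
    blocks = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.lstrip().startswith("Record #:"):
            # Start-row: begin a new block (assumes blocks do not nest).
            current = [line]
        elif current is not None and line.strip() == "-":
            # End-row: close the block and keep it.
            blocks.append(current)
            current = None
        elif current is not None:
            # Description or column row inside the current table.
            current.append(line)
    return blocks


# Hypothetical sample; real readme rows carry more detail than this.
sample = [
    "preamble text",
    "Record #: 1   (hypothetical table header)",
    "  hypothetical column line",
    "-",
    "Record #: 2   (hypothetical table header)",
    "-",
]
blocks = split_tables(sample)
print(len(blocks))
```

Each block can then be handed to a second pass that tells description rows from column rows.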

Column lines are either:

  • Indented as part of a group (array) repeated a certain number of times
  • Derived from a "single line"

All data is loaded into PostgreSQL, including the descriptions, which we store as COMMENTs on the tables and columns.
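
To show how a parsed description can end up as a PostgreSQL comment, here is a small sketch that emits `COMMENT ON` statements. It is an illustration only, not the project's code: the helper names, the column name `filer_ident`, and the description strings are hypothetical. It assumes plain lowercase identifiers that need no quoting and the default `standard_conforming_strings = on`, so doubling single quotes is sufficient escaping.

```python
def quote_literal(text: str) -> str:
    """Quote a string as a PostgreSQL literal by doubling single quotes."""
    return "'" + text.replace("'", "''") + "'"


def comment_on_table(table: str, description: str) -> str:
    """Return a COMMENT ON TABLE statement for an existing table."""
    return f"COMMENT ON TABLE {table} IS {quote_literal(description)};"


def comment_on_column(table: str, column: str, description: str) -> str:
    """Return a COMMENT ON COLUMN statement for an existing column."""
    return f"COMMENT ON COLUMN {table}.{column} IS {quote_literal(description)};"


# tec.c_filerdata appears under Coverage; the column and texts are made up.
print(comment_on_table("tec.c_filerdata", "Filer's index data"))
print(comment_on_column("tec.c_filerdata", "filer_ident", "Filer account number"))
```

Once loaded, such comments are visible in psql through `\d+` on the table.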

You can find the readmes from the Texas Ethics Commission added in this project here:

Coverage

This module has full coverage of the TEC metadata and data.

  • Lobby Reports (tables l_*)
    tec.l_awardmementodata         tec.l_foodbeveragedata
    tec.l_coversheetladata         tec.l_giftdata
    tec.l_docketdata               tec.l_individualreportingdata
    tec.l_entertainmentdata        tec.l_subjectmatterdata
    tec.l_eventdata                tec.l_transportationdata
    
  • Campaign Finance Reports (tables c_*)
    tec.c_assetdata         tec.c_creditdata        tec.c_finaldata
    tec.c_candidatedata     tec.c_debtdata          tec.c_loandata
    tec.c_contributiondata  tec.c_expendcategory    tec.c_pledgedata
    tec.c_coversheet1data   tec.c_expenddata        tec.c_spacdata
    tec.c_coversheet2data   tec.c_expendrepayment   tec.c_traveldata
    tec.c_coversheet3data   tec.c_filerdata
    
  • 1295 Certs
    tec.form1295_box123            tec.form1295_interested_party
    

Links

Installation

Requirements: PostgreSQL, git, curl

Repo download and database setup (example in bash):

$ git clone https://github.com/EvanCarroll/db-Texas-Ethics-Commission.git
$ cd ./db-Texas-Ethics-Commission
$ make
$ createdb mydb
$ psql -d mydb -f ./runme.sql 2>&1 | tee out.log
$ make clean

Background

Created at Houston Hackathon 2018 as the sole work of Evan Carroll.

License

If you use this, open source all (100%) of your stuff, or I'll litigate. The GPL is not the AGPL. Please read, and be advised:

GNU Affero General Public License v3, see included LICENSE.md

Contact

Contact Evan Carroll 281.901.0011 for a quote on development.

Languages

Perl 62.3%, PLpgSQL 30.4%, Makefile 7.3%