N0wwa / pystata

Pystata

    1. Save Python data to CSV.
    2. Use Python to write a Stata do-file that follows the regression specifications.
    3. Report the regression results and read them back into Python.
    4. You need Stata installed and a valid Stata license.
    5. Install the required Stata packages if you are estimating fixed effects (enter these commands in the Stata command window, one package per command):
    • ssc install reghdfe
    • ssc install ftools
    • ssc install estout
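The first two steps above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the helper names `formula_to_varlist` and `build_do_file`, the file names, and the sample row are all hypothetical, and only the standard library is used so the snippet runs without Stata or pandas.

```python
import csv
import tempfile
from pathlib import Path


def formula_to_varlist(formula):
    """Turn 'y ~ 1 + x1 + x2' into ('y', ['x1', 'x2']); the constant '1' is dropped."""
    lhs, rhs = formula.split("~")
    terms = [term.strip() for term in rhs.split("+")]
    return lhs.strip(), [term for term in terms if term and term != "1"]


def build_do_file(csv_name, formula, cov_type, fixed_effects):
    """Return the text of a do-file that estimates one specification (hypothetical helper)."""
    dep, indep = formula_to_varlist(formula)
    # A list is treated as cluster variables, a string as a vce() keyword such as 'robust'.
    if isinstance(cov_type, (list, tuple)):
        vce = "cluster " + " ".join(cov_type)
    else:
        vce = cov_type
    regressors = " ".join(indep)
    absorb = " ".join(fixed_effects.values())
    lines = [
        f'import delimited "{csv_name}", clear',
        f"reghdfe {dep} {regressors}, absorb({absorb}) vce({vce})",
    ]
    return "\n".join(lines) + "\n"


# Made-up sample row, only there so the CSV has a header and one record.
rows = [{"y": 1.0, "x1": 0.5, "x2": 2.0, "fx1": 1, "fx2": 2001}]
fixed_effects = {"Stock fixed effects": "fx1", "Year fixed effects": "fx2"}

with tempfile.TemporaryDirectory() as workdir:
    csv_path = Path(workdir) / "data1.csv"
    with csv_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)

    do_text = build_do_file(csv_path.name, "y ~ 1 + x1 + x2", ["fx1", "fx2"], fixed_effects)
    (Path(workdir) / "table_pystata.do").write_text(do_text)

print(do_text)
```

Passing a list as the covariance type produces a `vce(cluster ...)` option, while a string such as `'robust'` is passed through unchanged.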

Specify the Stata path so that Python can find it

Specify the Stata path in config.ini. Do not put the path to the GUI version of Stata here.
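As an illustration of how such a setting could be read, here is a small sketch using Python's standard `configparser`. The section name `stata`, the key name `path`, and the executable location are assumptions made for this example; check the config.ini shipped with the repository for the actual layout.

```python
import configparser

# Hypothetical config.ini content: the section name, key name and path are assumptions.
EXAMPLE_CONFIG = """
[stata]
path = /usr/local/stata17/stata-mp
"""


def read_stata_path(config_text):
    """Return the Stata executable path stored in an INI-formatted string."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.get("stata", "path")


stata_path = read_stata_path(EXAMPLE_CONFIG)
print(stata_path)
```

To read a real file instead of a string, `parser.read("config.ini")` can replace `parser.read_string(...)`.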

To do list

  • update for the Stata 17 pystata API: load pd.DataFrame objects directly to speed up run time
  • adapt to newer pandas versions
  • replace the Stata run function (a call in the terminal) with the pystata default run function
  • instrumental variable
  • consolidated winsorization method
  • iPython magic method
  • report R-squared when estimating OLS
  • SM and LM, rebuild

Example (See example.ipynb for more details)

from src.pystata import summary_col

# some random combinations of fixed effects
fx_1 = {'Stock fixed effects': 'fx1', 'Year fixed effects': 'fx2'}
fx_2 = {'Stock fixed effects': 'fx1', 'Industry Fixed effects': 'fx3'}
fx_3 = {'Stock fixed effects': 'fx1', 'Year fixed effects': 'fx2', 'Industry Fixed effects': 'fx3'}
# Syntax of each row: [data, regression specification, covariance type (or a list of cluster variables), fixed effects]
# Template: [data, 'y ~ 1 + x1 + x2', 'covariance type', {fixed effects}]
# data1 and data2 are assumed to be pandas DataFrames that hold y, x1-x4 and fx1-fx3
reg_inputs = [
    [data1, 'y ~ 1 + x1 + x2', 'robust', fx_1],  # (column 1)
    [data2, 'y ~ 1 + x1 + x2 + x3 + x4', 'robust', fx_2],  # (column 2)
    [data2, 'y ~ 1 + x1 + x2 + x3', 'robust', fx_2],  # (column 3)
    [data1, 'y ~ 1 + x1 + x2 + x4', ['fx1', 'fx2'], fx_2],  # (column 4)
    [data2, 'y ~ 1 + x1 + x2 + x3 + x4', ['fx1', 'fx2'], fx_3],  # (column 5)
]
outputDir = '/home/user/pystata'  # directory for the Stata output (log and results)
table = summary_col(reg_inputs)  # read the regression specifications
table.set_dir(outputDir)  # tell the table where to save the Stata output
table.name = 'table_pystata'  # set the name of the table
table.modelname = ["Y1", "Y1", "Var", "Variable", "Model name"]  # one name per column
table.order = ['x1', 'x2', 'x3', 'x4']  # Determine independent variables order
table._main_()  # transfer the data from Python to Stata and write the Stata do-file accordingly
table.run_do()  # run Stata do file
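For readers curious what running a do-file from the terminal can look like, the sketch below builds a batch-mode command of the form `stata -b do file.do`, which is how console Stata is usually invoked on Unix-like systems. It is not taken from this package: the function names `stata_batch_command` and `run_do_file` and the executable path are hypothetical, and the actual call is left commented out so the snippet runs without Stata installed.

```python
import subprocess


def stata_batch_command(stata_path, do_file):
    """Build the argument list for a batch run (Unix-style '-b do' invocation)."""
    return [str(stata_path), "-b", "do", str(do_file)]


def run_do_file(stata_path, do_file, workdir):
    """Run the do-file from workdir; check=True raises if Stata exits with an error."""
    return subprocess.run(stata_batch_command(stata_path, do_file), cwd=workdir, check=True)


# Hypothetical executable path; the do-file name follows the example above.
command = stata_batch_command(
    "/usr/local/stata17/stata-mp",
    "/home/user/pystata/table_pystata.do",
)
print(command)
# run_do_file("/usr/local/stata17/stata-mp", "/home/user/pystata/table_pystata.do", "/home/user/pystata")
```

A batch invocation like this needs the console executable, which is consistent with the note about not using the GUI Stata path.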

About

License: GNU Affero General Public License v3.0

