posthdf is a Stata module that posts scalars, macros and matrices in e() using estimation results stored in HDF5 files.
posthdf makes it easy to load estimation results obtained from any statistical software into Stata for producing tables and plots.
- Use the widely supported HDF5 file format to collect results generated from virtually any tool.
- Load groups of results into Stata with a simple one-line command.
- Access results in Stata with estimates restore and group names.
- Optionally specify and transform coefficient names with flexible customization.
The latest release is on the stable branch of this repository and can be obtained by running the following command in Stata:
net install posthdf, replace from(https://raw.githubusercontent.com/junyuan-chen/posthdf/stable/)
The master branch contains the latest development, which may not yet have been tagged for release.
Note: To check the version of the current installation of posthdf, run which posthdf in Stata.
Since the functionality relies on the Python integration introduced in Stata 16, a Python installation is required (Python 3.7 or above is recommended), and it needs to be linked to Stata with python set exec.
The following Python packages are required:
If you do not already have a Python installation, a simple way to set up the Python environment is to install Miniconda, which provides Python and a package manager called conda.
Once estimation results are saved in an HDF5 file, posting them in Stata with posthdf is simple.
An HDF5 file consists of groups and datasets.
Each collection of estimation results should be saved in the same HDF5 group, with each item being an appropriately named HDF5 dataset.
- An HDF5 group is just like a folder. Results saved in the same group will be posted in the same collection in Stata as if they were generated from an e-class command. The group name will also be used to save the results through estimates store. A group name that is not a valid Stata name will be converted automatically.
- An HDF5 dataset holds the data as an array. Each object, such as the coefficient vector, the variance-covariance matrix or the number of observations, should be saved separately in its own HDF5 dataset. The name of each dataset will be used when the object is posted as e(name) in Stata.
- Certain objects provide special information for ereturn post. It is best to name the datasets containing these objects using the default values listed in the help file (run help posthdf in Stata) so that no further specification is needed to indicate their locations.
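As a rough illustration of this layout, the following Python sketch writes one collection of results into a single group using h5py and NumPy. The file name results.h5, the group name model1, the helper save_results and all numbers are made up for the example. The dataset names b and V follow the names this README uses for the coefficient vector and the variance-covariance matrix, while N and coefnames are assumptions that should be checked against the defaults listed in help posthdf.

```python
import os
import tempfile

import h5py
import numpy as np


def save_results(path, group_name, b, V, nobs, coefnames):
    """Write one collection of estimation results into a single HDF5 group.

    Hypothetical helper for illustration; it is not part of posthdf.
    """
    # Mode "a" creates the file if needed and keeps any existing groups.
    with h5py.File(path, "a") as f:
        g = f.create_group(group_name)
        # One dataset per object; the dataset name becomes the e() name.
        g.create_dataset("b", data=np.asarray(b, dtype=float))
        g.create_dataset("V", data=np.asarray(V, dtype=float))
        # "N" and "coefnames" are assumed names, not confirmed defaults.
        g.create_dataset("N", data=int(nobs))
        g.create_dataset("coefnames", data=np.array(coefnames, dtype="S"))


# Made-up numbers standing in for real estimation output.
path = os.path.join(tempfile.mkdtemp(), "results.h5")
save_results(
    path,
    "model1",
    b=[0.5, -1.2],
    V=[[0.04, 0.0], [0.0, 0.09]],
    nobs=100,
    coefnames=["x1", "x2"],
)

# List the datasets stored in the group to confirm the layout.
with h5py.File(path, "r") as f:
    names = sorted(f["model1"].keys())
print(names)
```

Calling save_results again with a different group name adds another collection to the same file, so several models can share one HDF5 file.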
To post all the groups of estimation results in Stata, simply run
posthdf using file_name
Each group of results is now stored under the group name (with any / replaced by _).
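To make the naming rule concrete, here is a minimal Python sketch of how a group name could be mapped to a name that Stata accepts. Only the replacement of / by _ comes from the text above. The remaining steps are assumptions based on Stata's general naming rules (letters, digits and underscores, not starting with a digit, at most 32 characters), and the function to_stata_name is hypothetical. The conversion actually performed by posthdf may differ.

```python
import re


def to_stata_name(group_name: str) -> str:
    """Map an HDF5 group name to a plausible Stata name (illustrative only)."""
    # Documented behavior: every "/" in the group name becomes "_".
    name = group_name.replace("/", "_")
    # Assumption: any other character Stata does not allow becomes "_".
    name = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Assumption: a name may not be empty or start with a digit.
    if not name or name[0].isdigit():
        name = "_" + name
    # Stata names are limited to 32 characters.
    return name[:32]


print(to_stata_name("ols/spec1"))
print(to_stata_name("2sls"))
```

Results stored under a name produced this way can then be retrieved with estimates restore followed by that name.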
Note that posthdf also provides the following features:
- If an array of coefficient names is found, the names will be used as the column names and row names for the coefficient vector b and the variance-covariance matrix V.
- If the coefficient names indicate levels and interactions for factor variables and time-series operators such as leads and lags, they can be converted into a format that Stata understands using a parser specified by the option parser. Users can provide their own parsers if desired.
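The kind of conversion a parser performs can be sketched in Python as follows. The input conventions here (x_lag2, x_lead1, group=3) are invented for the example, and to_stata_coefname is a hypothetical function. It does not reproduce the parsers shipped with posthdf or the interface they must follow, which is described in the help files. Only the output notation (L2.x, F1.x, 3.group) is standard Stata syntax.

```python
import re

# Hypothetical input conventions, chosen only for this illustration.
LAG_LEAD = re.compile(r"^(?P<var>\w+)_(?P<kind>lag|lead)(?P<n>\d+)$")
LEVEL = re.compile(r"^(?P<var>\w+)=(?P<level>\d+)$")


def to_stata_coefname(name: str) -> str:
    """Rewrite one coefficient name in Stata's operator notation."""
    m = LAG_LEAD.match(name)
    if m:
        # Stata writes lags as L#.var and leads as F#.var.
        op = "L" if m.group("kind") == "lag" else "F"
        return f"{op}{m.group('n')}.{m.group('var')}"
    m = LEVEL.match(name)
    if m:
        # Stata writes a factor-variable level as #.var.
        return f"{m.group('level')}.{m.group('var')}"
    # Names without a recognized pattern are left unchanged.
    return name


for raw in ["x_lag2", "x_lead1", "group=3", "_cons"]:
    print(raw, "->", to_stata_coefname(raw))
```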
Please see the help files for details on available options and more examples.
Contributors are welcome. If you find a bug or have suggestions for improvement, please open an issue or make a pull request.