edvinskis / python_mice

Simulation study for evaluating different imputation methods

Can a Python package do what mice can?

Missing data frequently complicate data analysis. Multiple imputation is a robust technique for addressing them. In R, multiple imputation is commonly implemented through the mice package, which uses the multiple imputation by chained equations (MICE) algorithm. MICE solves the missing data problem iteratively on a variable-by-variable basis and can yield unbiased and confidence-valid inferences under many missing data conditions. However, no such standard choice is yet established for Python.
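To make the variable-by-variable idea concrete, here is a minimal sketch of a chained-equations loop for two numeric variables, written in plain Python. It is an illustration under simplifying assumptions, not the mice implementation: the helper names `fit_line` and `chained_imputation` are hypothetical, each variable is imputed from a simple linear regression on the other, and only residual noise is added to the predictions (a full MICE procedure would also account for uncertainty in the regression parameters).

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / max(n - 2, 1)) ** 0.5
    return a, b, sd


def chained_imputation(rows, n_iter=10, seed=0):
    """rows: list of [x, y] with None marking a missing value.

    Returns one completed copy of the data (hypothetical helper).
    """
    rng = random.Random(seed)
    data = [list(r) for r in rows]
    n_cols = 2
    # Remember which cells were missing in the original data.
    missing = [[r[j] is None for r in rows] for j in range(n_cols)]

    # Step 1: start from a crude fill, here the observed column mean.
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        mean = sum(observed) / len(observed)
        for i, r in enumerate(data):
            if missing[j][i]:
                r[j] = mean

    # Step 2: cycle over the variables, re-imputing each from the other.
    for _ in range(n_iter):
        for target in range(n_cols):
            pred = 1 - target
            # Fit only on rows where the target was actually observed.
            train = [(r[pred], r[target])
                     for i, r in enumerate(data) if not missing[target][i]]
            a, b, sd = fit_line([t[0] for t in train], [t[1] for t in train])
            for i, r in enumerate(data):
                if missing[target][i]:
                    # Prediction plus random noise, so imputations vary.
                    r[target] = a + b * r[pred] + rng.gauss(0, sd)
    return data


# Toy data: y depends linearly on x (an assumption for this illustration).
rng = random.Random(42)
rows = []
for _ in range(200):
    x = rng.gauss(0, 1)
    rows.append([x, 2.0 * x + rng.gauss(0, 0.5)])

# Remove at most one value per row, completely at random.
for r in rows:
    if rng.random() < 0.2:
        r[0] = None
    elif rng.random() < 0.2:
        r[1] = None

# "Multiple" imputation: several completed data sets from different seeds.
completed_sets = [chained_imputation(rows, seed=s) for s in range(5)]
```

Each completed data set would then be analysed separately and the results pooled. The pooling step is omitted here.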

This repository contains code for a model-based simulation study that evaluates whether different Python imputation methods can produce valid inferences under different missingness mechanisms and proportions of missing data. The methods considered are KNNImputer, IterativeImputer, miceforest and MIDASpy.
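The description above does not say which mechanisms or proportions the study uses, so the following is only a sketch of what varying them can look like. It assumes two common mechanisms, MCAR (missing completely at random) and MAR (missing at random, here driven by a fully observed covariate `x`), and a hypothetical helper `ampute` that deletes values from `y`. It then compares the complete-case mean of `y` with the mean of the full data.

```python
import random
import statistics


def ampute(y, x, mechanism, proportion, rng):
    """Return a copy of y with roughly `proportion` of entries set to None.

    Hypothetical helper for illustration:
    - "MCAR": every value is deleted with the same probability.
    - "MAR":  only rows with x above its median can lose y, each with
              probability 2 * proportion (capped at 1), so the overall
              share is about `proportion` when proportion <= 0.5.
    """
    if mechanism == "MCAR":
        probs = [proportion] * len(y)
    elif mechanism == "MAR":
        cutoff = statistics.median(x)
        probs = [min(1.0, 2 * proportion) if xi > cutoff else 0.0 for xi in x]
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    return [None if rng.random() < p else yi for yi, p in zip(y, probs)]


# Simulated data in which y depends on x (an assumption for this sketch).
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [2 * xi + rng.gauss(0, 1) for xi in x]
true_mean = statistics.fmean(y)

results = {}
for mechanism in ("MCAR", "MAR"):
    y_missing = ampute(y, x, mechanism, 0.3, rng)
    observed = [v for v in y_missing if v is not None]
    results[mechanism] = {
        "missing_share": 1 - len(observed) / len(y),
        "complete_case_mean": statistics.fmean(observed),
    }

for mechanism, summary in results.items():
    print(mechanism, summary, "full-data mean:", true_mean)
```

In this setup the complete-case mean stays close to the full-data mean under MCAR but shifts under MAR, because rows with large `x` (and therefore large `y`) are removed more often. An imputation method that uses `x` has the chance to correct that shift, which is the kind of behaviour such a simulation study can check.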

License:GNU General Public License v3.0


Languages

Language:R 100.0%