stekhoven / missForest

missForest is a nonparametric, mixed-type imputation method for basically any type of data for the statistical software R.

Home Page: http://stat.ethz.ch/CRAN/web/packages/missForest/index.html

imputation error

rkb965 opened this issue

For the returned OOB error, is there any general guidance on when the error is too large for the imputation to be useful? Even if this is data- and context-dependent, I do not know how to interpret the NRMSE and the PFC, and I would greatly appreciate any resources to guide decisions with regard to these measures.

The best way to better understand the error measures is to run missForest in supervised mode, i.e. provide xtrue and see what kind of errors appear. If your data contains missing values from the start, select a portion of it which is fully observed, generate some artificial missing values in it using prodNA(), and use the result as xmis while the fully observed part is used as xtrue. However, as you say, the error rates are context dependent.
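A minimal sketch of that supervised run, for illustration only. It assumes the built-in iris data as a stand-in for the fully observed portion of your own data, and the 10% missingness rate and the seed are arbitrary choices, not recommendations:

```r
library(missForest)

set.seed(42)  # arbitrary seed, only so the run is repeatable

# Assumption: iris stands in for the fully observed part of your data.
# With your own data frame `dat` (hypothetical name) you might use
# dat[complete.cases(dat), ] here instead.
xtrue <- iris

# Remove 10% of the entries completely at random.
xmis <- prodNA(xtrue, noNA = 0.1)

# Supervised mode: passing xtrue makes missForest also report the true
# imputation error alongside the OOB estimate.
fit <- missForest(xmis, xtrue = xtrue)

fit$OOBerror  # OOB estimate: NRMSE (continuous) and PFC (categorical)
fit$error     # true error against xtrue, same two measures
```

Comparing fit$OOBerror with fit$error shows how closely the OOB estimate tracks the true error on data that resembles yours. Repeating the run with other values of noNA gives a feel for how both measures move with the amount of missingness.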