IQSS / Amelia

Amelia: A Package for Missing Data

Home Page: http://gking.harvard.edu/amelia

tscsPlot() - Missing warning about its dependence on the random number generator, or is it a bug?

LutzDE opened this issue

Bug or missing warning:

"tscsPlot()
Plots a time series for a given variable in a given cross-section and provides confidence intervals for the imputed values."
^^ Quote from the R help

My first thought was: great, tscsPlot() plots my imputed values, so there is no need for a separate ggplot().

After all, the function takes as its input the output of the imputation process produced by amelia().
In such a situation I would normally expect repeated calls of the same function (tscsPlot) to generate identical output.
That is not the case. The output is not based on the amelia() output alone.
Internal functions of Amelia and random numbers are involved too.
The question is: what is the information gain if the values change on every call, or is this a bug?

In practice, the same result is only obtained if the random number generator is seeded before every call.
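As a workaround, the seeding described above can be wrapped in a small helper. This is only a sketch in base R: with_fixed_seed() and noisy_mean() are hypothetical names and are not part of Amelia, and noisy_mean() merely stands in for any function that draws random numbers internally, as tscsPlot() appears to do.

```r
# Hypothetical helper (not part of Amelia): call `fun` under a fixed seed,
# then restore the caller's RNG state so later code is unaffected.
with_fixed_seed <- function(seed, fun) {
  had_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (had_seed) old_seed <- get(".Random.seed", envir = globalenv())
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old_seed, envir = globalenv())
    } else {
      rm(".Random.seed", envir = globalenv())
    }
  })
  set.seed(seed)
  fun()
}

# Stand-in for a function that uses random numbers internally.
noisy_mean <- function() mean(rnorm(100))

a <- with_fixed_seed(1234, noisy_mean)
b <- with_fixed_seed(1234, noisy_mean)
d <- with_fixed_seed(4711, noisy_mean)

identical(a, b)  # same seed, same result
identical(a, d)  # different seed, different result

# With Amelia loaded and `tcc` created as in the example source, the same
# helper could wrap the plot call:
# with_fixed_seed(1234, function() tscsPlot(output = tcc, cs = "Cameroon", var = "trade"))
```

Restoring the previous RNG state on exit is a design choice: it keeps the fixed seed from silently influencing later random draws in the same session.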

Missing:

First, a warning in the documentation about this behavior (the dependence on random numbers).
Second, a warning in the documentation that the plotted (mean) output is not equal to the imputed values returned by amelia().

Example Source:

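library(Amelia)
data(africa)

# Impute once with a fixed seed
set.seed(1234)
tcc <- amelia(africa, cs = "country", ts = "year")

# Same amelia() output, different seeds: the plots differ
set.seed(1234)
tscsPlot(output = tcc, cs = "Cameroon", var = "trade")
set.seed(4711)
tscsPlot(output = tcc, cs = "Cameroon", var = "trade")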

Without a seed, the (mean) imputed values shift from call to call:

tscsPlot(output=tcc,cs="Cameroon",var="trade",ylim=c(40,60))
tscsPlot(output=tcc,cs="Cameroon",var="trade",ylim=c(40,60))
tscsPlot(output=tcc,cs="Cameroon",var="trade",ylim=c(40,60))
tscsPlot(output=tcc,cs="Cameroon",var="trade",ylim=c(40,60))
tscsPlot(output=tcc,cs="Cameroon",var="trade",ylim=c(40,60))
tscsPlot(output=tcc,cs="Cameroon",var="trade",ylim=c(40,60))

The draws argument supposedly changes the number of imputations used to generate the means and CIs for the plot, so setting it to anything greater than 5 in this example should make the plot reproducible. This seems like a bug to me, or a mistake in the documentation about what draws does.