IQSS / Amelia

Amelia: A Package for Missing Data

Home Page: http://gking.harvard.edu/amelia

Amelia Runs Indefinitely on Smallish Dataset

MEMDrexel opened this issue

Hi,

I have a dataset of roughly 2,000 rows by about 50 columns, covering around 170 cross-sectional units in a population. My call looks like this:

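library(data.table)
library(Amelia)

# Keep the cross-sectional units that have at least one response of 100 or less.
# pmax() on a single vector is elementwise, so V1 is a per-row flag within each unit.
keep_ids <- df[, pmax(response, na.rm = TRUE) > 100,
               by = Cross.Sectional.ID][V1 == FALSE, Cross.Sectional.ID]
df_test <- df[Cross.Sectional.ID %in% keep_ids, ]

start_time <- Sys.time()
# Drop the first two columns, convert to a plain data.frame, and impute once.
df_amelia <- amelia(
  setDF(df_test[, c(-1, -2)]),
  m = 1,
  p2s = 2,  # verbose screen output
  cs = "Cross.Sectional.ID",
  ts = "Time_Unit",
  ords = c("Ordinal.Variable.1", "Ordinal.Variable.2", "Ordinal.Variable.3")
)
end_time <- Sys.time()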

This has been running on my business laptop for multiple days without completing. Oddly, R doesn't seem to be using much of my processor or RAM: CPU usage sits at only about 30 percent of capacity, even when nothing else is running. Are there any common mistakes with a dataset this size that might cause Amelia to run interminably or fail silently? How could I adjust my settings to speed things up?