relogit does not adjust SEs or predicted probabilities

Question

anorthrup opened this issue 4 years ago

Zelig's relogit model reports the same SEs as glm and calculates predicted probabilities the same way glm does. The only difference between the glm and Zelig predicted probabilities is the parameter estimates used to calculate them. King and Zeng (2001) adjust the standard errors as follows:
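
For reference, here is my reading of that adjustment, matching the scaling applied in reLogPred further down. Here n is the number of observations, k is the number of coefficients (the code uses length(coefs)), \hat{\beta} is the uncorrected logit estimate and \tilde{\beta} is the bias-corrected one:

```latex
V(\tilde{\beta}) = \left(\frac{n}{n+k}\right)^{2} V(\hat{\beta})
```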

The paper also corrects the predicted probability (where x_0 is a 1 x k vector of chosen values of the explanatory variables):
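
Written out as I understand it from the paper, and consistent with the p, zeta and eta terms in reLogPred further down (p plays the role of \tilde{\pi}_0, zeta is the cubic factor, and eta is the quadratic form in x_0):

```latex
\Pr(Y_0 = 1) \approx \tilde{\pi}_0 + C_0,
\qquad
C_0 = (0.5 - \tilde{\pi}_0)\,\tilde{\pi}_0\,(1 - \tilde{\pi}_0)\; x_0\, V(\tilde{\beta})\, x_0',
\qquad
\tilde{\pi}_0 = \frac{1}{1 + e^{-x_0 \tilde{\beta}}}
```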

I first learned about this from the following blog post, which provides a reproducible example: https://blog.methodsconsultants.com/posts/bias-adjustment-for-rare-events-logistic-regression-in-r/

I have since confirmed these results with my own data set. The column "Manual calc with Zelig estimates" uses the Zelig parameter estimates with the standard logistic regression predicted probability calculation. The last two columns use the King and Zeng (2001) method for calculating predicted probabilities: the first with the SEs provided by the model, and the second with SEs adjusted using the method in the paper.

The functions I used to calculate these probabilities are below (the first is slightly modified from the referenced blog post):

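library(dplyr)   # %>%, mutate_all, mutate, select
library(tibble)  # add_column

# Standard logistic predicted probabilities: 1 / (1 + exp(-x %*% coefs)).
# x: data frame of explanatory variables (no intercept column).
logisticPred <- function (x, coefs) {
  x %>%
    add_column(int = 1, .before = 1) %>%
    mutate_all(list(~as.numeric(as.character(.)))) %>%
    as.matrix(.) %*% coefs %>%
    as.vector() %>%
    (function(xB) 1 / (1 + exp(-xB)))
}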

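# King and Zeng (2001) corrected predicted probabilities.
# x: data frame of explanatory variables; mdl: fitted relogit model;
# vcovAdj: if TRUE, scale the variance matrix as in the paper.
reLogPred <- function (x, mdl, vcovAdj = TRUE) {
  coefs <- coef(mdl) %>%
    as.numeric
  # First element of the list returned by vcov() for the model
  vcovM <- vcov(mdl)[[1]] %>%
    as.matrix
  # Scale by (n / (n + k))^2, taking nrow(x) as n and length(coefs) as k,
  # so x is assumed to be the estimation sample
  if(vcovAdj) vcovM <- vcovM *
      (nrow(x) / (nrow(x) + length(coefs))) ^ 2
  x %>%
    add_column(int = 1, .before = 1) %>%
    mutate_all(list(~as.numeric(as.character(.)))) %>%
    mutate(
      p    = logisticPred(x, coefs),  # uncorrected probability
      zeta = (.5-p)*p*(1-p),
      # `.` is the data piped into mutate(); diag() keeps x_i V x_i' per row
      eta  = ((select(., everything()) %>% as.matrix) %*% vcovM %*%
                (select(., everything()) %>% t)) %>%
        diag,
      p_re = p + zeta * eta           # corrected probability
    ) %>%
    select(p_re) %>%
    unlist
}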
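
One side note on the eta term: it builds the full n x n matrix X V X' only to keep its diagonal, which may become slow or memory-hungry on large data. Below is a minimal base-R sketch of the same per-row quantity using rowSums. The helper name quadFormDiag is hypothetical (it is not part of Zelig), and X and V are made-up inputs for illustration only.

```r
# Hypothetical helper (not part of Zelig): returns x_i V x_i' for each row
# of X without forming the full n x n matrix X %*% V %*% t(X).
quadFormDiag <- function(X, V) {
  rowSums((X %*% V) * X)
}

# Made-up inputs purely for illustration
set.seed(1)
X <- cbind(1, matrix(rnorm(20), nrow = 10))  # 10 rows: intercept + 2 covariates
A <- matrix(rnorm(9), nrow = 3)
V <- crossprod(A)                            # symmetric 3 x 3 stand-in for a vcov matrix

# Compare against the diag() form used in reLogPred
same <- isTRUE(all.equal(quadFormDiag(X, V), diag(X %*% V %*% t(X))))
print(same)
```

Inside reLogPred, this would replace only the eta line, with X being the numeric design matrix (intercept column included) and V being vcovM.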

Thank you for your consideration!

Adam