Separate features of main and inset plot?

Question

Separate features of main and inset plot?

pgseye opened this issue 9 months ago · comments

I am trying to illustrate some basic logistic regression concepts using plots. gg_magnify is almost perfect for what I want to do with the plot of predicted odds vs a continuous covariate (see figure). There are two issues and I'm assuming it's not easy to avoid them as gg_magnify inherits whatever you specify with ggplot. You'll notice with the inset that I've specified a y axis (which works great for the inset showing odds of 1 - 5) but because I want to highlight the full range of odds - which goes to up to several thousand - those numbers are overlapped on the main plot. The second issue is that I'm wanting to highlight the predicted odds at two set values of the covariate. This again works well using geom_label in the inset but I notice that the labels remain in the main plot, which I don't really need.

Any way around these two problems?

The full code to generate the plots is below:

library(tidyverse)
library(cowplot)
library(ggpubr)
library(ggmagnify)
library(emmeans)

# Simulate data ----
n <- 1000                    # don't change this unless necessary (plots might be fragile)
set.seed(1234)
x  <-  rnorm(n, 486, 142)    # generate macular hole inner opening data with mean 486 and sd = 152
z  <-  10.89 - 0.016 * x     # generate variable that is linear combination of intercept = 10.89 and coefficient for macular hole -0.016 (logit scale)
pr  <-  1/(1 + exp(-z))      # generate probabilities from this
y  <-  rbinom(n, 1, pr)      # generate outcome variable as a function of those probabilities

# Create dataframe from these:
df <-  data.frame(y = y, x = x, z = z, pr = pr)
df <- df |> 
  filter(x > 100) # only include those with thickness > 100

# Logistic regression model ----
summary(mod_logistic <- glm(y ~ x, data = df, family = "binomial"))

# Predictions ----
# Create new df to predict on new values of x
new_dat <- data.frame(x = seq(from = 0, to = 1200, length.out = 100))
# Predict new fitted values and SE's on logodds scale
pred_logodds <- predict(mod_logistic, newdata = new_dat, type = "link", se = TRUE)
new_dat <- cbind(new_dat, pred_logodds)
# Create new df of predictions
predictions <- new_dat |> 
  rename(pred_logodds_est = fit) |> 
  mutate(pred_logodds_LL = pred_logodds_est - (1.96 * se.fit),
         pred_logodds_UL = pred_logodds_est + (1.96 * se.fit)) |> 
  select(-c(se.fit, residual.scale))
# Predict new fitted values and SE's on odds scale
predictions <- predictions |> 
  mutate(pred_odds_est = exp(pred_logodds_est),
         pred_odds_LL = exp(pred_logodds_LL),
         pred_odds_UL = exp(pred_logodds_UL))
# Predict new fitted values and SE's on probability scale
pred_probs <- predict(mod_logistic, newdata = new_dat, type = "response", se = TRUE)
new_dat <- cbind(new_dat[1], pred_probs)
new_dat <- new_dat |> 
  mutate(pred_probs_LL = fit - (1.96 * se.fit),
         pred_probs_UL = fit + (1.96 * se.fit))
# Add predicted probs and CIs to predictions df
predictions <- cbind(predictions, 
                     pred_probs_est = new_dat$fit, 
                     pred_probs_LL = new_dat$pred_probs_LL,
                     pred_probs_UL = new_dat$pred_probs_UL)

# Plots ----
p1 <- ggplot(predictions, aes(x = x, y = pred_logodds_est)) + 
  geom_ribbon(aes(ymin = pred_logodds_LL, ymax = pred_logodds_UL), alpha = 0.2) + 
  geom_line(color = "cornflowerblue", linewidth = 1) +
  geom_point(data = df, aes(x = x, y = y), size = 2, alpha = 0.1) +
  annotate("text", x = 1150, y = 6, label = "log-odds", size = 10, color = "#E7B800") +
  scale_x_continuous(limits = c(0, 1200), breaks = seq(0, 1200, by = 100)) +
  scale_y_continuous(limits = c(-100, 100), breaks = seq(-100, 100, by = 2)) +
  coord_cartesian(xlim = c(0, 1200), ylim = c(-8, 8)) +
  geom_vline(xintercept = 600, color = "red", linetype = "dotted", linewidth = 0.6) +
  geom_vline(xintercept = 700, color = "red", linetype = "dotted", linewidth = 0.6) +
  geom_label(aes(x = 630, y = 2), label = "1.323", color = "red", size = 3) +
  geom_segment(aes(x = 615, y = 1.72, xend = 600, yend = 1.323, color = "red"), 
               linewidth = 0.2, alpha = 0.05, arrow = arrow(length = unit(0.01, "npc"))) +
  geom_label(aes(x = 670, y = -1), label = "-0.181", color = "red", size = 3) +
  geom_segment(aes(x = 686, y = -0.7, xend = 700, yend = -0.181, color = "red"), 
               linewidth = 0.2, alpha = 0.05, arrow = arrow(length = unit(0.01, "npc"))) +
  guides(color = "none") + 
  theme_bw(base_size = 20) +
  ylab("") +
  theme(axis.title.x = element_blank(), axis.text.x = element_blank())
p2 <- ggplot(predictions, aes(x = x, y = pred_odds_est)) + 
  geom_ribbon(aes(ymin = pred_odds_LL, ymax = pred_odds_UL), alpha = 0.2) + 
  geom_line(color = "cornflowerblue", linewidth = 1) +
  geom_point(data = df, aes(x = x, y = y), size = 2, alpha = 0.1) +
  annotate("text", x = 1150, y = 6000, label = "odds", size = 10, color = "#E7B800") +
  scale_x_continuous(limits = c(0, 1200), breaks = seq(0, 1200, by = 100)) +
  scale_y_continuous(limits = c(-20, 1000000), breaks = c(0, 1 ,2, 3, 4, 5, seq(1000, 1000000, by = 1000))) +
  coord_cartesian(xlim = c(0, 1200), ylim = c(0, 7000)) +
  geom_vline(xintercept = 600, color = "red", linetype = "dotted", linewidth = 0.6) +
  geom_vline(xintercept = 700, color = "red", linetype = "dotted", linewidth = 0.6) +
  geom_label(aes(x = 630, y = 4), label = "3.755", color = "red", size = 3) +
  geom_segment(aes(x = 615, y = 3.85, xend = 600, yend = 3.755, color = "red"), 
               linewidth = 0.2, alpha = 0.05, arrow = arrow(length = unit(0.01, "npc"))) +
  geom_label(aes(x = 730, y = 1.5), label = "0.834", color = "red", size = 3) +
  geom_segment(aes(x = 715, y = 1.34, xend = 700, yend = 0.834, color = "red"), 
               linewidth = 0.2, alpha = 0.05, arrow = arrow(length = unit(0.01, "npc"))) +
  guides(color = "none") + 
  ylab("") +
  theme_bw(base_size = 20) +
  theme(axis.title.x = element_blank(), axis.text.x = element_blank()) +
  geom_magnify(from = c(xmin = 500, xmax = 1000, ymin = 0, ymax = 5), 
               to = c(xmin = 480, xmax = 1000, ymin = 1000, ymax = 5000), shadow = T, axes = "xy")
p3 <- ggplot(predictions, aes(x = x, y = pred_probs_est)) + 
  geom_ribbon(aes(ymin = pred_probs_LL, ymax = pred_probs_UL), alpha = 0.2) + 
  geom_line(color = "cornflowerblue", linewidth = 1) +
  geom_point(data = df, aes(x = x, y = y), size = 2, alpha = 0.1) +
  annotate("text", x = 1150, y = 0.8, label = "probability", size = 10, color = "#E7B800") +
  scale_x_continuous(limits = c(0, 1200), breaks = seq(0, 1200, by = 100)) +
  scale_y_continuous(limits = c(0, 1), breaks = seq(0, 1, by = 0.2)) +
  geom_vline(xintercept = 600, color = "red", linetype = "dotted", linewidth = 0.6) +
  geom_vline(xintercept = 700, color = "red", linetype = "dotted", linewidth = 0.6) +
  ylab("") + xlab("Macular hole thickness") +
  theme_bw(base_size = 20)
ggarrange(p1, p2, p3, align = "v", ncol = 1)
ggsave("combined.pdf", width = 20, height = 16)
[combined.pdf](https://github.com/hughjonesd/ggmagnify/files/13574906/combined.pdf)

combined.pdf

David Hugh-Jones · Answer 1 · Wed Dec 06 2023 13:53:24 GMT+0800 (China Standard Time)

Can't you just use the plot argument to geom_magnify? See the README for an example.

David Hugh-Jones · Answer 2 · Wed Dec 06 2023 13:56:20 GMT+0800 (China Standard Time)

Closing but feel free to reopen if you disagree.

pgseye · Answer 3 · Wed Dec 06 2023 14:20:42 GMT+0800 (China Standard Time)

Absolutely awesome!! That works a treat and gives me exactly what I want. Thank you.