the tergm workshop EGMME example is taking a REALLY long time again

Question

martinamorris opened this issue 3 years ago · comments

i raised this a couple of wks back -- i think in an email? in any case, we decided there was a lot of stochastic variability in the fitting times and dropped it.

i'm not sure if anything material has changed in tergm since then, but that same model is now taking what seems like forever to fit. it seems like forever b/c i haven't gotten it to converge yet, so i've stopped it a couple of times after 20 min or so and started again to see if it was a fluke. but no luck.

@krivit can you test on your machine again before you release tergm?

note: i've commented out the interactive progress plot, in case that was slowing things down. it may be, but even without it i can't get the model to converge.

library(tergm)
data(florentine)

startTime <- Sys.time()
tergm.fit.1 <- tergm(
  flobusiness ~ 
    Form(~ edges + gwesp(0, fixed=TRUE)) +
    Diss(~ offset(edges)),
  targets = "formation",
  offset.coef = log(9),
  estimate = "EGMME"#,
  #control = control.tergm(SA.plot.progress=TRUE)
  )
stopTime <- Sys.time()
print(paste("Estimation time:", format(stopTime - startTime)))

Martina Morris · Answer 1 · Wed Jun 23 2021 14:43:03 GMT+0800 (China Standard Time)

an hour later...

========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.5 \\\\/\/\/\////////////\/\\\/\\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.6 \\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.7 \//////////////////////////////////////////////////////////////////////////////////\\\/\\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.8 \\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.9 \\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.10 /////////////////////////////////////////////////////////////\\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.11 \\\//\\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.12 /////\\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.13 \\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.14 \\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.15 //\\/\\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.16 ///////\\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.17 \\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.18 \\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.19 \\\\\
========  Phase 3: Simulate from the fit and estimate standard errors. ========
Subphase 2.20 ////////////////////////////////////////////////////////////////////////////

Martina Morris · Answer 2 · Wed Jun 23 2021 15:26:23 GMT+0800 (China Standard Time)

welp. it's 12:25am now, and it still hasn't finished. i'll leave it running, but i'm off to bed. something is clearly wrong.

Pavel N. Krivitsky · Answer 3 · Wed Jun 23 2021 17:55:15 GMT+0800 (China Standard Time)

What happened is exactly what we were worried about before with respect to Diss() vs. Persist(): it has come back to bite us. Unlike the old dissolve= and the new Persist(), for Diss() a higher parameter means more dissolution, so a dissolution coefficient of log(9) implies that each edge that exists at time t has only a 10% chance of surviving to t+1.

Just change log(9) to -log(9), and it finishes in 3-5 minutes.
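
To make the sign flip concrete, here is a minimal base-R sketch of the arithmetic. It assumes the edges-only offset coefficient acts on the logit scale of the per-step dissolution probability (which matches the 10% survival figure above) and that tie duration is geometric under a constant per-step hazard. The helper name `diss_prob` is hypothetical and not part of tergm.

```r
# Hypothetical helper: per-step dissolution probability implied by an
# edges-only Diss() offset, ASSUMING the coefficient is on the logit scale.
diss_prob <- function(coef) plogis(coef)

p_wrong <- diss_prob(log(9))    # coefficient used in the original call
p_fixed <- diss_prob(-log(9))   # coefficient after the sign change

# Expected tie duration in time steps, assuming a constant per-step
# dissolution probability (geometric waiting time).
dur_wrong <- 1 / p_wrong
dur_fixed <- 1 / p_fixed

print(c(survive_wrong = 1 - p_wrong, survive_fixed = 1 - p_fixed))
print(c(duration_wrong = dur_wrong, duration_fixed = dur_fixed))
```

Under these assumptions, log(9) corresponds to ties lasting just over one time step on average, while -log(9) corresponds to an average duration of about 10 time steps.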

Martina Morris · Answer 4 · Thu Jun 24 2021 00:46:12 GMT+0800 (China Standard Time)

jeez. thx.

Martina Morris · Answer 5 · Thu Jun 24 2021 00:51:47 GMT+0800 (China Standard Time)

that said, it's still surprising to me that this simple model, on a tiny network, would take 3-5 minutes.

is this the full stergm estimation curse? or is there some possibility of improvement?

Pavel N. Krivitsky · Answer 6 · Thu Jun 24 2021 07:00:34 GMT+0800 (China Standard Time)

It's a model that dissolves almost 9/10 of the ties during a given time step. It may well be the case that it can't actually reach the target statistics.