nmfs-ost / ss3-test-models

Stock Synthesis models used for testing


speed up list of slow models to try to add back to test set

k-doering-NOAA opened this issue · comments

These models are currently not run in the est job because they take too long to run:

https://github.com/nmfs-stock-synthesis/workflows/blob/4eb01e69d71dac765310d90b547666830b2e5d35/.github/workflows/run-ss3-with-est.yml#L52-L57

rm_mods <- c("BlueDeaconRF_2017", "BlackRf_2015_WA", "China_2015_Central",
             "Spinydogfish_2011", "YellowtailRF_2017", "Bocaccio_2015",
             "Darkblotched_2015", "POP_2017", "seasonal_with_size_comp",
             "Widow_2015", "YelloweyeRF_2017", "CanaryRf_2015",
             "Petrale2015", "three_area_double_normal_selex",
             "Sablefish2015")

It would be great to try to speed these up if possible so that they are run with estimation.
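For context, a minimal sketch of how the exclusion step could look in R (this is a hypothetical stand-in, not the actual workflow code; in the real job, `all_mods` would come from the repository's model directories):

```r
# Sketch: excluding slow models from the set run with estimation.
# In the real workflow, all_mods would come from something like
# list.dirs("models", recursive = FALSE, full.names = FALSE).
rm_mods <- c("Spinydogfish_2011", "Sablefish2015")            # slow models to skip
all_mods <- c("Spinydogfish_2011", "Sablefish2015", "simple") # stand-in model list
est_mods <- setdiff(all_mods, rm_mods)                        # models run with est
est_mods
#> [1] "simple"
```

Speeding a model up would then mean moving its name out of `rm_mods` so it lands back in `est_mods`.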

List of models:

  • BlueDeaconRF_2017 (removed b/c no unique features)

  • BlackRf_2015_WA (removed b/c no unique features)

  • China_2015_Central (removed b/c no unique features)

  • Spinydogfish_2011 - sped up to under 5 min by changing the start year from 1916 to 1980 and making some other simple modifications.

  • YellowtailRF_2017 (removed b/c no unique features)

  • Bocaccio_2015 (removed b/c no unique features)

  • Darkblotched_2015 (removed b/c no unique features)

  • POP_2017 (removed b/c no unique features)

  • seasonal_with_size_comp (removed due to lack of unique features)

  • Widow_2015 (removed because maturity option 2 was added to the simple Lorenzen model instead)

  • YelloweyeRF_2017 (removed b/c no unique features)

  • CanaryRf_2015 (removed b/c annual deviations are already used in other models for other parameters)

  • Petrale2015 (removed b/c no unique features)

  • three_area_double_normal_selex (don't see this model in test set, renamed?)

  • Sablefish2015 (has a special survey and unique selectivity patterns; could maybe remove some or all CAAL data to speed it up?) - was able to speed it up by reducing the length of the time series; it still runs for 15 min, so it may need more work.

  • Deleted the three_area model due to lack of unique features.
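The "reduce the length of the time series" speed-ups above boil down to raising the start year and dropping the earlier data rows. A hypothetical sketch (the `catch` table here is made up for illustration; in practice the table would be read from the model's .dat file, e.g. with r4ss, and the start year updated there as well):

```r
# Sketch: trimming a model's time series to start in 1980 instead of 1916.
new_styr <- 1980
catch <- data.frame(year = 1916:2011, fleet = 1, catch = 100)  # stand-in catch table
catch_trimmed <- catch[catch$year >= new_styr, ]               # drop pre-1980 rows
range(catch_trimmed$year)
#> [1] 1980 2011
```

The same trimming would need to be applied to every year-indexed input (indices, comps, etc.) so the model stays internally consistent.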

TODO

  • Make sure a selectivity time block is captured in the test set? (q time blocks are already covered, though)
  • Make sure negative year and negative fleet values are included in the datasets.
  • Make sure lambdas are used.
  • Add a catch multiplier to one of the test set models.
  • Make sure all prior types are used.

Prioritize the ones with unique features, like seasonal with size comp and three area double normal. Then, removing some data from the models derived from real assessments may help speed up the others.

More West Coast groundfish files are less likely to have unique features. Better to look for a tuna setup (areas, generalized size comp) or a Max setup (devs, env, predation).

Most of these slow models were removed because they don't have any features that differ from the faster models already in the test suite. A few did have unique features, so next I will focus on either speeding up those models or adding their unique features to a faster model within the test set.

The est job passes and now runs all models in the test set. Run time is 1 hour, whereas before it was 35 min. I think we are actually running more models with est than we were previously, though. Continuing to work on reducing run times would be helpful in the future.

If the run with -noest fails, then it is a waste of time to proceed to the run with est. So, can the GitHub Action abort when it gets a failure?

I thought about linking them like that, but I think I decided not to because then the est run won't start until the noest one finishes. I think I also struggled with how to implement it technically at the time, but might be able to now.
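For reference, a hedged sketch of what linking the jobs could look like with GitHub Actions' `needs:` keyword (job names and the run script here are hypothetical, not taken from the actual workflow):

```yaml
jobs:
  run-noest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./run_models.sh -noest   # hypothetical script invocation
  run-est:
    needs: run-noest   # est is skipped if the noest job fails
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./run_models.sh
```

The tradeoff is exactly the one noted above: with `needs:`, the two jobs run sequentially rather than in parallel.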

I will open an issue on this in the workflows repo.

Ahh, I forgot about the aspect of them running in parallel.