Simulation of SADs

Question

FelixMay opened this issue 7 years ago

Question by @dmcglinn: Is it necessary to specify whether we are drawing S species from an infinite log-normal or from one truncated at N? This was relevant when working with the log-series distribution in Harte's METE, where we input the size of the pool and the number of individuals. In other words, the pdf is normalized over 1 to N rather than over 1 to infinity.

Felix May · Answer 1 · Tue Apr 11 2017 17:11:35 GMT+0800 (China Standard Time)

@dmcglinn: Dan, I am not really sure. If this is important, I suggest you check the code of sim_sad (and potentially of sads::rsad). I use an untruncated relative abundance distribution and then sample from it until the required number of individuals is reached.

Dan McGlinn · Answer 2 · Tue Apr 11 2017 20:27:30 GMT+0800 (China Standard Time)

Hey @rueuntal, do you think it is important that we draw from a truncated log-normal when specifying N individuals? In other words, does it change the shape of the SAD much to simply use an infinite log-normal? This is probably more important when N is small.

Xiao Xiao · Answer 3 · Wed Apr 12 2017 09:55:40 GMT+0800 (China Standard Time)

Hey @dmcglinn @FelixMay, sorry to be late to the discussion, and please correct me if I'm wrong. I tend to think there are three "sides" when simulating an SAD: S, N, and the shape. You cannot simultaneously control all three. The implementation that Dan and I are probably more familiar with is one where S and the shape are controlled for (i.e. drawing S species from a designated distribution), while N is allowed to vary to some extent. This is also the case for the regional pool in Felix's implementation (abund1 in the code). To get the local SAD (abund2), I think Felix controls for N, which would lead to either not all S species being present or a slight distortion in the shape of the SAD. It doesn't look truncated per se to me, but there are definitely some nuanced differences. Personally, I think either implementation is fine, and they are probably suitable for different purposes.
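
For readers following along, the "S and shape fixed, N free" approach described in this answer can be sketched roughly as below. This is an illustration in Python using only the standard library, not the code of sim_sad or sads::rsad. The function name draw_pool_abundances and the log-normal parameters are assumptions made up for the example.

```python
import random

def draw_pool_abundances(s_pool, mean_log=2.0, sd_log=1.0, seed=1):
    """Draw one abundance per species from an untruncated log-normal.

    S and the shape are fixed by the arguments; the total N is whatever
    the draws happen to sum to. (Hypothetical helper, not mobsim's API.)
    """
    rng = random.Random(seed)
    # Round to the nearest integer, with a floor of one individual per species.
    return [max(1, round(rng.lognormvariate(mean_log, sd_log)))
            for _ in range(s_pool)]

# Same S and same shape parameters, different random seeds.
totals = []
for seed in (1, 2, 3):
    abund = draw_pool_abundances(50, seed=seed)
    totals.append(sum(abund))
    print("S =", len(abund), "N =", sum(abund))
```

Each run returns exactly 50 species, while the printed N depends on the draws. This is the sense in which N is "allowed to vary".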

Felix May · Answer 4 · Wed Apr 12 2017 16:53:39 GMT+0800 (China Standard Time)

Hello @dmcglinn and @rueuntal, many thanks for the input, Xiao! I agree with everything you wrote, but I also think that I implemented both options you mentioned in sim_sad. When the argument fix_s_sim = FALSE (the default), only N is controlled for. In this case there will usually be fewer species in the local community than in the pool, and this difference will increase with the number of very rare species, i.e. the difference between S_pool and S_local depends on the SAD shape of the pool. If fix_s_sim = TRUE, then S_local will equal S_pool, but in this case there might be differences in the SAD shape between the pool and the local community. Basically, the simulation adds more rare species and in turn reduces the abundance of the more common ones in order to control for N at the same time. Does this make sense?
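
The two modes described in this answer can be illustrated with a small Python sketch. It is not the sim_sad implementation. The names make_pool, sample_local and fix_s are hypothetical, and the fix_s=True branch uses a simple stand-in (one individual is reserved for every pool species before the rest are sampled) that may differ from what sim_sad actually does.

```python
import random
from collections import Counter

def make_pool(s_pool, mean_log=2.0, sd_log=1.0, seed=1):
    """Relative abundances of the pool from an untruncated log-normal."""
    rng = random.Random(seed)
    raw = [rng.lognormvariate(mean_log, sd_log) for _ in range(s_pool)]
    total = sum(raw)
    return [x / total for x in raw]

def sample_local(rel_abund, n_ind, fix_s=False, seed=2):
    """Sample exactly n_ind individuals from the pool (hypothetical helper)."""
    rng = random.Random(seed)
    species = list(range(len(rel_abund)))
    if not fix_s:
        # Only N is controlled: rare pool species may be missed entirely.
        counts = Counter(rng.choices(species, weights=rel_abund, k=n_ind))
        return [counts[i] for i in species if counts[i] > 0]
    if n_ind < len(species):
        raise ValueError("n_ind must be at least the number of pool species")
    # Stand-in for fixing S: reserve one individual per species, then
    # sample the remaining individuals in proportion to pool abundance.
    counts = Counter(rng.choices(species, weights=rel_abund,
                                 k=n_ind - len(species)))
    return [1 + counts[i] for i in species]

pool = make_pool(100)
local_free = sample_local(pool, 200)
local_fixed = sample_local(pool, 200, fix_s=True)
print("fix_s=False: S_local =", len(local_free), "N =", sum(local_free))
print("fix_s=True:  S_local =", len(local_fixed), "N =", sum(local_fixed))
```

In both cases N is exactly 200. With fix_s=False the local richness can fall below the pool richness. With fix_s=True every pool species is present, at the cost of shifting individuals from common species to rare ones.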

Xiao Xiao · Answer 5 · Wed Apr 12 2017 21:48:52 GMT+0800 (China Standard Time)

Hi @FelixMay - Yes, it makes perfect sense, and I think it's great that the code can be implemented in both ways. @dmcglinn are you suggesting that there needs to be an option where the regional SAD is generated from a truncated distribution instead of an untruncated one? My feeling is that that wouldn't be necessary, as we don't know how many individuals there are in the regional pool (and hence wouldn't be able to pin down the bound Nmax at the regional scale), while the abundance at the local scale is already strictly controlled for. What do you think?

Dan McGlinn · Answer 6 · Wed Apr 12 2017 21:51:59 GMT+0800 (China Standard Time)

Thanks for the input, @rueuntal! No, I wasn't suggesting that. I was just wondering if there is a big difference between sampling S species from an infinite lognormal vs. sampling S species from a finite lognormal. It sounds like there is not, so we should be all good.

Xiao Xiao · Answer 7 · Wed Apr 12 2017 21:58:45 GMT+0800 (China Standard Time)

I see, sorry about the misunderstanding! I think what truncation does is prevent the abundances from running super high, e.g. it's possible (though highly unlikely) to draw a species with 1 million individuals from the untruncated distribution, but not from a distribution truncated below that. But my feeling is that it should be allowed at the regional scale.
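
To make the effect of truncation concrete, here is a minimal Python sketch that compares untruncated log-normal draws with draws truncated at an upper bound. Rejection sampling is used purely for simplicity. The function name rlnorm, the bound of 1000 and the parameter values are assumptions chosen for illustration, not values from the thread or from any package.

```python
import random

def rlnorm(n_draws, upper=None, mean_log=2.0, sd_log=1.5, seed=1):
    """Log-normal draws, optionally truncated above at `upper`.

    Truncation is done by rejection: draws above the bound are
    discarded and redrawn. (Hypothetical helper for illustration.)
    """
    rng = random.Random(seed)
    out = []
    while len(out) < n_draws:
        x = rng.lognormvariate(mean_log, sd_log)
        if upper is None or x <= upper:
            out.append(x)
    return out

untruncated = rlnorm(10000)
truncated = rlnorm(10000, upper=1000)

print("max untruncated:", max(untruncated))
print("max truncated:  ", max(truncated))
```

The truncated sample can never exceed the bound, while the untruncated one occasionally produces very large values. For the bulk of the distribution the two samples are expected to look similar, which is consistent with the conclusion reached in the thread.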