joshwlambert / DAISIEmainland

Simulate phylogenetic data on islands with an evolving mainland pool

Home Page: https://joshwlambert.github.io/DAISIEmainland/

Plot all recolonisations in 'plot_daisie_mainland_data'

richelbilderbeek opened this issue

From Issue 62 here I quote:

[...] In this example the ideal data should plot a recolonisation, but only plots one of the species from the all_colonisations.

set.seed(
  1,
  kind = "Mersenne-Twister",
  normal.kind = "Inversion",
  sample.kind = "Rejection"
)

daisie_mainland_data <- DAISIEmainland::sim_island_with_mainland(
  total_time = 1,
  m = 50,
  island_pars = c(1.0, 0.5, 10, 0.1, 0.5),
  mainland_ex = 2,
  mainland_sample_prob = 1,
  mainland_sample_type = "complete",
  replicates = 1,
  verbose = FALSE
)

DAISIEmainland::plot_daisie_mainland_data(
  daisie_mainland_data = daisie_mainland_data,
  replicate_index = 1
)

image

So, this is what the plot looks like after fixing #64:

Screenshot from 2022-01-24 18-52-52

The first notes:

Screenshot from 2022-01-24 18-58-04

In code:

>   names(daisie_mainland_data$empirical_multi_daisie_data[[1]][[3]])
[1] "branching_times" "stac"            "missing_species"
>   names(daisie_mainland_data$ideal_multi_daisie_data[[1]][[3]])
[1] "branching_times"   "stac"              "missing_species"  
[4] "all_colonisations"

And:

>   daisie_mainland_data$empirical_multi_daisie_data[[1]][[3]]
$branching_times
[1] 1.0000000 0.7579685 0.6003790

$stac
[1] 2

$missing_species
[1] 0

which should look like:

issue_68

>   daisie_mainland_data$ideal_multi_daisie_data[[1]][[3]]
$branching_times
[1] 1.000000 0.600379

$stac
[1] 3

$missing_species
[1] 0

$all_colonisations
$all_colonisations[[1]]
$all_colonisations[[1]]$event_times
[1] 1.000000 0.600379

$all_colonisations[[1]]$species_type
[1] "A"


$all_colonisations[[2]]
$all_colonisations[[2]]$event_times
[1] 1.0000000 0.1847087

$all_colonisations[[2]]$species_type
[1] "A"

which should look like (note that the lines are from the same species, hence the same color):

issue_68
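
A minimal sketch of why a plot driven by `branching_times` alone cannot show the recolonisation: the second colonist's time only appears in `all_colonisations`. The list below is hand-copied from the output above, not produced by the package, and treating the second element of `event_times` as the colonisation time is an assumption based on the printed values.

```r
# Hand-copied from the printed ideal data above (clade 3 of replicate 1)
ideal_clade <- list(
  branching_times = c(1.000000, 0.600379),
  stac = 3,
  missing_species = 0,
  all_colonisations = list(
    list(event_times = c(1.000000, 0.600379), species_type = "A"),
    list(event_times = c(1.0000000, 0.1847087), species_type = "A")
  )
)

# Assumption: event_times[1] is the island age and event_times[2] is the
# time at which that colonist arrived
colonisation_times <- vapply(
  ideal_clade$all_colonisations,
  function(colonisation) colonisation$event_times[2],
  numeric(1)
)

# Colonisation times that cannot be recovered from branching_times
missing_from_branching_times <- setdiff(
  colonisation_times,
  ideal_clade$branching_times
)
print(missing_from_branching_times)
```

Any plotting code that reads only `branching_times` never sees the value printed here, which is the recolonisation this issue asks to draw.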

Now with some notes:

Screenshot from 2022-01-24 19-15-12

Indeed, the recolonisation does not show up in the result of daisie_data_to_tables:

Screenshot from 2022-01-24 19-49-05

Screenshot from 2022-01-24 19-15-12

There are two versions shown, the upper is the one-line-per-re-/-colonist, the lower is the one-line-is-beautiful setup.

@joshwlambert: am I correct that the figure above shows how it should look? If yes, which version do you prefer?

  • The version that is easiest to implement for you
  • The upper version
  • The lower version

Per email, Josh picked the lower version.

@richelbilderbeek Just to clarify, the lower version is the one line per colonist and not the recolonists plotted on the same line?

@joshwlambert, happy to clarify and be checked :-)

The 'lower version' I meant is shown in this picture (adapted from the one above), where a clade would look like this:

150846629-68a2c8b1-24ca-4e22-bbd7-a5c76d8890ad

What happened is this clade:

  • colonisation at t = 0.6
  • re-colonisation at t = 0.2
  • no branching events

So, does the drawing match expectations?

@joshwlambert well, maybe there is a branching event. Question is, how would you plot this DAISIE data object (it is daisie_mainland_data$ideal_multi_daisie_data[[1]][[3]] in the tests above):

$branching_times
[1] 1.000000 0.600379

$stac
[1] 3

$missing_species
[1] 0

$all_colonisations
$all_colonisations[[1]]
$all_colonisations[[1]]$event_times
[1] 1.000000 0.600379

$all_colonisations[[1]]$species_type
[1] "A"


$all_colonisations[[2]]
$all_colonisations[[2]]$event_times
[1] 1.0000000 0.1847087

$all_colonisations[[2]]$species_type
[1] "A"

@richelbilderbeek I would plot the two colonisations in all_colonisations as two separate lines. The colour of each line would depend on its endemicity, given by the species_type in all_colonisations.

@joshwlambert so this would be a good sketch?

like_this
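
The "one line per colonist" idea described above can be sketched in base R as follows. This is only an illustration, not the code used by `plot_daisie_data`: the colour mapping `species_type_colours` is hypothetical, the clade is hand-copied from the output in this issue, and the second element of `event_times` is assumed to be the colonisation time.

```r
# Hand-copied from the clade discussed above
clade <- list(
  branching_times = c(1.000000, 0.600379),
  all_colonisations = list(
    list(event_times = c(1.000000, 0.600379), species_type = "A"),
    list(event_times = c(1.0000000, 0.1847087), species_type = "A")
  )
)
island_age <- clade$branching_times[1]

# Hypothetical colours per species type; the package may use other colours
species_type_colours <- c(A = "blue", C = "red")

# One row per colonist: when it arrived and how to colour it
colonists <- data.frame(
  colonist_index = seq_along(clade$all_colonisations),
  colonisation_time = vapply(
    clade$all_colonisations, function(x) x$event_times[2], numeric(1)
  ),
  species_type = vapply(
    clade$all_colonisations, function(x) x$species_type, character(1)
  ),
  stringsAsFactors = FALSE
)
colonists$colour <- unname(species_type_colours[colonists$species_type])

# Empty canvas with time running from the island age (left) to the present
plot(
  NULL,
  xlim = c(island_age, 0),
  ylim = c(0.5, nrow(colonists) + 0.5),
  xlab = "Time before present",
  ylab = "Colonist"
)
# One horizontal segment per colonist, from its arrival to the present
segments(
  x0 = colonists$colonisation_time, y0 = colonists$colonist_index,
  x1 = 0, y1 = colonists$colonist_index,
  col = colonists$colour
)
```

Both colonists have species type "A" here, so both segments get the same colour, and neither is connected to the other by a branching event.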

OK, zooming in on the problem, I added this test to test-plot_daisie_data.R:

test_that("Issue #68: plot all recolonisations", {
  set.seed(
    1,
    kind = "Mersenne-Twister",
    normal.kind = "Inversion",
    sample.kind = "Rejection"
  )
  
  daisie_mainland_data <- DAISIEmainland::sim_island_with_mainland(
    total_time = 1,
    m = 50,
    island_pars = c(1.0, 0.5, 10, 0.1, 0.5),
    mainland_ex = 2,
    mainland_sample_prob = 1,
    mainland_sample_type = "complete",
    replicates = 1,
    verbose = FALSE
  )
  daisie_data <- daisie_mainland_data$ideal_multi_daisie_data[[1]]
  plot_daisie_data(daisie_data)
})

This is the problem:

annotated

Using vertical lines solves the problem in a way:

Screenshot from 2022-01-26 19-24-28

@richelbilderbeek I have accepted the PR. Thanks for the nice plots. However, I would like the recolonists plotted as separate lines, each with their own colonisation time and not linked by any branching events. Would this be possible?

@joshwlambert I think so: I seem to have plenty of time on Saturday to fix that, as well as the misplaced vertical branches!

This will be harder: the algorithm to spread out branches is one thing; doing the same for colonists, which may branch out as well, may be hard! Let's first write a solid test 👼

This is an interesting setting:

test_that("Issue #68: plot all recolonisations with many branches", {
  set.seed(
    631,
    kind = "Mersenne-Twister",
    normal.kind = "Inversion",
    sample.kind = "Rejection"
  )

  daisie_mainland_data <- DAISIEmainland::sim_island_with_mainland(
    total_time = 1,
    m = 10,
    island_pars = c(1, 1, 10, 0.1, 1),
    mainland_ex = 1,
    mainland_sample_prob = 1,
    mainland_sample_type = "complete",
    replicates = 1,
    verbose = FALSE
  )
  DAISIEmainland::plot_daisie_mainland_data(
    daisie_mainland_data = daisie_mainland_data,
    replicate_index = 1
  )
  # Plots nicely
  daisie_data <- daisie_mainland_data$ideal_multi_daisie_data[[1]]
  plot_daisie_data(daisie_data)

  # Plots nicely when there are no colonisations
  daisie_data <- daisie_mainland_data$empirical_multi_daisie_data[[1]]
  plot_daisie_data(daisie_data)
})

Screenshot from 2022-02-20 19-05-28

Data:

> daisie_data
[[1]]
[[1]]$island_age
[1] 1

[[1]]$not_present
[1] 9


[[2]]
[[2]]$branching_times
[1] 1.0000000 0.9115751 0.5076183 0.3368938

[[2]]$stac
[1] 3

[[2]]$missing_species
[1] 0

[[2]]$all_colonisations
[[2]]$all_colonisations[[1]]
[[2]]$all_colonisations[[1]]$event_times
[1] 1.0000000 0.9115751 0.5076183 0.3368938

[[2]]$all_colonisations[[1]]$species_type
[1] "C"


[[2]]$all_colonisations[[2]]
[[2]]$all_colonisations[[2]]$event_times
[1] 1.000000 0.344173

[[2]]$all_colonisations[[2]]$species_type
[1] "A"

OK, that one should display as:

issue_68
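
One possible way to spread the colonists of this clade over separate rows is to reserve a block of rows per colonist, sized by its number of lineages. This is a sketch, not the algorithm used in the package: it assumes each colonist contributes one lineage for the colonisation itself plus one per later entry in `event_times`, and the data is hand-copied from the clade printed above.

```r
# Hand-copied from the all_colonisations element printed above
all_colonisations <- list(
  list(
    event_times = c(1.0000000, 0.9115751, 0.5076183, 0.3368938),
    species_type = "C"
  ),
  list(event_times = c(1.000000, 0.344173), species_type = "A")
)

# Assumption: event_times holds the island age, the colonisation time and
# then one entry per branching event, so a colonist needs
# length(event_times) - 1 rows
n_lineages <- vapply(
  all_colonisations,
  function(colonisation) length(colonisation$event_times) - 1L,
  integer(1)
)

# Give every colonist its own contiguous block of rows
row_layout <- data.frame(
  colonist_index = seq_along(all_colonisations),
  species_type = vapply(
    all_colonisations, function(x) x$species_type, character(1)
  ),
  n_lineages = n_lineages,
  first_row = cumsum(c(1L, head(n_lineages, -1))),
  last_row = cumsum(n_lineages),
  stringsAsFactors = FALSE
)
print(row_layout)
```

Under these assumptions the cladogenetic colonist occupies rows 1 to 3 and the recolonist gets row 4 to itself, so the two are never joined by a vertical line.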

Here is the best one yet:

test_that("Multiple recolonisations", { # nolint indeed, this is complex :-)
  skip("Only run locally")
  seed <- 1912
  set.seed(
    seed,
    kind = "Mersenne-Twister",
    normal.kind = "Inversion",
    sample.kind = "Rejection"
  )
  daisie_mainland_data <- sim_island_with_mainland(
    total_time = 1,
    m = 10,
    island_pars = c(1, 0.1, 30.0, 1.0, 5.0),
    mainland_ex = 1,
    mainland_sample_prob = 1,
    mainland_sample_type = "complete",
    replicates = 1,
    verbose = FALSE
  )
  ideal_daisie_data <- daisie_mainland_data$ideal_multi_daisie_data[[1]]
  clade_index <- 9
  n_colonisations <- length(ideal_daisie_data[[clade_index]]$all_colonisations) # nolint indeed a long line
  n_branches <- length(ideal_daisie_data[[clade_index]]$branching_times) - 1 # nolint indeed a long line
  expect_true(n_colonisations >= 3 && n_branches >= n_colonisations * 2)
  interesting_clade <- ideal_daisie_data[clade_index]
  plot_daisie_data(ideal_daisie_data)

  simplified_ideal_daisie_data <- list()
  simplified_ideal_daisie_data[[1]] <- ideal_daisie_data[[1]]
  simplified_ideal_daisie_data[[2]] <- ideal_daisie_data[[clade_index]]
  plot_daisie_data(daisie_data = simplified_ideal_daisie_data)
})

with

> simplified_ideal_daisie_data
[[1]]
[[1]]$island_age
[1] 1

[[1]]$not_present
[1] 5


[[2]]
[[2]]$branching_times
[1] 1.00000000 0.81360371 0.59927594 0.54433504 0.43080142 0.11137935 0.04632135

[[2]]$stac
[1] 3

[[2]]$missing_species
[1] 0

[[2]]$all_colonisations
[[2]]$all_colonisations[[1]]
[[2]]$all_colonisations[[1]]$event_times
[1] 1.0000000 0.8136037 0.5992759 0.4308014

[[2]]$all_colonisations[[1]]$species_type
[1] "C"


[[2]]$all_colonisations[[2]]
[[2]]$all_colonisations[[2]]$event_times
[1] 1.00000000 0.54433504 0.04632135

[[2]]$all_colonisations[[2]]$species_type
[1] "C"


[[2]]$all_colonisations[[3]]
[[2]]$all_colonisations[[3]]$event_times
[1] 1.0000000 0.3858606 0.1113793

[[2]]$all_colonisations[[3]]$species_type
[1] "C"

I guess this is how it should look?

issue_68
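
The test above hard-codes `clade_index <- 9`. A sketch of how such a clade could be located instead, using the same criterion as the `expect_true` in the test: the helper names `is_interesting_clade` and `find_interesting_clades` are hypothetical and not part of DAISIEmainland, and the data is hand-copied from `simplified_ideal_daisie_data` as printed above.

```r
# TRUE if a clade has enough colonisations and enough branches,
# mirroring the expect_true() condition in the test above
is_interesting_clade <- function(clade, min_colonisations = 3) {
  # Clades without recolonisation have no all_colonisations: length is 0
  n_colonisations <- length(clade$all_colonisations)
  n_branches <- length(clade$branching_times) - 1
  n_colonisations >= min_colonisations && n_branches >= n_colonisations * 2
}

# Indices of all interesting clades in a DAISIE data list
find_interesting_clades <- function(daisie_data, min_colonisations = 3) {
  # Element 1 holds island_age and not_present, so clades start at index 2
  clade_indices <- seq_along(daisie_data)[-1]
  Filter(
    function(i) is_interesting_clade(daisie_data[[i]], min_colonisations),
    clade_indices
  )
}

# Hand-copied from the printed simplified_ideal_daisie_data
simplified_ideal_daisie_data <- list(
  list(island_age = 1, not_present = 5),
  list(
    branching_times = c(
      1.00000000, 0.81360371, 0.59927594, 0.54433504,
      0.43080142, 0.11137935, 0.04632135
    ),
    stac = 3,
    missing_species = 0,
    all_colonisations = list(
      list(
        event_times = c(1.0000000, 0.8136037, 0.5992759, 0.4308014),
        species_type = "C"
      ),
      list(
        event_times = c(1.00000000, 0.54433504, 0.04632135),
        species_type = "C"
      ),
      list(
        event_times = c(1.0000000, 0.3858606, 0.1113793),
        species_type = "C"
      )
    )
  )
)

interesting_indices <- find_interesting_clades(simplified_ideal_daisie_data)
print(interesting_indices)
```

Searching like this would keep the test valid if the clade order changes for the same seed, at the cost of not pinning down exactly which clade is plotted.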

Getting there:

Screenshot from 2022-03-02 16-51-09

Done! It is nearly beautiful:

Screenshot from 2022-03-03 19-58-24