morinlab / GAMBLR.data

Collection of Curated Data for Genomic Analysis of Mature B-cell Lymphomas in R

Home Page: https://morinlab.github.io/GAMBLR.data/

Bug in get_ssm_by_sample

mattssca opened this issue

It seems that this commit introduced a few bugs in get_ssm_by_sample, the alias created for get_ssm_by_samples. When calling GAMBLR.data::get_ssm_by_sample, the specified parameters are disregarded, because the alias passes hardcoded values to the parent function (get_ssm_by_samples) instead of the caller's arguments. For instance, here I request hg38 and get back unprefixed chromosomes, consistent with the hardcoded grch37 projection (this is how the potential bug was caught).

> head(get_ssm_by_sample(this_sample_id = "DOHH-2", projection = "hg38") %>% select(Chromosome))
Using the bundled SSM calls (.maf) calls in GAMBLR.data...
Using the bundled metadata in GAMBLR.data...
# A tibble: 6 × 1
  Chromosome
  <chr>     
1 1         
2 1         
3 1         
4 1         
5 1         
6 1     

The following example further supports the conclusion that parameters are ignored in the updated version of this function:

> ncol(get_ssm_by_sample(this_sample_id = "DOHH-2", projection = "hg38", basic_columns = FALSE, maf_cols = "Chromosome"))
Using the bundled SSM calls (.maf) calls in GAMBLR.data...
Using the bundled metadata in GAMBLR.data...
[1] 45

In the example above, I request only one column (Chromosome), but 45 columns come back because the alias overrides basic_columns and maf_cols with hardcoded values:

get_ssm_by_sample = function(this_sample_id = NULL,
                             these_samples_metadata = NULL,
                             this_seq_type = "genome",
                             projection = "grch37",
                             these_genes,
                             min_read_support = 3,
                             basic_columns = TRUE,
                             maf_cols = NULL,
                             verbose = FALSE,
                             ...){
  
  get_ssm_by_samples(these_sample_ids = this_sample_id,
                     these_samples_metadata = NULL,
                     this_seq_type = "genome",
                     projection = "grch37",
                     these_genes,
                     min_read_support = 3,
                     basic_columns = TRUE,
                     maf_cols = NULL,
                     verbose = FALSE,
                     ...)
}

If I update the alias to pass all the parameters given to get_ssm_by_sample on to get_ssm_by_samples, we get the expected behavior (same examples, executed with the updated code):

> head(get_ssm_by_sample(this_sample_id = "DOHH-2", projection = "hg38") %>% select(Chromosome))
Using the bundled SSM calls (.maf) calls in GAMBLR.data...
Using the bundled metadata in GAMBLR.data...
# A tibble: 6 × 1
  Chromosome
  <chr>     
1 chr1      
2 chr1      
3 chr1      
4 chr1      
5 chr1      
6 chr1    

and

> ncol(get_ssm_by_sample(this_sample_id = "DOHH-2", projection = "hg38", basic_columns = FALSE, maf_cols = "Chromosome"))
Using the bundled SSM calls (.maf) calls in GAMBLR.data...
Using the bundled metadata in GAMBLR.data...
[1] 1
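
For reference, the update described above can be sketched roughly as follows. This is a minimal sketch rather than the exact patch: it assumes get_ssm_by_samples is available (for example, with GAMBLR.data attached) and accepts the same argument names that the current alias uses. The only change from the alias shown earlier is that each hardcoded value is replaced by the corresponding argument of get_ssm_by_sample.

```r
# Sketch of a corrected alias: every argument is forwarded to the parent
# function instead of being replaced with a hardcoded default.
# Assumption: get_ssm_by_samples() is on the search path and takes the
# argument names already used by the original alias.
get_ssm_by_sample = function(this_sample_id = NULL,
                             these_samples_metadata = NULL,
                             this_seq_type = "genome",
                             projection = "grch37",
                             these_genes,
                             min_read_support = 3,
                             basic_columns = TRUE,
                             maf_cols = NULL,
                             verbose = FALSE,
                             ...){

  get_ssm_by_samples(these_sample_ids = this_sample_id,
                     these_samples_metadata = these_samples_metadata,
                     this_seq_type = this_seq_type,
                     projection = projection,
                     these_genes,  # passed positionally, as in the original alias
                     min_read_support = min_read_support,
                     basic_columns = basic_columns,
                     maf_cols = maf_cols,
                     verbose = verbose,
                     ...)
}
```

With this version, the defaults in the signature still apply when the caller omits an argument, but any value the caller does supply (such as projection = "hg38" or maf_cols = "Chromosome") reaches get_ssm_by_samples unchanged.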