Discrepancies in "Lymph Gen" column names between the two bundled metadata objects.
mattssca opened this issue
In the metadata bundled in sample_data, the LymphGen column is called "LymphGen", whereas the all-too-familiar output from get_gambl_metadata and the bundled metadata in this package both refer to the same column as "lymphgen".
This discrepancy causes errors if a user accesses the metadata in the list (sample_data) and pipes it to other GAMBL functions as a regular metadata object.
> any(names(get_gambl_metadata()) == 'LymphGen')
[1] FALSE
> any(names(get_gambl_metadata()) == 'lymphgen')
[1] TRUE
> any(names((GAMBLR.data::sample_data$meta)) == 'LymphGen')
[1] TRUE
> any(names((GAMBLR.data::sample_data$meta)) == 'lymphgen')
[1] FALSE
#load the pipe operator (%>%) used below
library(magrittr)
#get data
dohh2_maf = GAMBLR.data::sample_data$grch37$maf %>% dplyr::filter(Tumor_Sample_Barcode == "DOHH-2")
dohh2_meta = GAMBLR.data::sample_data$meta %>% dplyr::filter(sample_id == "DOHH-2")
#build plot
ashm_rainbow_plot(this_maf = dohh2_maf, metadata = dohh2_meta, region = "chr6:90975034-91066134")
Error in `$<-.data.frame`(`*tmp*`, classification, value = integer(0)) : replacement has 0 rows, data has 1
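Until the bundled object is fixed, one possible workaround is to rename the column before passing the metadata on. The sketch below uses a toy data frame as a stand-in for GAMBLR.data::sample_data$meta (the values are placeholders, not real annotations), and it assumes the downstream function looks the column up under the lower-case name, which this thread suggests but does not confirm.

```r
# Toy stand-in for GAMBLR.data::sample_data$meta; values are placeholders
meta <- data.frame(
  sample_id = "DOHH-2",
  LymphGen = "placeholder",
  stringsAsFactors = FALSE
)

# Rename the column to the lower-case name that get_gambl_metadata() returns
names(meta)[names(meta) == "LymphGen"] <- "lymphgen"

names(meta)
```

The same one-line rename can be applied to the filtered dohh2_meta object before it is handed to the plotting call.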
Has this issue been resolved?
No, but I can definitely address this today 🙂
Ok, awesome! 😎
The column Sex is also discrepant (capitalised in one and all lower case in the other), so it has to be addressed too 👍
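Since more than one column differs only by capitalisation, a quick way to find them all might look like the following. This is a sketch on toy data frames, and case_mismatches is a hypothetical helper name, not a GAMBLR function.

```r
# Hypothetical helper: names in `a` that are missing from `b` verbatim
# but present once case is ignored
case_mismatches <- function(a, b) {
  na <- names(a)
  nb <- names(b)
  na[!(na %in% nb) & tolower(na) %in% tolower(nb)]
}

# Toy stand-ins for the two metadata objects (values are placeholders)
list_meta <- data.frame(sample_id = "s1", LymphGen = "x", Sex = "y")
pkg_meta <- data.frame(sample_id = "s1", lymphgen = "x", sex = "y", cohort = "z")

mismatched <- case_mismatches(list_meta, pkg_meta)
mismatched
```

On the toy input this flags LymphGen and Sex, the two columns discussed here.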
Should be fixed now, will push this to PR
> colnames(sample_data$meta)
[1] "patient_id" "sample_id" "Tumor_Sample_Barcode"
[4] "seq_type" "sex" "COO_consensus"
[7] "lymphgen" "genetic_subgroup" "EBV_status_inf"
[10] "cohort" "pathology" "reference_PMID"
> colnames(sample_data$meta)[!colnames(sample_data$meta) %in% colnames(GAMBLR.data::gambl_metadata)]
[1] "genetic_subgroup" "reference_PMID"
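The check above can also be written with base R's setdiff(), which returns the elements of its first argument that are absent from the second. A minimal sketch on toy data frames (the column sets are trimmed for illustration and the values are placeholders):

```r
# Toy stand-ins for sample_data$meta and GAMBLR.data::gambl_metadata
list_meta <- data.frame(
  sample_id = "s1", lymphgen = "x",
  genetic_subgroup = "y", reference_PMID = "z"
)
pkg_meta <- data.frame(sample_id = "s1", lymphgen = "x", cohort = "c")

# Columns present only in the list version of the metadata
extra_cols <- setdiff(colnames(list_meta), colnames(pkg_meta))
extra_cols
```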
This is now fixed in #30