HerrMo / multi-omics_benchmark_study

Code to reproduce the multi-omics benchmark study

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Code to the multi-omics benchmark study by Herrmann et al.

This repo allows to

  • download the (preprocessed) data used in the study
  • reproduce the results (table, figures etc.) presented in the paper
  • rerun the entire benchmark experiment

If you use the code or data please cite:

Moritz Herrmann, Philipp Probst, Roman Hornung, Vindi Jurinovic, Anne-Laure Boulesteix, Large-scale benchmark study of survival prediction methods using multi-omics data, Briefings in Bioinformatics, Volume 22, Issue 3, May 2021, bbaa167, https://doi.org/10.1093/bib/bbaa167

To download the data:

  • The preprocessed data (described in the study) is available via OpenML
  • The OpenML dataset IDs can be found in data/datset_ids.txt or data/datset_ids.RData
  • Note that the datasets had to be split into two to three parts in order to be uploaded to OpenML
  • R users can use the code in R/bench_experiment.R (lines 44-81) to directly download the data (and convert it to mlr tasks)

To reprocude the results (in R):

  • to only reproduce the tables, figures etc. displayed in the paper without rerunning the benchmark experiments use reproduce_table-and-figures.Rmd
  • to rerun the full experiments (this takes several days or weeks, depending on the available resources) use R/bench_experiment.R
    • see the instructions in R/packages.R!
    • make sure the required packages are installed
    • make sure to use correct package versions via checkpoint
    • not all packages are covered by checkpoint, this is specifically relevant for mlr (s. R/packages.R)!
  • to merge the benchmark results use R/merge_bmr_results.R

Note, mlr has deprecated (https://github.com/mlr-org/mlr) in the meantime. There is now the new framework mlr3 (https://mlr3.mlr-org.com/).

About

Code to reproduce the multi-omics benchmark study


Languages

Language:R 100.0%