kreutz-lab / DIMAR

Data-driven selction of an imputation algorithm in R

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

DIMAR

Data-driven selection of an imputation algorithm in R.

For further inforamtion refer to the publication Egert et al. (2021) or the DIMAR wiki.

Installation

For installation of DIMAR run:

library(remotes, quietly = T)
remotes::install_github("kreutz-lab/DIMAR")

Examples

DIMA can take a numeric matrix or the file path to a MaxQuant ProteinGroups file as an input. The data is reduced to the columns which include pattern in their sample names. The imputation algorithms can be defined by the user, by default the nine most frequently selected algorithms of 142 Pride data sets are applied.

MaxQuant file path as input (Example taken from (Reimann et al. (2020))):

library(DIMAR)
filename <- "Test1.txt"
filepath <- system.file("extdata", filename, package = "DIMAR")
Imp <- DIMAR::dimar(mtx = filepath, pattern = 'Intensity', group = c('PKB','PKC'))

Matrix as input (Example taken from Khoonsari, Payam Emami, et al. "Analysis of the cerebrospinal fluid proteome in Alzheimer's disease." PloS one 11.3 (2016): e0150672.):

library(DIMAR)
library(openxlsx)
filename2 <- "Test2.xlsx"
filepath2 <- system.file("extdata", filename2, package = "DIMAR")
df <- openxlsx::read.xlsx(filepath2, sheet="Sheet1", startRow = 2) 
row.names(df) <- paste(c(1:nrow(df)), df$`Protein.(Uniprot.ID)`, sep = "_") 
mtx <- as.matrix(df[, grepl("^AD\\d|^C\\d", names(df))])
Imp2 <- DIMAR::dimar(mtx = mtx, pattern = NULL)

Same example with defining the imputation algorithms:

Imp2 <- DIMAR::dimar(mtx = mtx, pattern = "^AD\\d|^C\\d", methods = c('impSeqRob','impSeq','missForest','imputePCA','ppca','bpca'))

About

Data-driven selction of an imputation algorithm in R


Languages

Language:R 100.0%