r-lib / vctrs

Generic programming with typed R vectors

Home Page: https://vctrs.r-lib.org


internal error in vctrs package when using perf() function (mixOmics package)

SimonNusi opened this issue

Hello,
I am using the mixOmics package to perform a multi-omics data integration analysis (block.plsda) and I get the following error message related to the vctrs package. I report the error here exactly as written in the message.
X <- list(metabo = alldata[, c(104:155,173:270)]
, prot = alldata[, c(2:36)]
, SNPs = alldata[, c(37:59,76:103)])
Y <- alldata$Beck.orig; table(Y)
design <- matrix(0.3, ncol = length(X), nrow = length(X),
dimnames = list(names(X), names(X)))
diag(design) <- 0; design
tune.diablo <- block.plsda(X, Y, ncomp = 5, design = design)
perf.diablo <- perf(tune.diablo, validation = 'Mfold', folds = 5, nrepeat = 10)
Error in `filter()`:
ℹ In argument: `row_number(x) == n()`.
ℹ In group 1: `Group.2 = "10"`.
Caused by error in `vec_rank()`:
! Unsupported vctrs type `null`.
ℹ In file 'type-info.c' at line 189.
ℹ This is an internal error that was detected in the vctrs package.
  Please report it at https://github.com/r-lib/vctrs/issues with a reprex and the full backtrace.
Backtrace:

  1. ├─mixOmics::perf(...)
  2. ├─mixOmics:::perf.sgccda(...)
  3. │ └─base::lapply(...)
  4. │   └─mixOmics (local) FUN(X[[i]], ...)
  5. │     └─mixOmics (local) repeat_cv_perf.diablo(nrep)
  6. │       └─base::lapply(...)
  7. │         └─mixOmics (local) FUN(X[[i]], ...)
  8. │           └─mixOmics:::predict.block.spls(model[[x]], X.test[[x]], dist = "all")
  9. │             └─mixOmics:::internal_predict.DA(...)
 10. │               ├─dplyr::filter(.data = data_max, row_number(x) == n())
 11. │               └─dplyr:::filter.data.frame(.data = data_max, row_number(x) == n())
 12. │                 └─dplyr:::filter_rows(.data, dots, by)
 13. │                   └─dplyr:::filter_eval(...)
 14. │                     ├─base::withCallingHandlers(...)
 15. │                     └─mask$eval_all_filter(dots, env_filter)
 16. │                       └─dplyr (local) eval()
 17. ├─dplyr::row_number(x)
 18. │ └─vctrs::vec_rank(x, ties = "sequential", incomplete = "na")
 19. └─rlang:::stop_internal_c_lib(...)
 20.   └─rlang::abort(message, call = call, .internal = TRUE, .frame = frame)

Thank you,
Simon

! Unsupported vctrs type null.

suggests that somewhere in perf() you end up with a data frame that has a NULL column in it, which is a corrupt data frame.
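
As a rough illustration of what "a data frame that has a NULL column" means, here is a minimal base R sketch. The helper name null_columns() and the hand-built object bad are made up for this example; they are not part of vctrs, dplyr, or mixOmics, and the thread does not confirm how the NULL actually arises inside perf().

# Hypothetical helper: names of the columns of a data-frame-like object that are NULL.
null_columns <- function(df) {
  cols <- unclass(df)  # drop the data.frame class so we look at the raw list
  names(cols)[vapply(cols, is.null, logical(1))]
}

# A well-formed data frame
ok <- data.frame(a = 1:3, b = c("x", "y", "z"))

# A deliberately malformed object, built by hand for illustration only
bad <- structure(
  list(a = 1:3, b = NULL),
  class = "data.frame",
  row.names = c(NA_integer_, -3L)
)

null_columns(ok)
#> character(0)
null_columns(bad)
#> [1] "b"

A check like this could be run on an intermediate object while debugging (for example data_max from the backtrace, assuming you can get at it with a debugger) to see whether a column really is NULL.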

I'd suggest asking the mixOmics author about this, or providing a full reprex (see below):


Could you please turn this into a self-contained reprex (short for minimal reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page.

You can install reprex by running (you may already have it, though, if you have the tidyverse package installed):

install.packages("reprex")

Thanks

Hi Davis,
Thanks for your reply.
I have indeed sent a message to the mixOmics authors and am waiting for their reply.
Please find a reprex below. This is the first time I am using it, so let me know if I am giving the right information.
Thanks a lot for taking the time to look at this issue, I really appreciate it.
Simon

library(mixOmics)
#> Loading required package: MASS
#> Loading required package: lattice
#> Loading required package: ggplot2
#> 
#> Loaded mixOmics 6.24.0
#> Thank you for using mixOmics!
#> Tutorials: http://mixomics.org
#> Bookdown vignette: https://mixomicsteam.github.io/Bookdown
#> Questions, issues: Follow the prompts at http://mixomics.org/contact-us
#> Cite us:  citation('mixOmics')

source('C:/Users/Common/Documents/PROJECTS SERI/Metabolic profile ED/Effect statin/SCRIPTS/Proteomics/new scripts/prepData_multiOmics.R')
#> Loading required package: foreign
#> Loading required package: survival
#> Loading required package: nnet
#> 
#> Attaching package: 'epiDisplay'
#> The following object is masked from 'package:ggplot2':
#> 
#>     alpha
#> The following object is masked from 'package:lattice':
#> 
#>     dotplot
#> Loading required package: zoo
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric
#> 
#> Attaching package: 'lmtest'
#> The following object is masked from 'package:epiDisplay':
#> 
#>     lrtest
#> Loading required package: carData
#> Use the command
#>     lattice::trellis.par.set(effectsTheme())
#>   to customize lattice options for effects plots.
#> See ?effectsTheme for details.
#> Loading required package: Matrix
#> Loaded glmnet 4.1-8
#> corrplot 0.92 loaded
#> 
#> Attaching package: 'Hmisc'
#> The following objects are masked from 'package:base':
#> 
#>     format.pval, units
#> Loading required package: grid
#> Loading required package: checkmate
#> Loading required package: abind
#> 
#> Attaching package: 'survey'
#> The following object is masked from 'package:Hmisc':
#> 
#>     deff
#> The following object is masked from 'package:graphics':
#> 
#>     dotchart
#>  cobalt (Version 4.5.1, Build Date: 2023-04-27)
#> 
#> Attaching package: 'cobalt'
#> The following object is masked from 'package:MatchIt':
#> 
#>     lalonde
#> Warning in read.dta13(paste(path, "Data SEED/SEED DATA/SEED-1/SiMES-SINDI-SCES-Essential-13Sep18.dta", : 
#>    Missing factor labels for variables
#> 
#>    L_NDR_cat
#> 
#>    No labels have been assigned.
#>    Set option 'generate.factors=TRUE' to generate labels.

## prepare the data
alldata <- subset(alldata, Beck.orig!='Late AMD')
alldata$Beck.orig <- droplevels(alldata$Beck.orig)
alldata[, c(37:59,76:103)] <- data.frame(scale(alldata[, c(37:59,76:103)], center = TRUE, scale = TRUE))
X <- list(metabo = alldata[, c(104:155,173:270)]  
          , prot = alldata[, c(2:36)]
          , SNPs = alldata[, c(37:59,76:103)]
)
Y <- alldata$Beck.orig; table(Y)
#> Y
#>     No AMD  Early AMD Interm AMD 
#>        103         64         56

## matrix design
design <- matrix(0.3, ncol = length(X), nrow = length(X), 
                 dimnames = list(names(X), names(X)))
diag(design) <- 0; design 
#>        metabo prot SNPs
#> metabo    0.0  0.3  0.3
#> prot      0.3  0.0  0.3
#> SNPs      0.3  0.3  0.0

## run the model
tune.diablo <- block.plsda(X, Y, ncomp = 5, design = design)
#> Design matrix has changed to include Y; each block will be
#>             linked to Y.
perf.diablo <- perf(tune.diablo, validation = 'Mfold', folds = 5, nrepeat = 10)
#> Error in `filter()`:
#> ℹ In argument: `row_number(x) == n()`.
#> ℹ In group 1: `Group.2 = "100"`.
#> Caused by error in `vec_rank()`:
#> ! Unsupported vctrs type `null`.
#> ℹ In file 'type-info.c' at line 189.
#> ℹ This is an internal error that was detected in the vctrs package.
#>   Please report it at <https://github.com/r-lib/vctrs/issues> with a reprex (<https://tidyverse.org/help/>) and the full backtrace.
#> Backtrace:
#>      ▆
#>   1. ├─mixOmics::perf(...)
#>   2. ├─mixOmics:::perf.sgccda(...)
#>   3. │ └─base::lapply(...)
#>   4. │   └─mixOmics (local) FUN(X[[i]], ...)
#>   5. │     └─mixOmics (local) repeat_cv_perf.diablo(nrep)
#>   6. │       └─base::lapply(...)
#>   7. │         └─mixOmics (local) FUN(X[[i]], ...)
#>   8. │           └─mixOmics:::predict.block.spls(model[[x]], X.test[[x]], dist = "all")
#>   9. │             └─mixOmics:::internal_predict.DA(...)
#>  10. │               ├─dplyr::filter(.data = data_max, row_number(x) == n())
#>  11. │               └─dplyr:::filter.data.frame(...)
#>  12. │                 └─dplyr:::filter_rows(.data, dots, by)
#>  13. │                   └─dplyr:::filter_eval(...)
#>  14. │                     ├─base::withCallingHandlers(...)
#>  15. │                     └─mask$eval_all_filter(dots, env_filter)
#>  16. │                       └─dplyr (local) eval()
#>  17. ├─dplyr::row_number(x)
#>  18. │ └─vctrs::vec_rank(x, ties = "sequential", incomplete = "na")
#>  19. └─rlang:::stop_internal_c_lib(...)
#>  20.   └─rlang::abort(message, call = call, .internal = TRUE, .frame = frame)

Created on 2023-09-28 with reprex v2.0.2

That's close, but not quite. I need to be able to take your reprex and run it on my computer, and I can't do that without your prepData file.

But rather than providing that file, the best way for you to help me is to manually create a subset of alldata that still reproduces the problem (i.e. by using data.frame() and manually providing the values). Or you could manually create X or Y directly, if that is easier.

So I need something like:

library(mixOmics)

X <- list(
# manual work here to build this so that I can also run it
)

Y <- # manual work here to build this so that I can also run it

## matrix design
design <- matrix(0.3, ncol = length(X), nrow = length(X), 
                 dimnames = list(names(X), names(X)))
diag(design) <- 0; design 

## run the model
tune.diablo <- block.plsda(X, Y, ncomp = 5, design = design)
perf.diablo <- perf(tune.diablo, validation = 'Mfold', folds = 5, nrepeat = 10)
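
One way the placeholders in a template like this might be filled in is sketched below, using made-up random data in base R. Everything here is an assumption for illustration: the helper make_block(), the block sizes, the column prefixes, and the values are invented, and only the block names and class labels are borrowed from the thread. Nothing in the thread says whether data like this reproduces the error.

# Synthetic stand-ins for the three blocks; sizes and values are made up.
set.seed(42)
n <- 30

# Hypothetical helper: a data frame of random numeric columns.
make_block <- function(n_rows, n_cols, prefix) {
  m <- matrix(rnorm(n_rows * n_cols), nrow = n_rows, ncol = n_cols)
  colnames(m) <- paste0(prefix, seq_len(n_cols))
  as.data.frame(m)
}

X <- list(
  metabo = make_block(n, 6, "met"),
  prot   = make_block(n, 5, "prot"),
  SNPs   = make_block(n, 4, "snp")
)

# Outcome with the same three class labels as in the thread, evenly recycled.
Y <- factor(rep(c("No AMD", "Early AMD", "Interm AMD"), length.out = n))

## matrix design
design <- matrix(0.3, ncol = length(X), nrow = length(X),
                 dimnames = list(names(X), names(X)))
diag(design) <- 0

# Every block must have one row per outcome value.
stopifnot(all(vapply(X, nrow, integer(1)) == length(Y)))

# The block.plsda() and perf() calls from the template would follow here.

The point of building the data inline is that anyone can paste the script into a fresh session and run it without access to private files.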

I found the issue while I was preparing a small dataset to post here.
If I don't load the epiDisplay package, the error message no longer appears.
I don't know why, but it works now.
Thanks again for your time.
Best,
Simon
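
The thread does not establish why loading epiDisplay triggers the error. As a first diagnostic step, and only as a sketch, base R's conflicts() can list names that are defined in more than one place on the search path. The attached environment demo_mask below is a made-up stand-in for a package that masks a function; it is not taken from epiDisplay. Note that conflicts() only compares object names on the search path, so it would not report S3 methods that a package registers without exporting them.

# Stand-in for a package that masks a base function (hypothetical, for illustration).
attach(list(mean = function(x, ...) "masked"), name = "demo_mask",
       warn.conflicts = FALSE)

# Names that exist in more than one search-path entry, grouped by entry.
clashes <- conflicts(detail = TRUE)
clashes[["demo_mask"]]
#> [1] "mean"

# The attached stand-in is found first from the global environment...
mean(c(1, 2, 3))
#> [1] "masked"

# ...while calling through the namespace bypasses whatever is attached on top.
base::mean(c(1, 2, 3))
#> [1] 2

detach("demo_mask")

In a real session you would skip the attach() and detach() lines and simply inspect conflicts(detail = TRUE) after loading the packages in question.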