bjmt / universalmotif

Motif manipulation functions for R.

Error with read_meme

RdeBiotec opened this issue

Hi!

First of all, congratulations on the package. It is really useful.

After running run_meme() (MEME v. 5.05, 2400 sequences, 7000 background sequences, trying objfun set to either de or se), I always get the same error:

Error in universalmotif_cpp(name = x, type = "PPM", altname = x2, nsites = y[1], :
'bkg' vector is too short

MEME runs fine; the issue seems to be with read_meme(). Any ideas? Is there something I may be doing wrong?

EDIT: I also ran run_meme() in classic mode, with target sequences only and no control sequences, and got the same error.
The devel version gives the same error as well.

Here is my sessionInfo():

R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
 [1] LC_CTYPE=es_ES.UTF-8       LC_NUMERIC=C               LC_TIME=es_ES.UTF-8        LC_COLLATE=es_ES.UTF-8     LC_MONETARY=es_ES.UTF-8   
 [6] LC_MESSAGES=es_ES.UTF-8    LC_PAPER=es_ES.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] universalmotif_1.3.65 GrpString_0.3.2       Biostrings_2.53.2     XVector_0.25.0        IRanges_2.19.16       S4Vectors_0.23.25     BiocGenerics_0.31.6  

loaded via a namespace (and not attached):
 [1] treeio_1.9.3       gtools_3.8.1       tidyselect_0.2.5   purrr_0.3.2        lattice_0.20-38    colorspace_1.4-1   vctrs_0.2.0        htmltools_0.4.0   
 [9] yaml_2.2.0         rlang_0.4.0        later_1.0.0        pillar_1.4.2       glue_1.3.1         lifecycle_0.1.0    rvcheck_0.1.5      plyr_1.8.4        
[17] stringr_1.4.0      ggseqlogo_0.1      zlibbioc_1.31.0    munsell_0.5.0      gtable_0.3.0       ps_1.3.0           httpuv_1.5.2       gbRd_0.4-11       
[25] crosstalk_1.0.0    Rcpp_1.0.2         xtable_1.8-4       promises_1.1.0     scales_1.0.0       backports_1.1.5    BiocManager_1.30.7 jsonlite_1.6      
[33] mime_0.7           ggplot2_3.2.1      digest_0.6.21      stringi_1.4.3      processx_3.4.1     dplyr_0.8.3        shiny_1.3.2        grid_3.6.1        
[41] bibtex_0.4.2       ggtree_1.99.1      Rdpack_0.11-0      tools_3.6.1        magrittr_1.5       lazyeval_0.2.2     tibble_2.1.3       crayon_1.3.4      
[49] ape_5.3            tidyr_1.0.0        pkgconfig_2.0.3    zeallot_0.1.0      tidytree_0.2.8     assertthat_0.2.1   rstudioapi_0.10    R6_2.4.0          
[57] nlme_3.1-141       compiler_3.6.1 

Thanks a lot :)

Hi,

Thanks for trying out the package. I'm glad you find it useful.

Unfortunately, I can't reproduce your error, and I don't see anything wrong with what you've done. Are your sequences using a custom alphabet? If so, read_meme() could be having trouble parsing the motif file.

If you cut down your FASTA file to a few hundred sequences, do you still encounter this error? If so, would you be able to upload this sample file for me to test? (Or email it to me directly.)

Alternatively: if you keep the output (via the output argument), can you still trigger the error by using read_meme() on the output motif file? If so, you could send this instead.
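
A minimal sketch of that check, in base R. The helper `try_reader()` is hypothetical (it is not part of universalmotif), and the path `meme_out/meme.txt` is only an assumed location for the saved MEME output. The helper runs a reader function on the file and captures the error message instead of stopping, which makes the failure easy to paste into a report:

```r
# Hypothetical helper: run a reader on a saved motif file and capture any error.
try_reader <- function(path, reader) {
  tryCatch({
    motifs <- reader(path)
    list(ok = TRUE, message = NA_character_, motifs = motifs)
  }, error = function(e) {
    # Keep only the error text so it can be reported verbatim
    list(ok = FALSE, message = conditionMessage(e), motifs = NULL)
  })
}

# Stand-in reader so the sketch runs without universalmotif installed.
# It simply raises the error message quoted in this issue.
failing_reader <- function(path) stop("'bkg' vector is too short")

res <- try_reader("meme_out/meme.txt", failing_reader)  # assumed path
res$ok
res$message

# With the package installed, the real call would look like:
# res <- try_reader("meme_out/meme.txt", universalmotif::read_meme)
```

If `res$ok` is `FALSE` when the real reader is used on the saved file, the problem is in parsing the file rather than in running MEME, and the file alone is enough to reproduce it.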

Thanks.

Thanks!

Looks like read_meme() was having trouble parsing the longer amino acid alphabet. I've pushed a patch (v1.3.66) which should fix the issue. Let me know if it works for you.
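
For anyone who wants to inspect a MEME text file by hand, here is a rough base R sketch that compares the declared alphabet with the background frequencies found in the file. It is not the code from the patch. The helper `check_meme_background()` is hypothetical, the example lines and frequency values are made up for illustration, and the idea that the background block of a 20-letter alphabet wraps onto more than one line is an assumption about the file layout rather than something confirmed in this thread:

```r
# Hypothetical helper (not part of universalmotif): compare the alphabet
# declared in a MEME text file with the background frequencies it lists.
check_meme_background <- function(lines) {
  # "ALPHABET= ACDEFGHIKLMNPQRSTVWY" -> one element per letter
  alph_line <- grep("^ALPHABET=", lines, value = TRUE)[1]
  alphabet <- strsplit(trimws(sub("^ALPHABET=", "", alph_line)), "")[[1]]

  # Collect every "letter value" line after the header; the block may wrap
  i <- grep("^Background letter frequencies", lines)[1] + 1
  bkg_lines <- character(0)
  while (i <= length(lines) &&
         grepl("^\\s*[A-Za-z]\\s+[0-9.eE+-]+", lines[i])) {
    bkg_lines <- c(bkg_lines, lines[i])
    i <- i + 1
  }

  # Tokens alternate between letter and frequency
  tokens <- strsplit(trimws(paste(bkg_lines, collapse = " ")), "\\s+")[[1]]
  is_value <- seq_along(tokens) %% 2 == 0
  freqs <- as.numeric(tokens[is_value])
  names(freqs) <- tokens[!is_value]

  list(alphabet = alphabet, bkg = freqs,
       complete = length(freqs) == length(alphabet))
}

# Made-up example lines with a uniform protein background
example <- c(
  "ALPHABET= ACDEFGHIKLMNPQRSTVWY",
  "",
  "Background letter frequencies (from file bg.txt):",
  "A 0.05 C 0.05 D 0.05 E 0.05 F 0.05 G 0.05 H 0.05 I 0.05 K 0.05 L 0.05",
  "M 0.05 N 0.05 P 0.05 Q 0.05 R 0.05 S 0.05 T 0.05 V 0.05 W 0.05 Y 0.05",
  ""
)

result <- check_meme_background(example)
result$complete
length(result$bkg)
```

On a real file, pass `readLines("path/to/meme.txt")` as the argument. A `complete` value of `FALSE` means fewer background values were found than the alphabet has letters, which is the kind of mismatch the "'bkg' vector is too short" message points to.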

Thanks a lot, bjmt!

I will check and get back to you!

Closing. Feel free to reopen if there's still a problem.