Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated
jfertaj opened this issue · comments
Dear David,
I have installed the new version of artMS, which includes some nice features. However, I am having issues running analyses that ran successfully with artMS 1.9.4.
I get an error during the MSstats step, after the fractionation handling (no fractions are enabled in my experiments).
Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__, :
Join results in 7002761 rows; more than 584171 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and data.table issue tracker for advice.
When I run the same files with artMS 1.9.4, the analysis finishes without errors.
This is my YAML configuration file:
files:
  evidence: evidence_LS.txt
  keys: keys_LS.txt
  contrasts: contrasts_LS.txt
  summary: summary_LS.txt
  output: results_LS/results_LS.txt
qc:
  basic: 0
  extended: 0
  extendedSummary: 0
data:
  enabled: 1
  silac:
    enabled: 0
  filters:
    enabled: 1
    contaminants: 1
    protein_groups: keep
    modifications: AB
  sample_plots: 1
msstats:
  enabled: 1
  msstats_input: ~
  profilePlots: none
  normalization_method: equalizeMedians
  normalization_reference: ~
  summaryMethod: TMP
  MBimpute: 1
  censoredInt: NA
  feature_subset: all
  n_top_feature: 3
  logTrans: 2
  remove_uninformative_feature_outlier: no
  min_feature_count: 2
  equalFeatureVar: yes
  remove50missing: no
  fix_missing: ~
  maxQuantileforCensored: 0.999
  use_log_file: no
  append: no
  log_file_path: ~
output_extras:
  enabled: 1
  annotate:
    enabled: 1
    species: HUMAN
  plots:
    volcano: 1
    heatmap: 1
    LFC: -0.58 0.58
    FDR: 0.05
    heatmap_cluster_cols: 0
    heatmap_display: log2FC
Any help would be appreciated
Thanks
Juan
Hi Juan, thanks for reporting this.
We would need a little more information to debug this issue.
- Could you please re-run the artmsQuantification() function with the parameter display_msstats = TRUE and provide the full output displayed in the console?
- Could you also please copy and paste the content of artms_sessionInfo_quantification.log?
Thanks!
Hi, I have run into the same problem.
My error output is:
artMS: Relative Quantification using MSstats
Reading the configuration file
LOADING DATA
MERGING FILES
CONVERT Intensity values < 1 to NA
FILTERING
-- Contaminants CON__|REV__ removed
-- Removing protein groups
-- Use <Leading.razor.protein> as Protein ID
-- PROCESSING AB
CONVERTING THE DATA TO MSSTATS FORMAT
-- Selecting Sequence Type: MaxQuant 'Modified.sequence' column
(+) column added (with value 1, MSstats requirement)
-- Adding NA values for missing values (required by MSstats)
-- Write out the MSstats input file (-mss.txt)
RUNNING MSstats (it usually takes a 'long' time: please, be patient)
-- Normalization method: equalizeMedians
INFO [2021-08-10 00:23:44] ** Features with one or two measurements across runs are removed.
INFO [2021-08-10 00:23:44] ** Fractionation handled.
Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__, :
Join results in 37881 rows; more than 4218 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and data.table issue tracker for advice.
Thanks!
Thanks.
To debug the issue, we also need the following information:
- Please run the following commands and provide the outputs:
# R version
version
# artMS version
packageVersion("artMS")
- Could you also please copy and paste the content of artms_sessionInfo_quantification.log?
Thanks
version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 4
minor 1.0
year 2021
month 05
day 18
svn rev 80317
language R
version.string R version 4.1.0 (2021-05-18)
nickname Camp Pontanezen
artMS version
packageVersion("artMS")
[1] ‘1.10.2’
and the log is
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936 LC_CTYPE=Chinese (Simplified)_China.936 LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C LC_TIME=Chinese (Simplified)_China.936
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] artMS_1.10.2
loaded via a namespace (and not attached):
[1] nlme_3.1-152 bitops_1.0-7 bit64_4.0.5 RColorBrewer_1.1-2 httr_1.4.2
[6] GenomeInfoDb_1.28.1 UpSetR_1.4.0 tools_4.1.0 backports_1.2.1 utf8_1.2.2
[11] R6_2.5.0 KernSmooth_2.23-20 lazyeval_0.2.2 DBI_1.1.1 BiocGenerics_0.38.0
[16] colorspace_2.0-2 ade4_1.7-17 tidyselect_1.1.1 gridExtra_2.3 bit_4.0.4
[21] compiler_4.1.0 VennDiagram_1.6.20 preprocessCore_1.54.0 Biobase_2.52.0 formatR_1.11
[26] plotly_4.9.4.1 ggdendro_0.1.22 caTools_1.18.2 scales_1.1.1 checkmate_2.0.0
[31] stringr_1.4.0 digest_0.6.27 minqa_1.2.4 XVector_0.32.0 pkgconfig_2.0.3
[36] htmltools_0.5.1.1 lme4_1.1-27.1 fastmap_1.1.0 limma_3.48.1 htmlwidgets_1.5.3
[41] rlang_0.4.11 GlobalOptions_0.1.2 RSQLite_2.2.7 shape_1.4.6 generics_0.1.0
[46] jsonlite_1.7.2 gtools_3.9.2 dplyr_1.0.7 zip_2.2.0 RCurl_1.98-1.3
[51] magrittr_2.0.1 GenomeInfoDbData_1.2.6 futile.logger_1.4.3 Matrix_1.3-3 Rcpp_1.0.7
[56] munsell_0.5.0 S4Vectors_0.30.0 fansi_0.5.0 lifecycle_1.0.0 yaml_2.2.1
[61] stringi_1.7.3 MASS_7.3-54 zlibbioc_1.38.0 org.Hs.eg.db_3.13.0 gplots_3.1.1
[66] plyr_1.8.6 grid_4.1.0 blob_1.2.2 parallel_4.1.0 MSstatsConvert_1.2.2
[71] ggrepel_0.9.1 crayon_1.4.1 MSstats_4.0.1 lattice_0.20-44 Biostrings_2.60.2
[76] splines_4.1.0 circlize_0.4.13 KEGGREST_1.32.0 pillar_1.6.2 boot_1.3-28
[81] log4r_0.3.2 seqinr_4.2-8 marray_1.70.0 stats4_4.1.0 futile.options_1.0.1
[86] glue_1.4.2 lambda.r_1.2.4 data.table_1.14.0 png_0.1-7 vctrs_0.3.8
[91] nloptr_1.2.2.2 tidyr_1.1.3 gtable_0.3.0 getopt_1.20.3 purrr_0.3.4
[96] cachem_1.0.5 ggplot2_3.3.5 openxlsx_4.2.4 viridisLite_0.4.0 survival_3.2-11
[101] tibble_3.1.3 pheatmap_1.0.12 AnnotationDbi_1.54.1 memoise_2.0.0 IRanges_2.26.0
[106] corrplot_0.90 cluster_2.1.2 ellipsis_0.3.2
Thanks!
You are using the right version. The issue might be the keys.txt file. Could you please copy and paste here the content of the keys file? Alternatively, you could send it by email to artms.help@gmail.com
My keys file is:
Raw.file Condition BioReplicate Run IsotopeLabelType
A1.raw a a_1 1 L
A2.raw a a_2 2 L
A3.raw a a_3 3 L
B_1.raw b b_1 1 L
B_2.raw b b_2 2 L
B_3.raw b b_3 3 L
C_1.raw c c_1 1 L
C_2.raw c c_2 2 L
C_3.raw c c_3 3 L
OK, we got it: the problem is your keys file. Please check the documentation to find out more about it (Content > Input files > keys.txt):
- Condition: The condition names must follow these rules:
  - Use only letters (A - Z, both uppercase and lowercase) and numbers (0 - 9). The only special character allowed is underscore (_).
  - Very important: A condition name cannot begin with a number (R limitation).
- BioReplicate: biological replicate number. It is based on the condition name. Use as prefix the corresponding Condition name, and add as suffix a dash (-) plus the biological replicate number. For example, if condition H1N1_06H has two biological replicates, name them H1N1_06H-1 and H1N1_06H-2.
That is, you are using _ instead of - in the BioReplicate column. Change that (a-1 instead of a_1, etc.) and re-run artmsQuantification.
We definitely need to add a function that checks for this and stops the analysis if the naming rules are not followed. We'll do it in the next version of artMS.
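Until such a check exists in artMS, the naming rules quoted above can be tested by hand. The following is only a rough sketch in base R: `check_keys` is a hypothetical helper, not an artMS function, and it assumes the keys file has already been read into a data frame with `Condition` and `BioReplicate` columns.

```r
# Hypothetical helper (NOT part of artMS): returns a character vector of
# problems found in a keys data frame; an empty vector means no rule was broken.
check_keys <- function(keys) {
  condition <- as.character(keys$Condition)
  biorep <- as.character(keys$BioReplicate)
  problems <- character(0)

  # Condition: only letters, digits and underscore, and no leading digit.
  bad_condition <- !grepl("^[A-Za-z0-9_]+$", condition) | grepl("^[0-9]", condition)
  if (any(bad_condition)) {
    problems <- c(problems,
                  paste("Invalid Condition:", unique(condition[bad_condition])))
  }

  # BioReplicate: "<Condition>-<number>", for example H1N1_06H-1.
  # startsWith() compares literally, so the condition name is never
  # interpreted as a regular expression.
  prefix <- paste0(condition, "-")
  suffix <- substring(biorep, nchar(prefix) + 1)
  bad_biorep <- !(startsWith(biorep, prefix) & grepl("^[0-9]+$", suffix))
  if (any(bad_biorep)) {
    problems <- c(problems,
                  paste("Invalid BioReplicate:", unique(biorep[bad_biorep])))
  }

  problems
}

# Two rows taken from the keys file posted above, one already corrected
keys <- data.frame(
  Raw.file = c("A1.raw", "A2.raw"),
  Condition = c("a", "a"),
  BioReplicate = c("a_1", "a-2"),
  stringsAsFactors = FALSE
)
check_keys(keys)
```

Here only `a_1` is reported, because it uses an underscore where the rule asks for a dash.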
Thanks
I replaced a_1 with a-1, but I got the same error.
By the way, I ran the same file with MSstats directly and it finished without any errors.
OK, I forgot to mention: please number the "Run" column from 1 to 9, so that each raw file has its own run number, and try again.
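As a minimal sketch in base R, the renumbering could look like this for the keys table posted above (with the BioReplicate names already corrected). The variable names are illustrative. Repeated Run values (1, 2, 3 for every condition) are a plausible source of the "duplicate key values" in the join error, but the thread does not confirm the exact mechanism.

```r
# Keys table from this thread, with BioReplicate already using dashes
keys <- data.frame(
  Raw.file = c("A1.raw", "A2.raw", "A3.raw",
               "B_1.raw", "B_2.raw", "B_3.raw",
               "C_1.raw", "C_2.raw", "C_3.raw"),
  Condition = rep(c("a", "b", "c"), each = 3),
  BioReplicate = paste0(rep(c("a", "b", "c"), each = 3), "-", rep(1:3, times = 3)),
  Run = rep(1:3, times = 3),
  IsotopeLabelType = "L",
  stringsAsFactors = FALSE
)

# anyDuplicated() returns 0 when every value is unique,
# otherwise the index of the first repeated value.
run_was_duplicated <- anyDuplicated(keys$Run) > 0

# Give every raw file its own run number: 1, 2, ..., 9
keys$Run <- seq_len(nrow(keys))
```

After this step `anyDuplicated(keys$Run)` is 0, and the table can be written back to the keys file before re-running artmsQuantification.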
It finished! Thank you!
Hi,
I want to analyze methylation in my data, so I set a user-defined PTM
in my config file. I wrote:
data:
  enabled: 1
  silac:
    enabled: 0
  filters:
    enabled: 1
    contaminants: 1
    protein_groups: remove
    modifications: PTM:KR:methyl
But I got this error:
Error in .artms_filterData(x = x, config = config, verbose = verbose) :
The config > data > filters > modification PTM:KR:METHYL is not valid option
Glad to hear that the issue was solved.
With respect to the other question, could you please start a new github issue?
Thanks for your patience helping me!
Hi David,
Sorry for reopening this issue. I ran my data using the example time-course experiment template in the MSstats manual and it ran without any warnings. I don't know whether the problem is that my data is a time-course experiment, with the same sample measured at two different times, and that this causes artMS to fail.
I don't know how to translate the annotation file required by MSstats into a keys file for artMS, but I have attached the file here in case you want to have a look.
Thanks
Juan
annotation2.txt
Hi Juan,
It looks like you have 6 different conditions (Time1_N, Time1_P, etc.), with 15 bioreplicates each (Sample_10N, Sample_11N, etc.). Is this correct?
If that is the case, you are not following the naming rules explained above and in the documentation.
This would be very easy to solve: you should call your bioreplicates Time1_N-1, Time1_N-2, Time1_N-3, ..., Time1_N-15, etc.
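One way to do that renaming without editing every row by hand is sketched below in base R. The data frame is a toy stand-in, not the attached annotation file: only the names Time1_N, Time1_P, Sample_10N and Sample_11N come from this thread, and the other sample names are made up for illustration. It also assumes one row per biological replicate within each condition; with technical replicates, rows from the same sample would need to share a number.

```r
# Toy stand-in for the keys table: two conditions, three samples each
keys <- data.frame(
  Condition = c("Time1_N", "Time1_N", "Time1_N",
                "Time1_P", "Time1_P", "Time1_P"),
  BioReplicate = c("Sample_10N", "Sample_11N", "Sample_12N",
                   "Sample_10P", "Sample_11P", "Sample_12P"),
  stringsAsFactors = FALSE
)

# Number the rows within each condition: 1, 2, 3, ... per group
index_in_condition <- ave(seq_along(keys$Condition), keys$Condition,
                          FUN = seq_along)

# Build "<Condition>-<number>" names, e.g. Time1_N-1
keys$BioReplicate <- paste0(keys$Condition, "-", index_in_condition)
keys
```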