biodavidjm / artMS

Analytical R Tools for Mass Spectrometry

Error in dataProcess(raw = dmss, normalization = FALSE, fillIncompleteRows = TRUE

jfertaj opened this issue

Hi,

I am trying to run artMS on 122 samples, but I get an error at the relative quantification step:

*** Subject : G2-14, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-14, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-15, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-15, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-16, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
Error in dataProcess(raw = dmss, normalization = FALSE, fillIncompleteRows = TRUE,  : 
  Please remove duplicate rows in the list above.
In addition: Warning messages:
1: ggrepel: 47 unlabeled data points (too many overlaps). Consider increasing max.overlaps 
2: In RColorBrewer::brewer.pal(n, pal) :
  n too large, allowed maximum for palette Set1 is 9
Returning the palette you asked for with that many colors
3: In RColorBrewer::brewer.pal(n, pal) :
  n too large, allowed maximum for palette Set1 is 9
Returning the palette you asked for with that many colors

I don't know which file I should remove the duplicate rows from.

My keys.txt is the following:

Raw.file	IsotopeLabelType	Condition	BioReplicate	Run
1-19-2021_1_09-079_G1_601	L	G0	G0-1	1
1-19-2021_1R_09-079_G1_605	L	G0	G0-1	2
1-29-2021_33_13-038_G0_881	L	G0	G0-2	3
1-29-2021_33R_13-038_G0_885	L	G0	G0-2	4
1-30-2021_37_13-068_G0_913	L	G0	G0-3	5
1-31-2021_37R_13-068_G0_917	L	G0	G0-3	6
2-4-2021_51_16-042_G0_1045	L	G0	G0-4	7
2-5-2021_51R_16-042_G0_1049	L	G0	G0-4	8
1-21-2021_9_10-072_G0_666	L	G0	G0-5	9
1-21-2021_9R_10-072_G0_670	L	G0	G0-5	10
1-23-2021_13_10-098_G1_708	L	G1	G1-1	11
1-23-2021_13R_10-098_G1_712	L	G1	G1-1	12
1-23-2021_15_10-106_G1_724	L	G1	G1-2	13
1-23-2021_15R_10-106_G1_728	L	G1	G1-2	14
1-24-2021_17_10-123_G1_740	L	G1	G1-3	15
1-24-2021_17R_10-123_G1_744	L	G1	G1-3	16
1-24-2021_18_10-125_G1_748	L	G1	G1-4	17
1-25-2021_18R_10-125_G1_753	L	G1	G1-4	18
1-25-2021_19_10-133_G1_757	L	G1	G1-5	19
1-25-2021_19R_10-133_G1_761	L	G1	G1-5	20
1-26-2021_20_10-141_G1_765	L	G1	G1-6	21
1-26-2021_20R_B1_10-141_G1_770	L	G1	G1-6	22
1-26-2021_22_11-003_G1_783	L	G1	G1-7	23
1-26-2021_22R_11-003_G1_787	L	G1	G1-7	24
1-27-2021_26_11-027_G1_815	L	G1	G1-8	25
1-27-2021_26R_11-027_G1_819	L	G1	G1-8	26
1-28-2021_28_11-057_G1_831	L	G1	G1-9	27
1-28-2021_28R_11-057_G1_835	L	G1	G1-9	28
1-28-2021_29_12-003_G1_839	L	G1	G1-10	29
1-28-2021_29R_12-003_G1_843	L	G1	G1-10	30
1-31-2021_38_14-024_G1_921	L	G1	G1-11	31
1-31-2021_38R_14-024_G1_925	L	G1	G1-11	32
1-31-2021_39_14-046_G1_929	L	G1	G1-12	33
1-31-2021_39R_14-046_G1_933	L	G1	G1-12	34
1-20-2021_4_10-046_G1_625	L	G1	G1-13	35
2-1-2021_41_14-065_G1_947	L	G1	G1-14	36
2-1-2021_41R_14-065_G1_951	L	G1	G1-14	37
2-1-2021_42_14-081_G1_955	L	G1	G1-15	38
2-1-2021_42R_14-081_G1_959	L	G1	G1-15	39
2-1-2021_44_15-003_G1_971	L	G1	G1-16	40
2-1-2021_44R_15-003_G1_975	L	G1	G1-16	41
2-2-2021_45_15-008_G1_979	L	G1	G1-17	42
2-2-2021_45R_15-008_G1_983	L	G1	G1-17	43
2-3-2021_46_15-010_G1_1005	L	G1	G1-18	44
2-3-2021_46R_15-010_G1_1009	L	G1	G1-18	45
2-4-2021_48_15-025_G1_1021	L	G1	G1-19	46
2-4-2021_48R_15-025_G1_1025	L	G1	G1-19	47
1-20-2021_4R_10-046_G1_629	L	G1	G1-13	48
1-20-2021_5_10-048_G1_633	L	G1	G1-20	49
2-4-2021_50_16-029_G1_1037	L	G1	G1-21	50
2-4-2021_50R_16-029_G1_1041	L	G1	G1-21	51
1-20-2021_5R_10-048_G1_637	L	G1	G1-20	52
1-20-2021_6_10-052_G1_641	L	G1	G1-22	53
1-20-2021_6R_10-052_G1_645	L	G1	G1-22	54
1-21-2021_7_10-058_G1_649	L	G1	G1-23	55
1-21-2021_7R_10-058_G1_653	L	G1	G1-23	56
1-21-2021_8_10-061_G1_657	L	G1	G1-24	57
1-21-2021_8R_10-061_G1_661	L	G1	G1-24	58
2-6-2021_57_2012-019_G1.5_1093	L	G1	G1-25	59
2-6-2021_57R_2012-019_G1.5_1097	L	G1	G1-25	60
1-21-2021_10_10-079_G2_674	L	G2	G2-1	61
1-22-2021_10R_10-079_G2_678	L	G2	G2-1	62
1-23-2021_16_10-113_G2_732	L	G2	G2-2	63
1-23-2021_16R_10-113_G2_736	L	G2	G2-2	64
1-19-2021_2_09-088_G2_609	L	G2	G2-3	65
1-26-2021_21_10-144_G2_775	L	G2	G2-4	66
1-26-2021_21R_10-144_G2_779	L	G2	G2-4	67
1-26-2021_23_11-008_G2_791	L	G2	G2-5	68
1-27-2021_23R_11-008_G2_795	L	G2	G2-5	69
1-27-2021_24_11-009_G2_799	L	G2	G2-6	70
1-27-2021_24R_11-009_G2_803	L	G2	G2-6	71
1-27-2021_25_11-019_20_G2_807	L	G2	G2-7	72
1-27-2021_25R_11-019_20_G2_811	L	G2	G2-7	73
1-19-2021_2R_09-088_G2_613	L	G2	G2-3	74
1-20-2021_3_10-042_G2_617	L	G2	G2-4	75
1-28-2021_30_12-005_G2_847	L	G2	G2-5	76
1-28-2021_30R_12-005_G2_851	L	G2	G2-5	77
1-29-2021_31_12-007_G2_865	L	G2	G2-6	78
1-29-2021_31R_12-007_G2_869	L	G2	G2-6	79
1-29-2021_32_13-035_G2_873	L	G2	G2-7	80
1-29-2021_32R_13-035_G2_877	L	G2	G2-7	81
1-30-2021_34_13-051_G2_889	L	G2	G2-8	82
1-30-2021_34R_13-051_G2_893	L	G2	G2-8	83
1-30-2021_35_13-055_G2_897	L	G2	G2-9	84
1-30-2021_35R_13-055_G2_901	L	G2	G2-9	85
1-30-2021_36_13-060_G2_905	L	G2	G2-10	86
1-30-2021_36R_13-060_G2_909	L	G2	G2-10	87
1-20-2021_3R_10-042_G2_621	L	G2	G2-4	88
2-4-2021_47R_15-017_G2_1017	L	G2	G2-11	89
2-3-2021_47_15-017_G2_1013	L	G2	G2-11	90
2-4-2021_49_15-027_G2_1029	L	G2	G2-12	91
2-4-2021_49R_15-027_G2_1033	L	G2	G2-12	92
2-5-2021_52_16-044_G2_1053	L	G2	G2-13	93
2-5-2021_52R_16-044_G2_1057	L	G2	G2-13	94
2-5-2021_53_2011-042_G2_1061	L	G2	G2-14	95
2-5-2021_53R_2011-042_G2_1065	L	G2	G2-14	96
2-6-2021_55_2011-068_G2_1077	L	G2	G2-15	97
2-6-2021_55R_2011-068_G2_1081	L	G2	G2-15	98
2-7-2021_61_2012-066_G2_1125	L	G2	G2-16	99
2-7-2021_61R_2012-066_G2_1129	L	G2	G2-16	100
1-22-2021_11_10-086_G3_692	L	G3	G3-1	101
1-22-2021_11R_10-086_G3_696	L	G3	G3-1	102
2-15-2021_14_10-103_G3_1155	L	G3	G3-2	103
2-15-2021_14R_10-103_G3_1159	L	G3	G3-2	104
1-31-2021_40_14-047_G3_937	L	G3	G3-3	105
1-31-2021_40R_14-047_G3_941	L	G3	G3-3	106
2-1-2021_43_15-002_G3_963	L	G3	G3-4	107
2-1-2021_43R_15-002_G3_967	L	G3	G3-4	108
2-6-2021_56_2012-014_G3_1085	L	G3	G3-5	109
2-6-2021_56R_2012-014_G3_1089	L	G3	G3-5	110
2-7-2021_59_2012-029_G3_1109	L	G3	G3-6	111
2-7-2021_59R_2012-029_G3_1113	L	G3	G3-6	112
2-7-2021_60_2012-034_G3_1117	L	G3	G3-7	113
2-7-2021_60R_2012-034_G3_1121	L	G3	G3-7	114
1-22-2021_12_10-094_G4_700	L	G4	G4-1	115
1-22-2021_12R_10-094_G4_704	L	G4	G4-1	116
1-27-2021_27_11-049_G4_823	L	G4	G4-2	117
1-28-2021_27R_11-049_G4_827	L	G4	G4-2	118
2-5-2021_54_2011-048_G4_1069	L	G4	G4-3	119
2-5-2021_54R_2011-048_G4_1073	L	G4	G4-3	120
2-6-2021_58_2012-021_G4_1101	L	G4	G4-4	121
2-7-2021_58R_2012-021_G4_1105	L	G4	G4-5	122

Any help is much appreciated
Thanks

Juan

Hi Juan,

Assuming that you posted the full list of duplicated rows, i.e.

*** Subject : G2-14, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-14, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-15, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-15, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-16, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)

I would do the following test: remove those biological replicates (G2-14, G2-15, and G2-16) from the keys.txt file and try again. That is, re-run it with this new keys.txt file:

Raw.file	IsotopeLabelType	Condition	BioReplicate	Run
1-19-2021_1_09-079_G1_601	L	G0	G0-1	1
1-19-2021_1R_09-079_G1_605	L	G0	G0-1	2
1-29-2021_33_13-038_G0_881	L	G0	G0-2	3
1-29-2021_33R_13-038_G0_885	L	G0	G0-2	4
1-30-2021_37_13-068_G0_913	L	G0	G0-3	5
1-31-2021_37R_13-068_G0_917	L	G0	G0-3	6
2-4-2021_51_16-042_G0_1045	L	G0	G0-4	7
2-5-2021_51R_16-042_G0_1049	L	G0	G0-4	8
1-21-2021_9_10-072_G0_666	L	G0	G0-5	9
1-21-2021_9R_10-072_G0_670	L	G0	G0-5	10
1-23-2021_13_10-098_G1_708	L	G1	G1-1	11
1-23-2021_13R_10-098_G1_712	L	G1	G1-1	12
1-23-2021_15_10-106_G1_724	L	G1	G1-2	13
1-23-2021_15R_10-106_G1_728	L	G1	G1-2	14
1-24-2021_17_10-123_G1_740	L	G1	G1-3	15
1-24-2021_17R_10-123_G1_744	L	G1	G1-3	16
1-24-2021_18_10-125_G1_748	L	G1	G1-4	17
1-25-2021_18R_10-125_G1_753	L	G1	G1-4	18
1-25-2021_19_10-133_G1_757	L	G1	G1-5	19
1-25-2021_19R_10-133_G1_761	L	G1	G1-5	20
1-26-2021_20_10-141_G1_765	L	G1	G1-6	21
1-26-2021_20R_B1_10-141_G1_770	L	G1	G1-6	22
1-26-2021_22_11-003_G1_783	L	G1	G1-7	23
1-26-2021_22R_11-003_G1_787	L	G1	G1-7	24
1-27-2021_26_11-027_G1_815	L	G1	G1-8	25
1-27-2021_26R_11-027_G1_819	L	G1	G1-8	26
1-28-2021_28_11-057_G1_831	L	G1	G1-9	27
1-28-2021_28R_11-057_G1_835	L	G1	G1-9	28
1-28-2021_29_12-003_G1_839	L	G1	G1-10	29
1-28-2021_29R_12-003_G1_843	L	G1	G1-10	30
1-31-2021_38_14-024_G1_921	L	G1	G1-11	31
1-31-2021_38R_14-024_G1_925	L	G1	G1-11	32
1-31-2021_39_14-046_G1_929	L	G1	G1-12	33
1-31-2021_39R_14-046_G1_933	L	G1	G1-12	34
1-20-2021_4_10-046_G1_625	L	G1	G1-13	35
2-1-2021_41_14-065_G1_947	L	G1	G1-14	36
2-1-2021_41R_14-065_G1_951	L	G1	G1-14	37
2-1-2021_42_14-081_G1_955	L	G1	G1-15	38
2-1-2021_42R_14-081_G1_959	L	G1	G1-15	39
2-1-2021_44_15-003_G1_971	L	G1	G1-16	40
2-1-2021_44R_15-003_G1_975	L	G1	G1-16	41
2-2-2021_45_15-008_G1_979	L	G1	G1-17	42
2-2-2021_45R_15-008_G1_983	L	G1	G1-17	43
2-3-2021_46_15-010_G1_1005	L	G1	G1-18	44
2-3-2021_46R_15-010_G1_1009	L	G1	G1-18	45
2-4-2021_48_15-025_G1_1021	L	G1	G1-19	46
2-4-2021_48R_15-025_G1_1025	L	G1	G1-19	47
1-20-2021_4R_10-046_G1_629	L	G1	G1-13	48
1-20-2021_5_10-048_G1_633	L	G1	G1-20	49
2-4-2021_50_16-029_G1_1037	L	G1	G1-21	50
2-4-2021_50R_16-029_G1_1041	L	G1	G1-21	51
1-20-2021_5R_10-048_G1_637	L	G1	G1-20	52
1-20-2021_6_10-052_G1_641	L	G1	G1-22	53
1-20-2021_6R_10-052_G1_645	L	G1	G1-22	54
1-21-2021_7_10-058_G1_649	L	G1	G1-23	55
1-21-2021_7R_10-058_G1_653	L	G1	G1-23	56
1-21-2021_8_10-061_G1_657	L	G1	G1-24	57
1-21-2021_8R_10-061_G1_661	L	G1	G1-24	58
2-6-2021_57_2012-019_G1.5_1093	L	G1	G1-25	59
2-6-2021_57R_2012-019_G1.5_1097	L	G1	G1-25	60
1-21-2021_10_10-079_G2_674	L	G2	G2-1	61
1-22-2021_10R_10-079_G2_678	L	G2	G2-1	62
1-23-2021_16_10-113_G2_732	L	G2	G2-2	63
1-23-2021_16R_10-113_G2_736	L	G2	G2-2	64
1-19-2021_2_09-088_G2_609	L	G2	G2-3	65
1-26-2021_21_10-144_G2_775	L	G2	G2-4	66
1-26-2021_21R_10-144_G2_779	L	G2	G2-4	67
1-26-2021_23_11-008_G2_791	L	G2	G2-5	68
1-27-2021_23R_11-008_G2_795	L	G2	G2-5	69
1-27-2021_24_11-009_G2_799	L	G2	G2-6	70
1-27-2021_24R_11-009_G2_803	L	G2	G2-6	71
1-27-2021_25_11-019_20_G2_807	L	G2	G2-7	72
1-27-2021_25R_11-019_20_G2_811	L	G2	G2-7	73
1-19-2021_2R_09-088_G2_613	L	G2	G2-3	74
1-20-2021_3_10-042_G2_617	L	G2	G2-4	75
1-28-2021_30_12-005_G2_847	L	G2	G2-5	76
1-28-2021_30R_12-005_G2_851	L	G2	G2-5	77
1-29-2021_31_12-007_G2_865	L	G2	G2-6	78
1-29-2021_31R_12-007_G2_869	L	G2	G2-6	79
1-29-2021_32_13-035_G2_873	L	G2	G2-7	80
1-29-2021_32R_13-035_G2_877	L	G2	G2-7	81
1-30-2021_34_13-051_G2_889	L	G2	G2-8	82
1-30-2021_34R_13-051_G2_893	L	G2	G2-8	83
1-30-2021_35_13-055_G2_897	L	G2	G2-9	84
1-30-2021_35R_13-055_G2_901	L	G2	G2-9	85
1-30-2021_36_13-060_G2_905	L	G2	G2-10	86
1-30-2021_36R_13-060_G2_909	L	G2	G2-10	87
1-20-2021_3R_10-042_G2_621	L	G2	G2-4	88
2-4-2021_47R_15-017_G2_1017	L	G2	G2-11	89
2-3-2021_47_15-017_G2_1013	L	G2	G2-11	90
2-4-2021_49_15-027_G2_1029	L	G2	G2-12	91
2-4-2021_49R_15-027_G2_1033	L	G2	G2-12	92
2-5-2021_52_16-044_G2_1053	L	G2	G2-13	93
2-5-2021_52R_16-044_G2_1057	L	G2	G2-13	94
1-22-2021_11_10-086_G3_692	L	G3	G3-1	101
1-22-2021_11R_10-086_G3_696	L	G3	G3-1	102
2-15-2021_14_10-103_G3_1155	L	G3	G3-2	103
2-15-2021_14R_10-103_G3_1159	L	G3	G3-2	104
1-31-2021_40_14-047_G3_937	L	G3	G3-3	105
1-31-2021_40R_14-047_G3_941	L	G3	G3-3	106
2-1-2021_43_15-002_G3_963	L	G3	G3-4	107
2-1-2021_43R_15-002_G3_967	L	G3	G3-4	108
2-6-2021_56_2012-014_G3_1085	L	G3	G3-5	109
2-6-2021_56R_2012-014_G3_1089	L	G3	G3-5	110
2-7-2021_59_2012-029_G3_1109	L	G3	G3-6	111
2-7-2021_59R_2012-029_G3_1113	L	G3	G3-6	112
2-7-2021_60_2012-034_G3_1117	L	G3	G3-7	113
2-7-2021_60R_2012-034_G3_1121	L	G3	G3-7	114
1-22-2021_12_10-094_G4_700	L	G4	G4-1	115
1-22-2021_12R_10-094_G4_704	L	G4	G4-1	116
1-27-2021_27_11-049_G4_823	L	G4	G4-2	117
1-28-2021_27R_11-049_G4_827	L	G4	G4-2	118
2-5-2021_54_2011-048_G4_1069	L	G4	G4-3	119
2-5-2021_54R_2011-048_G4_1073	L	G4	G4-3	120
2-6-2021_58_2012-021_G4_1101	L	G4	G4-4	121
2-7-2021_58R_2012-021_G4_1105	L	G4	G4-5	122
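
For a keys file this long, dropping the replicates by hand is easy to get wrong, so the same filter can be sketched in base R. This is only an illustration: `drop_bioreplicates` is a hypothetical helper, not an artMS function, and the toy table stands in for the real keys.txt (the column names are the ones shown in the keys file above).

```r
# Hypothetical helper (not part of artMS): return the keys table without
# the rows whose BioReplicate is listed in `drop`.
drop_bioreplicates <- function(keys, drop) {
  stopifnot("BioReplicate" %in% names(keys))
  keys[!(keys$BioReplicate %in% drop), , drop = FALSE]
}

# Toy stand-in for keys.txt, with made-up raw file names
toy_keys <- data.frame(
  Raw.file = c("run_a", "run_b", "run_c", "run_d"),
  IsotopeLabelType = "L",
  Condition = c("G2", "G2", "G2", "G3"),
  BioReplicate = c("G2-13", "G2-14", "G2-15", "G3-1"),
  Run = 1:4,
  stringsAsFactors = FALSE
)

reduced <- drop_bioreplicates(toy_keys, c("G2-14", "G2-15", "G2-16"))
print(reduced)

# With a real file (paths are assumptions), something like:
# keys <- read.delim("keys.txt", stringsAsFactors = FALSE)
# write.table(drop_bioreplicates(keys, c("G2-14", "G2-15", "G2-16")),
#             "keys_reduced.txt", sep = "\t", quote = FALSE, row.names = FALSE)
```

Writing the result to a new file keeps the original keys.txt untouched for comparison.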

Please let us know how that goes. If it fails, please post the full output from artMS when running the command and we'll keep debugging the issue.

Thanks!

Hi David,

Actually, I didn't paste the whole output, only part of it. The full output reports that all my samples have duplicate rows.

Here is the full output:

> artmsQuantification(
+   yaml_config_file = '~/Dropbox/INIBIC/Patri/patri.yaml',
+   display_msstats = TRUE)
---------------------------------------------------
artMS: BASIC QUALITY CONTROL (evidence.txt based)
---------------------------------------------------
>> MERGING FILES 
-- Plot: correlation matrices
---- by Technical replicates 
---- by Biological replicates 
---- by Conditions 
-- Plot: intensity stats
---- AB PROCESSED 
<< Basic quality control analysis completed!
---------------------------------------------
artMS: EXTENDED QUALITY CONTROL (-evidence.txt based)
---------------------------------------------
>> MERGING FILES 
>> GENERATING QC PLOTS 
--- Plot PSM done 
--- Plot IONS done 
--- Plot TYPE done 
--- Plot PEPTIDES done 
--- Plot PEPTIDE OVERLAP done 
--- Plot PROTEINS done 
--- Plot PROTEIN OVERLAP done 
--- Plot Plot Ion Oversampling done 
--- Plot Charge State done 
--- Plot Mass Error done 
--- Plot Mass-over-Charge distribution done 
--- Plot Peptide Intensity CV done 
--- Plot Peptide Detection (using modified.sequence) done 
--- Plot Protein Intensity CV done 
--- Plot Protein Detection done 
--- Plot ID overlap done 
--- Plot PCA and Inter-Correlation (WARNING: it might take a long time. Please, be patient)
	(-) Skip peptide-based correlation matrix (too many samples)
	(-) Skip Protein-based correlation matrix (too many samples)
--- Plot Sample Preparation... done
>> QC extended completed
-- No summary-based QC selected
--------------------------------------------
artMS: Relative Quantification using MSstats
--------------------------------------------
>> Reading the configuration file
>> LOADING DATA 
>> MERGING FILES 
>> FILTERING 
-- Contaminants CON__|REV__ removed
-- Removing protein groups
-- Use <Leading.razor.protein> as Protein ID
-- PROCESSING AB
>> CONVERTING THE DATA TO MSSTATS FORMAT 
-- Selecting Sequence Type: MaxQuant 'Modified.sequence' column
-- Adding NA values for missing values (required by MSstats) 
-- Write out the MSstats input file (-mss.txt) 
>> RUNNING MSstats (it usually takes a 'long' time: please, be patient)
-- QC PLOT: before
*** Subject : G0-1, Condition : G0 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G0-5, Condition : G0 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-16, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-1, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-1, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-2, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-2, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-3, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-3, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-4, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-4, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-5, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-1, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-5, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-6, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-6, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-7, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G3-7, Condition : G3 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G4-1, Condition : G4 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G4-1, Condition : G4 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G4-2, Condition : G4 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G4-2, Condition : G4 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G4-3, Condition : G4 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-1, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G4-3, Condition : G4 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G4-4, Condition : G4 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G4-5, Condition : G4 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-2, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-2, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-3, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-3, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-4, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-4, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-5, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G0-1, Condition : G0 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-5, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-6, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-6, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-7, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-7, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-8, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-8, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-9, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-9, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-10, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G0-2, Condition : G0 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-10, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-11, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-11, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-12, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-12, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-13, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-14, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-14, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-15, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-15, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G0-2, Condition : G0 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-16, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-16, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-17, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-17, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-18, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-18, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-19, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-19, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-13, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-20, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G0-3, Condition : G0 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-21, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-21, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-20, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-22, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-22, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-23, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-23, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-24, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-24, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-25, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G0-3, Condition : G0 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G1-25, Condition : G1 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-1, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-1, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-2, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-2, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-3, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-4, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-4, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-5, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-5, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G0-4, Condition : G0 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-6, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-6, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-7, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-7, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-3, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-4, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-5, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-5, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-6, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-6, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G0-4, Condition : G0 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-7, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-7, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-8, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-8, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-9, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-9, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-10, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-10, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-4, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-11, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G0-5, Condition : G0 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-11, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-12, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-12, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-13, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-13, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-14, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-14, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-15, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-15, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
*** Subject : G2-16, Condition : G2 has multiple rows (duplicate rows) for some features (_1_NA_NA, _2_NA_NA, _3_NA_NA, _4_NA_NA)
Error in dataProcess(raw = dmss, normalization = FALSE, fillIncompleteRows = TRUE,  : 
  Please remove duplicate rows in the list above.
In addition: Warning messages:
1: ggrepel: 47 unlabeled data points (too many overlaps). Consider increasing max.overlaps 
2: In RColorBrewer::brewer.pal(n, pal) :
  n too large, allowed maximum for palette Set1 is 9
Returning the palette you asked for with that many colors

3: In RColorBrewer::brewer.pal(n, pal) :
  n too large, allowed maximum for palette Set1 is 9
Returning the palette you asked for with that many colors

Thanks a lot
Juan
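
One way to see which rows the message refers to is to inspect the MSstats input file that artMS writes out (the `-mss.txt` file mentioned in the log). The sketch below is a rough diagnostic, not an artMS or MSstats function. It assumes the usual MSstats input column names and assumes the feature label is the peptide sequence, precursor charge, fragment ion, and product charge joined by underscores, which would match the `_1_NA_NA` pattern in the log.

```r
# Hypothetical diagnostic: rebuild a feature label from MSstats-style
# columns and list every feature/replicate/run combination that occurs
# more than once.
find_duplicate_features <- function(mss) {
  needed <- c("PeptideSequence", "PrecursorCharge", "FragmentIon",
              "ProductCharge", "BioReplicate", "Run")
  stopifnot(all(needed %in% names(mss)))
  mss$Feature <- paste(mss$PeptideSequence, mss$PrecursorCharge,
                       mss$FragmentIon, mss$ProductCharge, sep = "_")
  # Count rows per combination by summing a column of ones
  counts <- aggregate(n ~ Feature + BioReplicate + Run,
                      data = transform(mss, n = 1L), FUN = sum)
  counts[counts$n > 1, , drop = FALSE]
}

# Toy table: two rows with an empty peptide sequence collide in one run
toy_mss <- data.frame(
  ProteinName = c("", "", "P12345"),
  PeptideSequence = c("", "", "PEPTIDEK"),
  PrecursorCharge = c(1L, 1L, 2L),
  FragmentIon = NA,
  ProductCharge = NA,
  BioReplicate = c("G2-14", "G2-14", "G2-14"),
  Run = c(95L, 95L, 95L),
  Intensity = c(100, 200, 300),
  stringsAsFactors = FALSE
)

dups <- find_duplicate_features(toy_mss)
print(dups)
```

In the toy table the two rows with an empty peptide sequence share the label `_1_NA_NA`, so they are reported as one duplicated feature for that run.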

Thanks Juan

Without seeing the evidence file, I suspect the problem is the empty Protein IDs that the evidence file sometimes contains. Why don't you delete them: make a copy of your evidence file, delete the rows with empty IDs from the copy, and re-run with this new evidence file.
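
A minimal sketch of that clean-up in base R, under stated assumptions: `drop_empty_protein_ids` is a hypothetical helper, not an artMS function, and the column name `Leading razor protein` is a guess based on the log line saying that column is used as the Protein ID. Check the header of your own evidence file before relying on it.

```r
# Hypothetical helper (not part of artMS): drop rows whose protein ID is
# missing or blank. The default column name is an assumption.
drop_empty_protein_ids <- function(evidence, id_col = "Leading razor protein") {
  stopifnot(id_col %in% names(evidence))
  ids <- evidence[[id_col]]
  keep <- !is.na(ids) & nzchar(trimws(as.character(ids)))
  evidence[keep, , drop = FALSE]
}

# Toy stand-in for an evidence table
toy <- data.frame(
  "Leading razor protein" = c("P12345", "", NA, "  ", "Q67890"),
  Intensity = c(10, 20, 30, 40, 50),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

cleaned <- drop_empty_protein_ids(toy)
print(cleaned)

# With a real file (paths are assumptions), something like:
# evidence <- read.delim("evidence.txt", check.names = FALSE,
#                        stringsAsFactors = FALSE)
# write.table(drop_empty_protein_ids(evidence), "evidence_noempty.txt",
#             sep = "\t", quote = FALSE, row.names = FALSE)
```

Reading with `check.names = FALSE` keeps the original column headers, so the copy written out has the same header line as the input.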

Please let us know how it goes.

Thanks, that was the problem. I removed the empty protein IDs and it worked like a charm!

Excellent! We'll add an extra checkpoint to remove them. Thanks!