error
darwinbandoy opened this issue · comments
Hi,
I keep getting this error message when I run Scoary on my Roary output:
CRITICAL:
Traceback (most recent call last):
File "/miniconda3/envs/scoary/lib/python3.6/site-packages/scoary/methods.py", line 268, in main
strains)
File "miniconda3/envs/scoary/lib/python3.6/site-packages/scoary/methods.py", line 568, in Csv_to_dic
sys.exit("Make sure the top-left cell in the traits file "
SystemExit: Make sure the top-left cell in the traits file is either empty or 'Name'. Do not include empty rows
Make sure the top-left cell in the traits file is either empty or 'Name'. Do not include empty rows
This is my traits CSV file, and I can't see anything wrong with it:
Name,Abortive,Non_Abortive
B197.11581,0,1
B197.7887,0,1
B197.7889,0,1
B197.789,0,1
B197.7927,0,1
What could be the issue? Thanks
The file may include <U+FEFF> before Name. Don't generate the CSV with Microsoft Office. You can check by viewing the file with the less command.
Hi @darwinbandoy, and apologies for answering this so late. Most probably @sam-mic has the correct answer: you have an invisible U+FEFF byte order mark at the start of your file.
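For anyone hitting the same message, here is a minimal sketch of how the byte order mark could be detected and removed in Python. It assumes the traits file is UTF-8 encoded; the function names (`has_utf8_bom`, `strip_utf8_bom`, `rewrite_without_bom`) are made up for illustration and are not part of Scoary.

```python
import codecs


def has_utf8_bom(raw: bytes) -> bool:
    """Return True if the raw file content starts with the UTF-8 BOM (EF BB BF)."""
    return raw.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(raw: bytes) -> bytes:
    """Return the content without a leading UTF-8 BOM; other bytes are unchanged."""
    if has_utf8_bom(raw):
        return raw[len(codecs.BOM_UTF8):]
    return raw


def rewrite_without_bom(src_path: str, dst_path: str) -> bool:
    """Copy src_path to dst_path without a leading BOM.

    Returns True if a BOM was found in the source file.
    """
    with open(src_path, "rb") as handle:
        raw = handle.read()
    with open(dst_path, "wb") as handle:
        handle.write(strip_utf8_bom(raw))
    return has_utf8_bom(raw)
```

Calling `rewrite_without_bom("traits.csv", "traits_clean.csv")` (both file names hypothetical) writes a cleaned copy and leaves the original untouched, so the two can be compared before rerunning Scoary.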
I removed the <U+FEFF>, thanks.
However, I still get this error message:
CRITICAL:
Traceback (most recent call last):
File "/Users/miniconda3/envs/scoary/lib/python3.6/site-packages/scoary/methods.py", line 268, in main
strains)
File "/Users/miniconda3/envs/scoary/lib/python3.6/site-packages/scoary/methods.py", line 579, in Csv_to_dic
"values (no commas): %s" % str(",".join(allowed_values)))
SystemExit: Unrecognized character found in trait file. Allowed values (no commas): 0,1,NA,.,-, ,
Unrecognized character found in trait file. Allowed values (no commas): 0,1,NA,.,-, ,
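One way to locate the offending cell is to scan the traits file for any value outside the allowed set. The sketch below is only an approximation: the `ALLOWED` set is my reading of the list printed in the error message (0, 1, NA, ".", "-", a single space, and an empty cell), and Scoary's own parsing may differ in details. The function name `find_bad_cells` is hypothetical.

```python
import csv
import io

# Assumed from the error message above; not taken from Scoary's source code.
ALLOWED = {"0", "1", "NA", ".", "-", " ", ""}


def find_bad_cells(traits_text, allowed=ALLOWED):
    """Return (row, column, value) for every trait cell not in `allowed`.

    Rows and columns are 1-based, counted as in a spreadsheet, so the
    header is row 1 and the strain names are column 1.
    """
    bad = []
    reader = csv.reader(io.StringIO(traits_text))
    next(reader, None)  # skip the header row
    for row_no, row in enumerate(reader, start=2):
        # Column 1 holds the strain name, so trait values start at column 2.
        for col_no, value in enumerate(row[1:], start=2):
            if value not in allowed:
                bad.append((row_no, col_no, value))
    return bad
```

Pass it the file content, for example `find_bad_cells(open("traits.csv").read())` with a hypothetical file name. An empty list means every trait cell matched the assumed set.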
@darwinbandoy - Could you e-mail your input files to olbb@fhi.no and I'll have a look?
Thanks, @darwinbandoy
I have had a look at your input files. Here are the two things that let me run Scoary without issues:
- Make sure you are giving Scoary the correct column for where your gene data starts. By default, this is column 15 in Roary files. In your non-standard gene presence/absence file, it is column 2. Therefore, you need to run Scoary with "-s 2".
- There is not a perfect overlap between the strains described in your gene presence/absence file and those in your traits file. Your gene presence/absence file has "BCW_10131", but this isolate is not in your traits file. Conversely, your traits file has "BCW_10196", but this isolate is not in your gene presence/absence file. When I deleted these entries from the respective files, Scoary was able to run without errors.
Please let me know if this fixes the issue for you too.
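Both checks above can be sketched in a few lines of Python for anyone who wants to verify their own files before running Scoary. This is a rough sketch, not Scoary's implementation: it assumes strain names are the header cells of the gene presence/absence file from the start column onward (1-based, matching the value passed to -s) and the first column of the traits file. All function names are hypothetical.

```python
import csv
import io


def gene_file_strains(gene_text, start_col=15):
    """Strain names from the header of a gene presence/absence CSV.

    start_col is 1-based: 15 for standard Roary output, 2 for the
    non-standard file discussed in this thread.
    """
    header = next(csv.reader(io.StringIO(gene_text)))
    return set(header[start_col - 1:])


def traits_strains(traits_text):
    """Strain names from the first column of a traits CSV, skipping the header."""
    rows = list(csv.reader(io.StringIO(traits_text)))
    return {row[0] for row in rows[1:] if row}


def compare_strains(gene_text, traits_text, start_col=15):
    """Return (only_in_gene_file, only_in_traits_file) as sorted lists."""
    genes = gene_file_strains(gene_text, start_col)
    traits = traits_strains(traits_text)
    return sorted(genes - traits), sorted(traits - genes)
```

If both returned lists are empty, the two files describe the same set of strains. If the first list is full of names like "Annotation" or "No. isolates", the start column is probably too low for a standard Roary file.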
Also, instead of gene_presence_absence.csv, I have a mutation_presence_absence.csv, which I am passing as the -g input parameter. Is it okay to use this .csv file? Please help me with this.
Hi @akanksharawat07. Most likely you just have no significant results with a p-value lower than 0.05. Try running with -p 1.0.
Also, in the future consider opening a new issue rather than posting on a closed, unrelated issue :-)