AdmiralenOla / Scoary

Pan-genome wide association studies

gene_presence_absence.csv file from Roary

haruosuz opened this issue

Dear developers,

I wonder whether the input gene_presence_absence.csv file for Scoary should contain binary values (1 and 0) rather than gene IDs?

The example input file (https://raw.githubusercontent.com/AdmiralenOla/Scoary/master/scoary/exampledata/Gene_presence_absence.csv) contains binary values (1 and 0) indicating the presence or absence of each gene in each sample. In that respect it resembles Roary's gene_presence_absence.Rtab file, which also holds binary values, rather than Roary's gene_presence_absence.csv file, which holds gene IDs (https://github.com/haruosuz/mgsa/blob/master/roary/analysis/i95/gene_presence_absence.csv).

Dear @haruosuz,

It doesn't really matter. Scoary accepts either the binary form or the gene ID form. In both cases, an empty cell (""), "-" or "0" is counted as absence, and everything else is counted as presence.
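To make the rule concrete, here is a minimal Python sketch of the presence/absence logic described above. It is not Scoary's actual code; the names `ABSENT_VALUES`, `is_present` and `binarize_row` are hypothetical and exist only for this illustration.

```python
# Illustration only: not taken from the Scoary source.
# Cell values that count as "gene absent", per the rule described above.
ABSENT_VALUES = {"", "-", "0"}


def is_present(cell: str) -> bool:
    """Return True if a cell value counts as gene presence."""
    return cell not in ABSENT_VALUES


def binarize_row(cells):
    """Convert one row of per-sample cell values to 1 (present) / 0 (absent)."""
    return [1 if is_present(cell) else 0 for cell in cells]


# A gene ID and a literal "1" both count as presence.
print(binarize_row(["group_1", "", "-", "0", "1"]))  # [1, 0, 0, 0, 1]
```

Under this rule, a Roary-style CSV with gene IDs and a binary table give the same presence/absence pattern, which is why either form works as input.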