AdmiralenOla / Scoary

Pan-genome wide association studies

genetic differences among populations defined by population analysis

kopelol opened this issue · comments

Hi,
I'd like to explore population-specific genes.
I defined 6 populations using a population analysis method.

Scoary takes population structure into account, but in my case the dataset has already been divided into 6 populations by that analysis.
So should I skip this step? (If so, could you please tell me how?) Or should I specify the newick file I created from the core-gene alignment obtained from Roary?

Best regards,

In addition, I tried using "--no_pairwise", but the results were the same as before using this option.

Hi @kopelol .

In this case you should just use the 6 populations you already defined in your previous analysis. Note that you will have to create 6 individual phenotype variables, one per population, and encode membership as binary values (1 if the strain belongs to that population, 0 otherwise). Example:

,Pop1,Pop2,Pop3,Pop4,Pop5,Pop6
Strain1,1,0,0,0,0,0
Strain2,0,0,0,1,0,0
etc.
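
If the population assignments are already available as a strain-to-population mapping, a table like the one above could be generated with a short script. This is only a minimal sketch: the function name `build_traits_csv`, the `assignments` dictionary, and the two strain entries are hypothetical placeholders, not part of Scoary itself.

```python
import csv
import io


def build_traits_csv(strain_to_pop, populations):
    """Return CSV text with one binary membership column per population.

    strain_to_pop: dict mapping strain name -> population label (hypothetical input)
    populations:   ordered list of population labels used as column headers
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Header row: empty first cell, then one column per population.
    writer.writerow([""] + list(populations))
    for strain, pop in strain_to_pop.items():
        # 1 if the strain belongs to that population, 0 otherwise.
        writer.writerow([strain] + [1 if pop == p else 0 for p in populations])
    return buf.getvalue()


populations = [f"Pop{i}" for i in range(1, 7)]
# Hypothetical assignments; replace with the output of your population analysis.
assignments = {"Strain1": "Pop1", "Strain2": "Pop4"}

traits_text = build_traits_csv(assignments, populations)
print(traits_text)
```

The resulting text can be saved to a file and used as the traits file for the run.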

Using "--no_pairwise" should make your analysis run quicker, since you are not performing any pairwise comparisons; that is, you are not correcting for population structure at all. In these analyses it doesn't actually matter whether you use a pre-defined newick tree or let this be handled internally by Scoary.