Why is individual 13291 NA07045 in 0.2_low_call_rate_pihat.txt?

Question

usernicai opened this issue a year ago · comments

May I ask about this sentence: "For each pair of 'related' individuals with a pihat > 0.2, we recommend to remove the individual with the lowest call rate". What does "call rate" mean here, and why do I need to run "plink --bfile HapMap_3_r3_11 --missing" after getting the file pihat_min0.2_in_founders.genome? The crux of the matter is how that step leads to the conclusion that individual 13291 NA07045 has the lowest call rate, i.e. how do I get the data in 0.2_low_call_rate_pihat.txt?

robertzeibich · Answer 1 · Tue Nov 07 2023 10:30:41 GMT+0800 (China Standard Time)

You should look at the pihat_min0.2.genome file to see which pairs of samples conflict. Once you know which samples are highly related, you want to remove the one in each pair that has the higher proportion of missing genotypes. How?
Based on my understanding, the call rate of a sample is the proportion of its genotypes that were successfully called, so a sample with a lower proportion of missing genotypes has a better call rate. That is why you need to run plink --bfile HapMap_3_r3_11 --missing again: it reports the proportion of missing genotypes for each individual.
NA07045 has the higher proportion of missing genotypes in its pair, and should therefore be removed.
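
For anyone who wants to script the comparison rather than do it by eye, here is a minimal sketch in Python. It assumes the standard PLINK 1.x column names (FID1, IID1, FID2, IID2 and PI_HAT in the .genome file; FID, IID and F_MISS in the .imiss file written by --missing). The function names `parse_table` and `samples_to_remove` are my own and not part of the tutorial, and the tie-breaking rule (drop the second sample of the pair) is an arbitrary choice.

```python
def parse_table(text):
    """Parse a whitespace-delimited PLINK table (header + rows) into dicts."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def samples_to_remove(genome_text, imiss_text, pihat_threshold=0.2):
    """For each related pair, pick the sample with the higher F_MISS
    (i.e. the lower call rate). Returns a sorted list of (FID, IID)."""
    # Per-sample proportion of missing genotypes, keyed by (FID, IID).
    f_miss = {
        (row["FID"], row["IID"]): float(row["F_MISS"])
        for row in parse_table(imiss_text)
    }
    remove = set()
    for pair in parse_table(genome_text):
        if float(pair["PI_HAT"]) <= pihat_threshold:
            continue  # not considered related at this threshold
        a = (pair["FID1"], pair["IID1"])
        b = (pair["FID2"], pair["IID2"])
        # Higher F_MISS means lower call rate; on a tie, b is dropped
        # (arbitrary assumption, not from the tutorial).
        remove.add(a if f_miss[a] > f_miss[b] else b)
    return sorted(remove)


if __name__ == "__main__":
    # File names as used in this thread; plink.imiss is the default
    # output name of --missing when no --out is given.
    with open("pihat_min0.2_in_founders.genome") as g, open("plink.imiss") as m:
        for fid, iid in samples_to_remove(g.read(), m.read()):
            print(fid, iid)
```

Redirecting the printed FID/IID pairs to a text file gives a list in the two-column form that plink --remove expects, which is presumably what 0.2_low_call_rate_pihat.txt contains.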