brentp / peddy

genotype :: ped correspondence check, ancestry check, sex check. directly, quickly on VCF

samples in vcf not in ped

Giuseppe1995 opened this issue

Hi,
I'm trying to run peddy on a set of 339 samples whose variants have been jointly called.
First, I used plink to convert the VCF (let's call it $VCF) to ped/map format, generating a corresponding $PED file.
I then simply ran the following command:

peddy --plot -p 4 --prefix mystudy $VCF $PED

Although I receive no errors, it gives me the following warning:

"WARNING 339 samples in vcf not in ped"

followed by the list of all the samples in the dataset.
I obviously checked if there was any discrepancy between the samples specified in the $VCF and in the $PED, but all the samples seem to match.

Thank you in advance,
Peppe

can you show a few lines of your VCF?
and the ped should be tab-delimited and contain 6 columns (family_id, sample_id, paternal_id, maternal_id, sex, phenotype) and no genotype information
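
One quick way to check this, as a minimal sketch: the awk one-liner below reports every line that does not split into exactly six tab-separated fields. The file name example.ped and its two records are made up for illustration; point the awk command at your own ped file instead.

```shell
# Hypothetical example file: line 1 is a valid six-column, tab-delimited
# record; line 2 is space-delimited with trailing genotype columns,
# similar to a raw plink .ped line.
printf 'fam1\ts1\t0\t0\t1\t-9\n' > example.ped
printf 'fam1 s2 0 0 2 -9 A A G G\n' >> example.ped

# Report the line number and field count of every line that does not
# have exactly six fields when split on tabs.
bad_lines=$(awk -F'\t' 'NF != 6 { print NR ": " NF " fields" }' example.ped)
echo "$bad_lines"
```

An empty result means every line has six tab-separated fields. A line containing no tabs at all counts as a single field, which is what a space-delimited file would show.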

Hi, thank you for the rapid answer!
The problem was indeed the PED file, as I utilized the "raw" PED file resulting from the plink conversion of the VCF (plink --vcf, for the record).

Just for completeness, here is the code I used to produce the six-column, tab-delimited ped file you described. Let $NEWPED be the variable that holds the path of the new ped file:

cat $PED | tr ' ' '\t' | cut -f 1-6 > $NEWPED

I hope it's not inappropriate to post my code here; I just want to help anyone who might face the same problem.
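
For anyone who wants to double-check the result, here is a rough sketch of how the sample IDs in the VCF header can be compared with column 2 of the ped file using standard tools. The files toy.vcf and toy.ped are hypothetical stand-ins for $VCF and $NEWPED, and the sketch assumes an uncompressed VCF (a bgzipped one would need to be decompressed first, e.g. with zcat).

```shell
# Hypothetical toy inputs standing in for $VCF and $NEWPED.
printf '##fileformat=VCFv4.2\n' > toy.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\ts3\n' >> toy.vcf
printf 'fam1\ts1\t0\t0\t1\t-9\nfam1\ts2\t0\t0\t2\t-9\n' > toy.ped

# In a VCF, sample IDs are columns 10 onward of the #CHROM header line.
grep -m1 '^#CHROM' toy.vcf | cut -f 10- | tr '\t' '\n' | sort > vcf_ids.txt

# In a six-column ped file, the sample ID is column 2.
cut -f 2 toy.ped | sort > ped_ids.txt

# comm -23 keeps lines found only in the first file,
# i.e. samples present in the VCF but absent from the ped.
missing=$(comm -23 vcf_ids.txt ped_ids.txt)
echo "$missing"
```

In this toy case only s3 is reported, because it appears in the VCF header but has no ped record. If every sample is listed, as with the raw plink ped file, the ped file is probably not being split on tabs into the expected columns.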

Thank you again!

great! thanks for posting the code. that will help others who may find this in the future.