brentp / peddy

genotype :: ped correspondence check, ancestry check, sex check. directly, quickly on VCF

process was killed, Memory Error.

wym0072003 opened this issue

Thank you for providing this great tool.
I have a problem: my process was killed after running for half an hour. The last line of the traceback displayed 'Memory Error'. My desktop has 128 GB of RAM.
I have ~4,000 WES samples with a 62 GB VCF file and a 67 GB ped file.
Could you please give me some suggestions for running peddy successfully?
Thank you very much!

Yiming

Hi, your ped file should be only a few KB. It should have just 6 columns: family_id, sample_id, paternal_id, maternal_id, sex, phenotype.
You can probably make that by just running cut -f 1-6 on your existing ped file.

let me know if that resolves it for you. 128GB ram should be plenty for 4K samples.
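
The trimming suggested above can be sketched roughly as follows. This is only an illustration: the file names (wide.ped, trimmed.ped) and the three sample rows are hypothetical, and it assumes the ped file is tab-delimited, which is the default delimiter for cut. If the file is space-delimited, cut -d ' ' -f 1-6 would be needed instead.

```shell
#!/bin/sh
set -eu

# Hypothetical demo data: a tiny tab-delimited ped file whose rows carry
# extra genotype columns after the first six pedigree columns.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf 'fam1\ts1\t0\t0\t1\t2\tA\tG\tC\tC\n' >  "$workdir/wide.ped"
printf 'fam2\ts2\t0\t0\t2\t1\tA\tA\tC\tT\n' >> "$workdir/wide.ped"
printf 'fam3\ts3\t0\t0\t1\t1\tG\tG\tC\tC\n' >> "$workdir/wide.ped"

# Keep only the first six columns (cut splits on tabs by default).
cut -f 1-6 "$workdir/wide.ped" > "$workdir/trimmed.ped"

# Sanity check: every row of the trimmed file should have exactly 6 fields.
awk -F'\t' 'NF != 6 { bad = 1 } END { exit bad }' "$workdir/trimmed.ped"

cat "$workdir/trimmed.ped"
```

On a real data set, the trimmed file should come out at a few KB for a few thousand samples, which is a quick way to confirm that the genotype columns are gone.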

It works! Thank you brentp! It looks like the required ped format is the same as the fam file.
I have another question: should I run QC on the sites in the VCF file first, or just let peddy run on the whole VCF file, since it will interrogate the 23,556 sites automatically?
If QC is needed, what are the criteria for selecting sites from the sequencing data?

Regards

Yiming

yes. that format is also called a fam file.
Just let peddy run on your full VCF, it will extract the sites that it needs automatically.

Thank you very much!