brentp / peddy

genotype :: ped correspondence check, ancestry check, sex check. directly, quickly on VCF

Parallel processing blocked

MattWellie opened this issue

I've run into an issue trying to parallelize code using Peddy. I'm parsing a PED file and VCF, then splitting all variants into groups to process. This splitting makes it a strong candidate for parallelization, but I can't pickle the Ped() object, so multiprocessing is blocked.

_pickle.PicklingError: Can't pickle <class 'peddy.peddy.UNKNOWN'>: it's not the same object as peddy.peddy.UNKNOWN

This is probably related to how unknown members are handled in the pedigree:
https://github.com/brentp/peddy/blob/master/peddy/peddy.py#L102-L104

This is completely non-urgent, and I'll see if I can work out a fix that can be ported upstream.

It might be because of this: https://github.com/brentp/peddy/blob/master/peddy/peddy.py#L133
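For context, a common cause of the "it's not the same object as" message is a module-level name that is first bound to a class and then rebound to an instance of that class. The sketch below reproduces the error with a hypothetical `Unknown` class. Whether peddy's `UNKNOWN` is defined this way is an assumption based on the links above, not something confirmed here.

```python
import pickle


class Unknown:
    """Hypothetical stand-in for a sentinel type such as peddy's UNKNOWN."""

    def __repr__(self):
        return "UNKNOWN"


# Rebinding the module-level name to an instance hides the class:
# the name `Unknown` now resolves to an object, not to the class itself.
Unknown = Unknown()


def try_pickle(obj):
    """Return None if obj pickles cleanly, else the PicklingError message."""
    try:
        pickle.dumps(obj)
    except pickle.PicklingError as exc:
        return str(exc)
    return None


# pickle stores an instance by referring to its class by module and name.
# Looking that name up now yields the instance instead of the class,
# so pickling is refused.
error = try_pickle(Unknown)
print(error)
```

If this is the cause, binding the instance to a different name than the class (for example a private class name plus a public singleton) is one way to keep both reachable for pickle.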

Instead of working on peddy, you might try somalier; it's faster and scales to more samples.