diazale / 1KGP_dimred

Interactive demonstration of how to use PCA, t-SNE, and UMAP on genotype data from the 1000 Genomes Project.

Adding a personal genome omni-chip individual

avilella opened this issue

Hi,

I read the tweet about 1KGP_dimred (https://twitter.com/adp_diaz/status/1044235174718951425) and would like to know how difficult it would be to do the following:

(1) Use the Omni file, since the personal genome I want to add is from MyHeritageDNA, which uses Illumina's 700K Omni chip. I presume it's the file below:
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/hd_genotype_chip/ALL.chip.omni_broad_sanger_combined.20140818.snps.genotypes.vcf.gz
Would that file load if substituted for the one you point to in the ipynb?
(2) Add more individuals to the .vcf.gz file:
Given the format needed for the ipynb, what would be the best way to add extra individuals to the file, re-index it, append the new individuals to the .panel and tabular info files, and run the ipynb again?

(1) Runs fine on my machine, so just plug-and-play (well, it imports anyway - I'm assuming the data got in there alright)
(2) I'm not sure. When I combined the HRS and 1KGP datasets I ran an intersection in PLINK. I think they'd have to be the same chip though.
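
For (2), the intersection step can be sketched in plain Python to show what it does. This is not the PLINK workflow mentioned above, only an illustration, and it assumes both inputs are uncompressed VCF text with a FORMAT column and that the personal genome has already been converted to VCF. The function names `read_vcf` and `intersect_and_merge` and the toy sample IDs are hypothetical. It does not handle strand flips, swapped REF/ALT alleles, or multi-allelic sites, which a real merge would need to check.

```python
import io


def read_vcf(text):
    """Parse uncompressed VCF text into (sample_names, {site_key: genotypes})."""
    samples, sites = [], {}
    for line in io.StringIO(text):
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns follow the FORMAT column
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        # Key on position and alleles rather than the ID column,
        # since variant IDs may differ between files (assumption).
        sites[(chrom, int(pos), ref, alt)] = fields[9:]
    return samples, sites


def intersect_and_merge(vcf_a, vcf_b):
    """Keep only sites present in both files and concatenate their genotypes."""
    samples_a, sites_a = read_vcf(vcf_a)
    samples_b, sites_b = read_vcf(vcf_b)
    # Chromosome names sort as strings here, so "10" comes before "2".
    shared = sorted(sites_a.keys() & sites_b.keys())
    merged = {key: sites_a[key] + sites_b[key] for key in shared}
    return samples_a + samples_b, merged


HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t"

# Toy data only: made-up sites and sample IDs, not real 1KGP records.
REFERENCE_VCF = "\n".join([
    "##fileformat=VCFv4.1",
    HEADER + "REF_A\tREF_B",
    "1\t1000\trs1\tA\tG\t.\t.\t.\tGT\t0/0\t0/1",
    "1\t2000\trs2\tC\tT\t.\t.\t.\tGT\t1/1\t0/1",
    "2\t500\trs3\tG\tA\t.\t.\t.\tGT\t0/1\t0/0",
])

PERSONAL_VCF = "\n".join([
    "##fileformat=VCFv4.1",
    HEADER + "MY_SAMPLE",
    "1\t1000\t.\tA\tG\t.\t.\t.\tGT\t0/1",
    "2\t500\t.\tG\tA\t.\t.\t.\tGT\t1/1",
    "3\t700\t.\tT\tC\t.\t.\t.\tGT\t0/0",
])

samples, merged = intersect_and_merge(REFERENCE_VCF, PERSONAL_VCF)
print(samples)
for site, genotypes in merged.items():
    print(site, genotypes)
```

Only the sites shared by both files survive, which is why the chips would need to overlap substantially for the merged data to be useful.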