Running Peddy with Single-cell vcf and ped file
angelussong opened this issue · comments
Hi, sorry to bother you with this. So I'm working on an exploratory project trying to do some ancestry analysis based on single-cell RNA-seq data.
The data I have is an aligned BAM file (H10_2.bam) from an RNA-seq workflow. I used monovar to do variant calling, which produced a .vcf output (H10_2.vcf).
Using this command, I was able to generate the .ped and .map files:
vcftools --vcf H10_2.vcf --out H10_2 --plink
Then I was trying to run peddy using this command:
python -m peddy -p 2 --plot --prefix H10_2 H10_2.vcf.gz H10_2.ped
And I got this error:
File "pandas/_libs/parsers.pyx", line 899, in pandas._libs.parsers.TextReader.read
File "pandas/_libs/parsers.pyx", line 914, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas/_libs/parsers.pyx", line 991, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 1067, in pandas._libs.parsers.TextReader._convert_column_data
File "pandas/_libs/parsers.pyx", line 1387, in pandas._libs.parsers.TextReader._get_column_name
IndexError: list index out of range
I read another issue about this error, where it was caused by an older version of matplotlib. I checked my environment and my matplotlib version is 2.2.3, so I'm wondering what is causing the error here.
Thank you very much!
Angelus Song
hi, is this just 1 sample in your VCF? you can do
cut -f 1-6 H10_2.ped > H10_2.fixed.ped
and see if that works. the ped file should be tab-delimited and contain only the sample information (the first 6 columns), no genotypes.
That worked! Thanks so much!
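For anyone who would rather do the same trimming in Python, here is a minimal sketch of the `cut -f 1-6` step suggested above. The function name `trim_ped` and its parameters are made up for illustration and are not part of peddy or vcftools. Unlike `cut -f`, which splits only on tabs, this version splits on any whitespace, on the assumption that none of the six sample fields contains a space.

```python
def trim_ped(src_path, dst_path, n_cols=6):
    """Copy a PLINK-style .ped file, keeping only the first n_cols fields.

    Hypothetical helper, not part of peddy. The first six columns of a
    PLINK .ped file are family ID, sample ID, paternal ID, maternal ID,
    sex and phenotype; everything after them is genotype data.
    Returns the number of sample lines written.
    """
    written = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line_no, line in enumerate(src, start=1):
            if not line.strip():
                continue  # skip blank lines
            # Assumption: fields are separated by tabs or spaces and
            # contain no embedded whitespace.
            fields = line.split()
            if len(fields) < n_cols:
                raise ValueError(
                    f"line {line_no}: expected at least {n_cols} fields, "
                    f"found {len(fields)}"
                )
            # Write the sample columns back out tab-delimited.
            dst.write("\t".join(fields[:n_cols]) + "\n")
            written += 1
    return written
```

Calling `trim_ped("H10_2.ped", "H10_2.fixed.ped")` should produce the same kind of file as the `cut` command, and the fixed file can then be passed to peddy in place of the original .ped.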