Running Peddy with Single-cell vcf and ped file

Question

angelussong opened this issue 5 years ago

Hi, sorry to bother you with this. I'm working on an exploratory project that attempts ancestry analysis from single-cell RNA-seq data.

The input data I have is the aligned BAM file (H10_2.bam) from an RNA-seq workflow. I then used monovar for variant calling, which produced a VCF file (H10_2.vcf).

Using this command, I was able to generate the .ped and .map files:
vcftools --vcf H10_2.vcf --out H10_2 --plink

Then I tried to run peddy using this command:
python -m peddy -p 2 --plot --prefix H10_2 H10_2.vcf.gz H10_2.ped

And I got this error:

File "pandas/_libs/parsers.pyx", line 899, in pandas._libs.parsers.TextReader.read
File "pandas/_libs/parsers.pyx", line 914, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas/_libs/parsers.pyx", line 991, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 1067, in pandas._libs.parsers.TextReader._convert_column_data
File "pandas/_libs/parsers.pyx", line 1387, in pandas._libs.parsers.TextReader._get_column_name
IndexError: list index out of range

I read another issue about this error, where it was caused by an older version of matplotlib. I checked my installation and my matplotlib version is 2.2.3. What could be causing this error?

Thank you very much!

Angelus Song

Brent Pedersen · Answer 1 · Tue Jul 02 2019 07:29:58 GMT+0800 (China Standard Time)

Hi, is there just one sample in your VCF? You can run cut -f 1-6 H10_2.ped > H10_2.fixed.ped and see if that works. The ped file should be tab-delimited and contain only sample information, no genotypes.
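
For anyone who would rather do the same trimming in Python, here is a minimal sketch of what the cut command above does: it keeps the first six fields of each line (family ID, sample ID, father, mother, sex, phenotype) and drops the genotype columns that follow them in a PLINK-style .ped file. The function name trim_ped is hypothetical and not part of peddy. Unlike cut -f, which splits on tabs only, this sketch splits on any whitespace and always writes tab-delimited output, which is an assumption about what the input looks like.

```python
def trim_ped(src_path, dst_path, n_cols=6):
    """Copy the first n_cols fields of each line to dst_path, tab-delimited.

    Hypothetical helper, not part of peddy. Assumes fields are separated
    by whitespace (tabs or spaces) and contain no embedded spaces.
    Returns the number of sample lines written.
    """
    written = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line_no, line in enumerate(src, start=1):
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) < n_cols:
                raise ValueError(
                    "line %d has %d fields, expected at least %d"
                    % (line_no, len(fields), n_cols)
                )
            # Keep only the sample information, drop the genotype columns.
            dst.write("\t".join(fields[:n_cols]) + "\n")
            written += 1
    return written
```

Calling trim_ped("H10_2.ped", "H10_2.fixed.ped") would then produce the file to pass to peddy in place of the original .ped.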

Hanbing Song · Answer 2 · Wed Jul 03 2019 00:32:48 GMT+0800 (China Standard Time)

That worked! Thanks so much!