non-deterministic error, VCF file line contains an unexpected number of fields
williambrandler opened this issue · comments
I am running bedtools on a VCF stored in Amazon Web Services S3,
accessing it as if it were on the local filesystem via the Databricks file system mount ("/dbfs/mnt/").
I have never had issues with bedtools doing this before, but I am now hitting the following error:
Error: line number 98 of file /dbfs/mnt/test.vcf has 14 fields, but 11 were expected.
If I run it again, I get the same error, but on a different line.
Have you seen this issue before? Any idea what could cause it?
There is no problem with these lines in the VCF itself.
code:
%sh
input_vcf_local_path=/dbfs/mnt/test.vcf
bedtools intersect -seed 24 -a "$input_vcf_local_path" -b "$bed_local_path" -header -wa > "$bedtools_filter_vcf_local_path"
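The issue was that $input_vcf_local_path and $bedtools_filter_vcf_local_path were set to the same path,
so bedtools was writing its output to the same file it was reading as input. That would explain why the malformed line changed from run to run.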
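
For anyone who hits the same thing, a small guard before the bedtools call can catch this mistake early. The following is only a sketch: `paths_differ` is a hypothetical helper name, not part of bedtools, and the `-ef` test (supported by bash and dash, but not strictly POSIX) may not be reliable on every FUSE mount, which is why the plain string comparison comes first.

```shell
# Hypothetical helper (not part of bedtools): returns non-zero when the
# output path would overwrite the input path.
paths_differ() {
  in_path=$1
  out_path=$2
  # First a plain string comparison, then -ef, which is true when both
  # paths resolve to the same existing file (for example through a symlink).
  if [ "$in_path" = "$out_path" ] || [ "$in_path" -ef "$out_path" ]; then
    echo "error: output path is the same file as the input: $in_path" >&2
    return 1
  fi
  return 0
}

# Example use before the command from this issue (variable names as above):
#   paths_differ "$input_vcf_local_path" "$bedtools_filter_vcf_local_path" || exit 1
```

With the guard in place, the notebook cell stops with a clear message instead of letting the shell redirect onto the input file.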