Error of Step 3

Question

zhutao1009 opened this issue 5 years ago · comments

This is the command I used:
perl /vol6/home/quluj/zt/software/ERVcaller_v1.4/ERVcaller_v1.4.pl \
-i MD4 \
-f .fastq.gz \
-H /vol6/home/quluj/zt/pkduck_ref/PK_ref.fa \
-T /vol6/home/quluj/zt/DUCK/teseq/LTR.fasta \
-t 12 -S 20 -G
But every run fails at Step 3, and I only get an empty VCF file.
Step 3: Validation...

Converting SAM to BAM file, and then Sort and index the BAM file......

[bam_sort_core] merging from 11 files and 11 in-memory blocks...
[bwa_index] Pack FASTA... [bns_fasta2bntseq] Failed to allocate 0 bytes at bntseq.c line 303: Success
[E::bwa_idx_load_from_disk] fail to locate the index files
[E::bwa_idx_load_from_disk] fail to locate the index files

Xun Chen · Answer 1 · Thu Aug 29 2019 10:07:17 GMT+0800 (China Standard Time)

Please check your indexed human and TE reference files. You can also check whether the aligned BAM file is correctly sorted and indexed. If it is not, check your installed SAMtools version, which should be higher than v1.5.
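
The version and BAM checks suggested above could be scripted roughly as follows. This is only a sketch: `version_gt`, `bam_is_sorted_and_indexed`, and the `BAM_FILE` variable are names made up for illustration, the version comparison assumes `sort -V` from GNU coreutils, and the index is assumed to sit next to the BAM as `<name>.bam.bai`.

```shell
# Succeed only if version $1 is strictly higher than version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Succeed only if the BAM passes `samtools quickcheck`, declares a
# coordinate sort order in its header, and has a .bai index next to it.
bam_is_sorted_and_indexed() {
    samtools quickcheck "$1" &&
        samtools view -H "$1" | grep -q 'SO:coordinate' &&
        [ -e "$1.bai" ]
}

if command -v samtools >/dev/null 2>&1; then
    # The first line of `samtools --version` looks like "samtools 1.9".
    installed=$(samtools --version | head -n 1 | awk '{print $2}')
    if version_gt "$installed" 1.5; then
        echo "SAMtools $installed is higher than v1.5"
    else
        echo "SAMtools ${installed:-unknown} is not higher than v1.5"
    fi
    # BAM_FILE is a hypothetical variable; set it to your own BAM path.
    if [ -n "${BAM_FILE:-}" ]; then
        bam_is_sorted_and_indexed "$BAM_FILE" || echo "Problem with $BAM_FILE"
    fi
else
    echo "samtools not found on PATH"
fi
```

For example, running it as `BAM_FILE=/path/to/sample.bam bash check.sh` (the script name is arbitrary) also checks that one BAM file.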

Tao Zhu · Answer 2 · Tue Sep 03 2019 15:36:51 GMT+0800 (China Standard Time)

I ran ERVcaller with your test data on different servers. It reported the same error and generated an empty VCF file. Maybe you should check the file '${input_sampleID}_ERV.output', which was not generated as expected.
This is my code:
#!/bin/bash
conda activate R3  # activate the R 3.3.2 environment
perl /home/software/ERVcaller/ERVcaller_v1.4.pl \
-i TE_seq \
-I /home/software/ERVcaller/test/BWA/ \
-f .bam \
-H human.fa \
-T /home/software/ERVcaller/Database/HERVK.fa \
-t 2 -S 20 -G -BWA_MEM \
-l 500 \
-L 100

Xun Chen · Answer 3 · Tue Sep 03 2019 18:27:44 GMT+0800 (China Standard Time)

It works correctly on my servers and on those of many other users. In my experience, this error is usually caused by incorrect inputs. If it is easier for you, can you show screenshots of: 1) a list of the intermediate files produced (using the command line: ls -lh); 2) your input paired-end reads (using the command line: cat); and 3) a list of the index files generated for both human.fa and HERVK.fa (using the command line: ls -lh).

For now, I would suggest using full paths and as few parameters as you can, following the ERVcaller manual. For example:
perl /home/software/ERVcaller/ERVcaller_v1.4.pl \
-i TE_seq \
-I /home/software/ERVcaller/test/BWA/ \
-f .bam \
-H /full/path/to/human.fa \
-T /home/software/ERVcaller/Database/HERVK.fa \
-BWA_MEM
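
Alongside `ls -lh`, the zero-byte intermediate files can be listed directly, which makes it easier to see where the pipeline stopped writing data. A minimal sketch, assuming a `find` that supports `-maxdepth` (GNU and BSD both do); `list_empty_files` and the `ERV_TEMP_DIR` variable are illustrative names, not part of ERVcaller:

```shell
# Print every zero-byte regular file directly inside the given directory,
# one path per line, in sorted order.
list_empty_files() {
    find "$1" -maxdepth 1 -type f -size 0 | sort
}

# ERV_TEMP_DIR is a hypothetical variable; point it at the temporary
# directory of your own run before executing the script.
if [ -d "${ERV_TEMP_DIR:-}" ]; then
    list_empty_files "$ERV_TEMP_DIR"
fi
```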

Tao Zhu · Answer 4 · Wed Sep 04 2019 16:41:06 GMT+0800 (China Standard Time)

(base) root@zhu-PC:/media/zhu/A64E22B94E228263/clean/humman# ll
total 6.9G
-rwxrwxrwx 1 zhu zhu 445 Sep 4 10:52 ervcaller.sh
-rwxrwxrwx 1 zhu zhu 3.1G Sep 2 17:10 human.fa
-rwxrwxrwx 1 zhu zhu 22K Sep 4 00:20 human.fa.amb
-rwxrwxrwx 1 zhu zhu 83K Sep 4 00:20 human.fa.ann
-rwxrwxrwx 1 zhu zhu 3.1G Sep 4 00:19 human.fa.bwt
-rwxrwxrwx 1 zhu zhu 781M Sep 4 00:20 human.fa.pac
-rwxrwxrwx 1 zhu zhu 11K Sep 4 00:20 nohup.out
drwxrwxrwx 1 zhu zhu 48 Sep 4 12:29 TE_seq_subgenome
drwxrwxrwx 1 zhu zhu 408 Sep 4 12:29 TE_seq_temp
-rwxrwxrwx 1 zhu zhu 0 Sep 4 12:29 TE_seq.vcf
###############################################
(base) root@zhu-PC:/media/zhu/A64E22B94E228263/clean/humman/TE_seq_subgenome# ll
total 0
############################################################
(base) root@zhu-PC:/media/zhu/A64E22B94E228263/clean/humman/TE_seq_temp# ll
total 4.5K
-rwxrwxrwx 1 zhu zhu 0 Sep 4 10:54 TE_seq_ERV_1.1fuq
-rwxrwxrwx 1 zhu zhu 0 Sep 4 10:54 TE_seq_ERV1.bian
-rwxrwxrwx 1 zhu zhu 0 Sep 4 10:54 TE_seq_ERV_1sf.fuq
-rwxrwxrwx 1 zhu zhu 0 Sep 4 10:54 TE_seq_ERV_2.1fuq
-rwxrwxrwx 1 zhu zhu 0 Sep 4 10:54 TE_seq_ERV.fine_mapped
-rwxrwxrwx 1 zhu zhu 0 Sep 4 10:54 TE_seq_ERV.hf
-rwxrwxrwx 1 zhu zhu 0 Sep 4 10:54 TE_seq_ERV.output
-rwxrwxrwx 1 zhu zhu 925 Sep 4 12:28 TE_seq_ERV.output2
-rwxrwxrwx 1 zhu zhu 0 Sep 4 12:29 TE_seq_ERV.output2.1
-rwxrwxrwx 1 zhu zhu 0 Sep 4 10:54 TE_seq_ERV.TE_f
-rwxrwxrwx 1 zhu zhu 0 Sep 4 10:54 TE_seq_ERV.TE_f2
-rwxrwxrwx 1 zhu zhu 0 Sep 4 10:54 TE_seq_ERV.visualization
-rwxrwxrwx 1 zhu zhu 32 Sep 4 10:54 TE_seq.type.gz
############################################
(base) root@zhu-PC:/home/software/ERVcaller/Database# ll
total 976K
-rw-r--r-- 1 root root 814K Sep 1 16:37 ERV_library.fa
-rw-r--r-- 1 root root 8.2K Sep 1 16:37 HERVK.fa
-rw-r--r-- 1 root root 9 Sep 2 17:20 HERVK.fa.amb
-rw-r--r-- 1 root root 42 Sep 2 17:20 HERVK.fa.ann
-rw-r--r-- 1 root root 8.1K Sep 2 17:20 HERVK.fa.bwt
-rw-r--r-- 1 root root 2.0K Sep 2 17:20 HERVK.fa.pac
-rw-r--r-- 1 root root 4.1K Sep 2 17:20 HERVK.fa.sa
-rw-r--r-- 1 root root 115K Sep 1 16:37 Human_TE_library.fa
#####################################################
(base) root@zhu-PC:/home/software/ERVcaller/test/BWA# ll
total 15M
-rw-r--r-- 1 root root 13M Sep 1 16:37 TE_seq.bam
-rw-r--r-- 1 root root 1.2M Sep 1 16:37 TE_seq.bam.bai

Xun Chen · Answer 5 · Thu Sep 05 2019 00:01:09 GMT+0800 (China Standard Time)

It looks like it stopped very early due to the alignment or format conversion steps, which mainly use BWA and SAMtools.

Can you specify the full path for your indexed human reference genome and try again? I guess it should be /media/zhu/A64E22B94E228263/clean/humman/ on your PC.

Let me know if it still does not work, and attach the log file and a similar screenshot.
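
One detail in the listing from Answer 4 may be worth checking: human.fa has .amb, .ann, .bwt and .pac files but no human.fa.sa, whereas HERVK.fa has all five files that `bwa index` normally writes. Whether this explains the "fail to locate the index files" message is only a guess, but the completeness of an index can be verified with a sketch like the one below. `check_bwa_index` and the `REF_FASTA` variable are hypothetical names used for illustration.

```shell
# Report which of the five files written by `bwa index`
# (.amb, .ann, .bwt, .pac, .sa) are missing or empty for a FASTA file.
# Returns non-zero if at least one of them is absent or empty.
check_bwa_index() {
    cbi_fasta=$1
    cbi_missing=0
    for cbi_ext in amb ann bwt pac sa; do
        if [ ! -s "${cbi_fasta}.${cbi_ext}" ]; then
            echo "missing or empty: ${cbi_fasta}.${cbi_ext}"
            cbi_missing=1
        fi
    done
    return "$cbi_missing"
}

# REF_FASTA is a hypothetical variable; set it to the full path of
# your own reference FASTA before executing the script.
if [ -n "${REF_FASTA:-}" ]; then
    check_bwa_index "$REF_FASTA" || echo "Consider re-running: bwa index $REF_FASTA"
fi
```

If a file is reported as missing, re-running `bwa index` on the reference and waiting for it to finish should regenerate the full set.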

Tao Zhu · Answer 6 · Sat Sep 07 2019 22:30:28 GMT+0800 (China Standard Time)

It worked with your test data and part of mine. Here is the output; the TSD sequence is too long.
MDM1.txt

Xun Chen · Answer 7 · Sat Sep 14 2019 14:18:26 GMT+0800 (China Standard Time)

It is good to hear that ERVcaller works. ERVcaller outputs all results, and users can filter them based on the reported genotype quality and likelihood. This is very useful when working on a population. As for the TSD sequence, long TSDs do exist according to the literature. We currently use 500 bp to keep high sensitivity and accuracy, and we may improve this in our next version.

Tao Zhu · Answer 8 · Sat Sep 14 2019 22:00:35 GMT+0800 (China Standard Time)

Thank you for your work. You helped me a lot.