kensung-lab / SurVirus

Error during execution

RuslanAlali opened this issue

The tool looks very nice and helpful, but I need some help with it.
I tried to run the code with both BAM and FASTQ input. However, I keep getting an error on line 13 of isolate_relevant_pairs(_fq).

**python surveyor.py X_R1_001.fastq.gz,X_R2_001.fastq.gz  results2 hg19.fa HHV6_seq.fa HHV6_hg19.fa --fq  --dust sdust/sdust**
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 200000 sequences (30101423 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (6, 85345, 4, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (248, 318, 404)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 716)
[M::mem_pestat] mean and std.dev: (329.80, 118.11)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 872)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 200000 reads in 111.685 CPU sec, 11.204 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 10 HHV6_hg19.fa results2/head_1.fq results2/head_2.fq
[main] Real time: 62.164 sec; CPU: 116.315 sec
Executing:  isolate_relevant_pairs_fq X_R1_001.fastq.gz X_R2_001.fastq.gz hg19.fa HHV6_seq.fa results2 results2/bam_0/ 
isolate_relevant_pairs_fq: line 13: syntax error near unexpected token `gzFile,'
isolate_relevant_pairs_fq: line 13: `KSEQ_INIT(gzFile, gzread)'
Executing: bwa mem -t 10 HHV6_hg19.fa results2/bam_0//retained-pairs_1.fq results2/bam_0//retained-pairs_2.fq | samtools view -b -F 2304 > results2/bam_0//retained-pairs.remapped.bam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[E::main_mem] fail to open file `results2/bam_0//retained-pairs_1.fq'.
[main_samview] fail to read the header from "-".
Executing: samtools sort -@ 10 results2/bam_0//retained-pairs.remapped.bam -o results2/bam_0//retained-pairs.remapped.cs.bam
samtools sort: failed to read header from "results2/bam_0//retained-pairs.remapped.bam"
Executing: extract_clips HHV6_seq.fa results2 results2/bam_0/
extract_clips: line 15: syntax error near unexpected token `int,'
extract_clips: line 15: `KSEQ_INIT(int, read)'

Thanks in advance

Hi,

That looks really weird. isolate_relevant_pairs_fq is a binary executable, so I am not sure why it would give a syntax error. It is clearly not being run as a binary.
What happens if you run

./isolate_relevant_pairs_fq X_R1_001.fastq.gz X_R2_001.fastq.gz hg19.fa HHV6_seq.fa results2 results2/bam_0/

?

Can you also ls the content of the SurVirus folder?

It produces the same error:

(mypython3) [root@ip-10-11-5-10 SurVirus]# ./isolate_relevant_pairs_fq (path)/X_R1.fasq.gz (path)/X_R2.fasq.gz hg19.fa ../../HPVDetector_v1.0/HHV/HHV6_seq results2 results2/bam_0/
./isolate_relevant_pairs_fq: line 13: syntax error near unexpected token `gzFile,'
./isolate_relevant_pairs_fq: line 13: `KSEQ_INIT(gzFile, gzread)'

As for the contents of the folder:

CMakeCache.txt  HHV6_seq.pac                 bp_region_consensus_builder.cpp      filter                          hg19.fa.ann        host_virus.fa.bwt           isolate_relevant_pairs_fq      random_pos_generator.py   sam_utils.h
CMakeFiles      HHV6_seq.sa                  build_libs.sh                        filter.cpp                      hg19.fa.bwt        host_virus.fa.fai           isolate_relevant_pairs_fq.cpp  random_pos_generator.pyc  sdust
CMakeLists.txt  HHV6_seq_no_repeat.fa        build_region-reads_associations      filter_by_qname                 hg19.fa.fai        host_virus.fa.pac           ks-test.h                      reads_categorizer         sparsehash-sparsehash-2.0.3
HHV6_seq        HHV6_seq_no_repeat.fa.fai    build_region-reads_associations.cpp  filter_by_qname.cpp             hg19.fa.pac        host_virus.fa.sa            libs                           reads_categorizer.cpp     sparsehash-sparsehash-2.0.3.zip
HHV6_seq.amb    LICENSE                      cmake_install.cmake                  get-realignments-for-qname      hg19.fa.sa         htslib-1.11                 max_is_calc.py                 remapper                  surveyor.py
HHV6_seq.ann    Makefile                     config.h                             get-realignments-for-qname.cpp  host_virus.fa      htslib-1.11.zip             max_is_calc.pyc                remapper.cpp              sw_utils.h
HHV6_seq.bwt    README.md                    extract_clips                        hg19.fa                         host_virus.fa.amb  isolate_relevant_pairs      merge_retained_reads           results                   utils.h
HHV6_seq.fai    bp_region_consensus_builder  extract_clips.cpp                    hg19.fa.amb                     host_virus.fa.ann  isolate_relevant_pairs.cpp  merge_retained_reads.cpp       results2

I think it is an installation problem. I got the error below when I ran cmake on the project. I tried installing on both a CentOS server and Ubuntu, and in both cases the configuration step still fails with "NOTFOUND".

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
HTS_LIB
    linked by target "get-realignments-for-qname" in directory /mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus
    linked by target "filter" in directory /mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus
    linked by target "remapper" in directory /mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus
    linked by target "isolate_relevant_pairs_fq" in directory /mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus
    linked by target "filter_by_qname" in directory /mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus
    linked by target "bp_region_consensus_builder" in directory /mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus
    linked by target "reads_categorizer" in directory /mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus
    linked by target "build_region-reads_associations" in directory /mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus
    linked by target "extract_clips" in directory /mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus
    linked by target "isolate_relevant_pairs" in directory /mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus
    linked by target "merge_retained_reads" in directory /mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus

-- Configuring incomplete, errors occurred!
See also "/mnt/fileshare/Temp/Virus_detection/SurVirus/test2/SurVirus/CMakeFiles/CMakeOutput.log".
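In CMake, a variable reported as NOTFOUND usually means a find_library-style lookup did not locate the library, so a first check is whether an htslib library file exists anywhere in the checkout. The sketch below is a generic file search, not part of SurVirus. The function name find_hts_libraries is hypothetical, and which directories CMakeLists.txt actually searches is an assumption that should be confirmed in that file.

```python
import os
import tempfile


def find_hts_libraries(root):
    """Return sorted paths under root whose name looks like an htslib library.

    Matches libhts.a, libhts.so, libhts.so.3, libhts.dylib and similar.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("libhts."):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Demo on a throwaway directory tree (a stand-in, not a real SurVirus checkout).
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "htslib-1.11"))
    with open(os.path.join(tmp, "htslib-1.11", "libhts.a"), "wb") as fh:
        fh.write(b"")
    with open(os.path.join(tmp, "README.md"), "wb") as fh:
        fh.write(b"")
    found = find_hts_libraries(tmp)
    print([os.path.relpath(p, tmp) for p in found])
```

Running find_hts_libraries(".") from the SurVirus folder and getting an empty list would suggest that htslib itself was never built, which would be consistent with HTS_LIB being unresolved.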

I am not sure what is going on; I have never seen anything like that. It seems the build failed because it cannot find htslib (did htslib compile correctly?), yet you seem to have executables (notice that for every cpp file you have what appears to be the related executable, without extension).

It seems like the executables are being executed as bash scripts.
If you run

head isolate_relevant_pairs_fq

what do you see?
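A shell that is handed a file the kernel refuses to execute typically falls back to interpreting it as a shell script, which would explain a bash-style syntax error quoting a line that looks like C++ source. As a complement to head, the sketch below checks whether a file begins with the ELF magic bytes that every Linux binary starts with. The helper name describe_executable is hypothetical, and the demo files are stand-ins rather than real SurVirus outputs.

```python
import os
import tempfile

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF binary


def describe_executable(path):
    """Classify a file by its first bytes: 'elf', 'script' (shebang), or 'other'."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    if head == ELF_MAGIC:
        return "elf"
    if head.startswith(b"#!"):
        return "script"
    return "other"


# Demo on two throwaway files (stand-ins, not real SurVirus outputs).
with tempfile.TemporaryDirectory() as tmp:
    fake_binary = os.path.join(tmp, "fake_binary")
    fake_source = os.path.join(tmp, "fake_source")
    with open(fake_binary, "wb") as fh:
        fh.write(ELF_MAGIC + b"\x02\x01\x01")
    with open(fake_source, "wb") as fh:
        fh.write(b"#include <iostream>\n")
    print(describe_executable(fake_binary))  # elf
    print(describe_executable(fake_source))  # other
```

If describe_executable("isolate_relevant_pairs_fq") returns "other" on the affected machine, the file is probably not a compiled binary at all, and the build would need to be redone once the htslib problem is resolved.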