kensung-lab / SurVirus

Virus-side.fq and getting killed

Saurav-g-hub opened this issue · comments

Dear Authors,

We are trying to run SurVirus on a vertebrate host and are getting the error below:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[main] CMD: bwa mem -t 80 /home/saurav.mandal/survirus/virus_index/retroviral_11632.fasta /data_disk2/saurav/results//bam_0//host-clips.unmapped.fa
[main] Real time: 4.422 sec; CPU: 4.007 sec
Killed
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[E::main_mem] fail to open file `/data_disk2/saurav/results//bam_0//virus-side.fq'.
/data_disk/saurav/reads/1.fq
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Can you tell us which line in surveyor.py generates the virus-side.fq file? What might be the reason that virus-side.fq is not being generated? Also, please note that the host-clips.unmapped.fa file does get generated, so why does the output show "Killed"?

Regards,
Saurav

Hi,

Normally SurVirus would print the commands it executes to standard output. That way you would be able to pinpoint what command was killed.
As for why your system is killing the process, there are various possible reasons. Perhaps the most common cause is that the process exceeded the available resources.
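
As an illustration of how to pinpoint a killed command, here is a minimal Python sketch. It assumes a POSIX system, and `run_and_report` is a hypothetical helper written for this example, not SurVirus's own `execute` function. It prints each command before running it and then reports whether the process ended normally or was terminated by a signal such as SIGKILL, which is what the kernel's out-of-memory killer sends.

```python
import signal
import subprocess


def run_and_report(cmd):
    """Run a shell command and return a short description of how it ended.

    Hypothetical helper for illustration; not part of SurVirus.
    """
    print("Executing:", cmd, flush=True)
    rc = subprocess.run(cmd, shell=True).returncode
    if rc == 0:
        return "ok"
    # subprocess reports death by signal as a negative return code; a shell
    # that relays a child's death uses 128 + signal number instead.
    signum = -rc if rc < 0 else rc - 128
    try:
        return "killed by " + signal.Signals(signum).name
    except ValueError:
        # Not a valid signal number, so this was an ordinary non-zero exit.
        return "exit status %d" % rc
```

A result of "killed by SIGKILL" for a given command would be consistent with the system running out of memory, although other causes (for example an administrator or a job scheduler) can send the same signal.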

Hi, thanks for the quick response.

I have executed the bwa mem command on its own and it works fine without getting killed, so I think the code is being killed for some other reason. I added print statements to find where exactly the problem lies, and it is somewhere around here:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

reference_fa = pyfaidx.Fasta(cmd_args.virus_reference)
n_viruses = len(reference_fa.keys())

bam_workspaces = []
if cmd_args.fq:
    bam_workspace = "%s/bam_0/" % (cmd_args.workdir)
    if not os.path.exists(bam_workspace):
        os.makedirs(bam_workspace)
    bam_workspaces.append(bam_workspace)

max_read_len, max_is = \
    max_is_calc.get_max_is_from_fq(cmd_args.workdir, input_names[0], input_names[1], cmd_args.host_and_virus_reference, \
                                        cmd_args.bwa, cmd_args.threads)
with open("%s/stats.txt" % bam_workspace, "w") as stat_file:
    stat_file.write("max_is %d\n" % max_is)
config_file.write("read_len %d\n" % max_read_len)
config_file.close();
print("****1****")
isolate_cmd = "%s/isolate_relevant_pairs_fq %s %s %s %s %s %s " % \
              (SURVIRUS_PATH, input_names[0], input_names[1], cmd_args.host_reference,
               cmd_args.virus_reference, cmd_args.workdir, bam_workspace)
execute(isolate_cmd)
print("****2****")

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Also, why is the virus-side.fq file not generated? The run ends in a segmentation fault. I am using a vertebrate genome as my reference genome. Please see the output below:
+++++++++++++++++++++++++++log_file++++++++++++++++++++++++++++++++++++++++++++++
[main] Real time: 7.269 sec; CPU: 3.153 sec
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 31105 reads
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 31105 sequences (871654 bp)...
[M::mem_process_seqs] Processed 31105 reads in 0.982 CPU sec, 0.021 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 60 /home/saurav/survirus/virus_index/retroviral_11632.fasta /data_disk2/saurav/results/bam_0/host-clips.unmapped.fa
[main] Real time: 4.451 sec; CPU: 3.870 sec
Killed
/data_disk/saurav/reads/1.fq
[print *********]
.
.
.
.
Executing: bwa mem -t 60 -h 1124429 /home/saurav/survirus/virus_host_db/_host+virus.fa /data_disk2/saurav/resu[M::bwa_idx_load_from_disk] read 0 ALT contigs
[E::main_mem] fail to open file `/data_disk2/saurav/results/bam_0/virus-side.fq'.
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[E::main_mem] fail to open file `/data_disk2/saurav/results/bam_0/host-side.fq'.
Segmentation fault (core dumped)
[E::hts_open_format] Failed to open file "/data_disk2/saurav/results/host-side.cs.bam" : No such file or directory
terminate called after throwing an instance of 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >'
Aborted (core dumped)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
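
The log above shows later steps failing because virus-side.fq and host-side.fq cannot be opened in bam_0. A quick check for this, sketched in Python, is to list which expected files are missing or empty right after the isolation step. `missing_outputs` is a hypothetical helper, and the default file names are taken only from the error messages in this log, not from the SurVirus source.

```python
import os


def missing_outputs(workspace, names=("virus-side.fq", "host-side.fq")):
    """Return the expected files that are absent or empty in workspace.

    Hypothetical helper; the default names come from the log messages above.
    """
    missing = []
    for name in names:
        path = os.path.join(workspace, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```

For example, calling `missing_outputs("/data_disk2/saurav/results/bam_0")` between the two debug prints would show whether the step that is expected to write these files finished its work before being killed.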

If you need more information, I can provide it. It would be really helpful if this could be made to work for vertebrate genomes.
Thank you.

No, I don't think BWA is getting killed.
I am assuming it is one of the SurVirus executables. Could you show the whole output?
Also, are you separating std out and std err?
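
To keep the two streams apart, one option is to write standard output and standard error to separate files, so that the "Executing:" lines and the tool error messages no longer interleave. The following is a minimal Python sketch; `run_with_split_logs` and the log file names in the usage note are hypothetical and chosen only for illustration.

```python
import subprocess


def run_with_split_logs(cmd, out_path, err_path):
    """Run a shell command, writing stdout and stderr to separate files.

    Hypothetical helper for illustration; returns the command's exit code.
    """
    with open(out_path, "w") as out, open(err_path, "w") as err:
        return subprocess.run(cmd, shell=True, stdout=out, stderr=err).returncode
```

Calling it with the full SurVirus command line and, say, `survirus.out` and `survirus.err` as the two paths would produce one file with the executed commands and another with the error messages, which can then be posted in full.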