wdecoster / NanoPlot

Plotting scripts for long read sequencing data

Home Page: http://nanoplot.bioinf.be

NanoPlot crashed with latest version

prasundutta87 opened this issue

Hi @wdecoster,

Just pasting the crash log below for your reference. I concatenated the fastq.gz files from here: ftp://ftp.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002_NA24385_son/UCSC_Ultralong_OxfordNanopore_Promethion/GM24385_*.fastq.gz, as I am planning to benchmark my SV calling pipeline for ONT data. After concatenating them with `cat`, I ran NanoPlot and got this error. I am now running NanoPlot (v1.34.1) on the three individual files instead and will merge the BAM files after alignment, but I would still like to know how to deal with this error.

2021-09-17 23:37:53,319 NanoPlot 1.34.1 started with arguments Namespace(N50=False, alength=False, bam=None, barcoded=False, color='#4CB391', colormap='Greens', cram=None, downsample=None, dpi=100, drop_outliers=False, fasta=None, fastq=['/home/u027/pdutta/Benchmarking_SVs/raw_data/GM24385.fastq.gz'], fastq_minimal=None, fastq_rich=None, feather=None, font_scale=1, format='png', hide_stats=False, huge=False, info_in_report=False, listcolormaps=False, listcolors=False, loglength=True, maxlength=None, minlength=None, minqual=None, no_N50=False, no_supplementary=False, outdir='QC_before_filtering/GM24385', path='QC_before_filtering/GM24385/', percentqual=False, pickle=None, plots=['dot'], prefix='', raw=False, readtype='1D', runtime_until=None, store=False, summary=None, threads=4, title=None, tsv_stats=True, ubam=None, verbose=False)
2021-09-17 23:37:53,319 Python version is: 3.8.10 (default, May 19 2021, 18:05:58) [GCC 7.3.0]
2021-09-17 23:37:53,320 NanoPlot: valid output format png
2021-09-17 23:37:53,335 Nanoget: Starting to collect statistics from plain fastq file.
2021-09-17 23:37:53,336 Nanoget: Decompressing gzipped fastq /home/u027/pdutta/Benchmarking_SVs/raw_data/GM24385.fastq.gz
2021-09-18 01:33:46,420 Error -3 while decompressing data: invalid literal/length code
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/concurrent/futures/process.py", line 198, in _process_chunk
return [fn(*args) for args in chunk]
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/concurrent/futures/process.py", line 198, in
return [fn(*args) for args in chunk]
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/site-packages/nanoget/extraction_functions.py", line 321, in process_fastq_plain
data=[res for res in extract_from_fastq(inputfastq) if res],
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/site-packages/nanoget/extraction_functions.py", line 321, in
data=[res for res in extract_from_fastq(inputfastq) if res],
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/site-packages/nanoget/extraction_functions.py", line 331, in extract_from_fastq
for rec in SeqIO.parse(fq, "fastq"):
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/site-packages/Bio/SeqIO/Interfaces.py", line 73, in next
return next(self.records)
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/site-packages/Bio/SeqIO/QualityIO.py", line 1080, in iterate
for title_line, seq_string, quality_string in FastqGeneralIterator(handle):
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/site-packages/Bio/SeqIO/QualityIO.py", line 956, in FastqGeneralIterator
for line in handle:
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/gzip.py", line 305, in read1
return self._buffer.read1(size)
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/gzip.py", line 487, in read
uncompress = self._decompressor.decompress(buf, size)
zlib.error: Error -3 while decompressing data: invalid literal/length code
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/site-packages/nanoplot/NanoPlot.py", line 59, in main
datadf = get_input(
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/site-packages/nanoget/nanoget.py", line 92, in get_input
dfs=[out for out in executor.map(extraction_function, files)],
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/site-packages/nanoget/nanoget.py", line 92, in
dfs=[out for out in executor.map(extraction_function, files)],
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/concurrent/futures/process.py", line 484, in _chain_from_iterable_of_lists
for element in iterable:
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
yield fs.pop().result()
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/concurrent/futures/_base.py", line 444, in result
return self.__get_result()
File "/home/u027/project/software/conda/envs/SGP2/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
zlib.error: Error -3 while decompressing data: invalid literal/length code
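
For reference, the traceback ends in a zlib error raised while gzip-decompressing the merged file, so the failing step can be reproduced outside NanoPlot. A minimal sketch, using the path from the log above and the same Biopython parser that nanoget calls; if this also raises zlib.error, the file itself is unreadable regardless of NanoPlot:

```python
import gzip

from Bio import SeqIO  # Biopython, already present in NanoPlot's environment per the traceback

# Path taken from the NanoPlot log above.
path = "/home/u027/pdutta/Benchmarking_SVs/raw_data/GM24385.fastq.gz"

n = 0
with gzip.open(path, "rt") as fq:
    # Iterate the records the same way nanoget does; a corrupt archive raises
    # zlib.error partway through, independent of anything NanoPlot adds on top.
    for _ in SeqIO.parse(fq, "fastq"):
        n += 1
print(f"parsed {n} reads without a decompression error")
```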

Regards,
Prasun

Hi Prasun,

I think this suggests that your fastq file is corrupted.
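
If that is the case, stream-reading each downloaded file separately should show which one is damaged. A rough sketch, assuming the GM24385_*.fastq.gz downloads sit in the current directory (adjust the glob pattern as needed):

```python
import glob
import gzip
import zlib

# Hypothetical glob pattern: adjust to wherever the GIAB files were downloaded.
for path in sorted(glob.glob("GM24385_*.fastq.gz")):
    try:
        with gzip.open(path, "rb") as fh:
            # Stream through the whole compressed file; a truncated or damaged
            # download raises EOFError or zlib.error before reaching the end.
            while fh.read(1 << 20):
                pass
        print(f"{path}\tOK")
    except (EOFError, OSError, zlib.error) as err:
        print(f"{path}\tFAILED: {err}")
```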

Wouter

Hi @wdecoster ,

You may be right. The cat command somehow corrupts the fastq file during concatenation; I tried it twice and got the same error. I have now processed the files individually, planning to merge the BAM files downstream, and that worked perfectly.
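
For the downstream step, a possible sketch of the BAM merge using pysam's wrappers around the samtools subcommands (file names here are hypothetical, and plain `samtools merge` on the command line does the same):

```python
import pysam  # thin wrappers around the samtools subcommands

# Hypothetical file names; each input should be coordinate-sorted first.
inputs = ["GM24385_1.sorted.bam", "GM24385_2.sorted.bam", "GM24385_3.sorted.bam"]

pysam.merge("-f", "GM24385_merged.bam", *inputs)  # -f: overwrite the output if it exists
pysam.index("GM24385_merged.bam")
```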

Regards,
Prasun

Hmmm, cat usually works for me on fastq.gz files. Which command did you use exactly? How did you compress these files?

I did not compress the files; they were already compressed by GIAB. I just downloaded them with wget from the link shared above and used `cat *.fastq.gz > <merged.fastq.gz>`.
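
For what it's worth, running `cat` on gzip files produces a valid multi-member gzip stream, so an intact set of inputs should yield an intact merge; a truncated or partially downloaded input (e.g. an interrupted wget) seems a more likely explanation than cat itself. A rough cross-check, with hypothetical file names, is to compare the read counts of the inputs against the merged file:

```python
import glob
import gzip

def count_reads(path):
    # Assumes plain 4-line FASTQ records (no line wrapping), which holds for these
    # ONT basecalls; a damaged file will raise the same zlib/EOF error seen above.
    with gzip.open(path, "rt") as fh:
        return sum(1 for _ in fh) // 4

parts = sorted(glob.glob("GM24385_*.fastq.gz"))  # the downloaded inputs (hypothetical pattern)
total = sum(count_reads(p) for p in parts)
merged = count_reads("GM24385.fastq.gz")         # the cat output (hypothetical name)
print(f"sum of inputs: {total}  merged: {merged}  match: {total == merged}")
```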