luntergroup / octopus

Bayesian haplotype-based mutation calling


Forest file is still not available for downloading: "std::bad_alloc" error

hfl112 opened this issue · comments

octopus v0.7.4
system: HPC run with 4 CPUs and 16 GB memory

source /home/hef/Tools/miniconda3/etc/profile.d/conda.sh
conda activate octopus
octopus -R Homo_sapiens_assembly38.fasta \
  -I 466_WES.dedupped.realigned.recal.bam \
  -C cancer \
  --min-mapping-quality 10 \
  --sequence-error-model PCR-FREE.HISEQ-4000 \
  --forest germline.v0.7.4.forest.gz \
  --somatic-forest somatic.v0.7.4.forest.gz \
  --threads 4 \
  -o octopus.2.vcf
Error:
[2023-10-19 21:51:21] chr6_GL000252v2_alt:4604811 97.4% 8h 12m 13m 28s
[2023-10-19 21:51:21] chr6_GL000254v2_alt:4827813 97.5% 8h 12m 12m 25s
[2023-10-19 21:51:22] chr6_GL000255v2_alt:4606388 97.7% 8h 12m 11m 54s
[2023-10-19 21:51:22] chr6_GL000256v2_alt:4929269 97.8% 8h 12m 10m 51s
[2023-10-19 21:59:38] - 100% 8h 20m -
[2023-10-19 21:59:51] Starting Call Set Refinement (CSR) filtering
[2023-10-19 21:59:53] Removed 6209 temporary files
[2023-10-19 21:59:54] A program error has occurred:
[2023-10-19 21:59:54]
[2023-10-19 21:59:54] Encountered an exception during calling 'std::bad_alloc'. This means
[2023-10-19 21:59:54] there is a bug and your results are untrustworthy.
[2023-10-19 21:59:54]
[2023-10-19 21:59:54] To help resolve this error run in debug mode and send the log file to
[2023-10-19 21:59:54] https://github.com/luntergroup/octopus/issues.
[2023-10-19 21:59:54] ------------------------------------------------------------------------
srun: error: c008: task 0: Exited with exit code 1
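The log asks for a debug-mode rerun; a minimal sketch of what that would look like (I'm assuming the --debug option listed by octopus --help, so treat the exact flag as an assumption rather than something confirmed in this thread):

# --debug is an assumption: it should write a verbose log that can be attached to this issue
octopus -R Homo_sapiens_assembly38.fasta \
  -I 466_WES.dedupped.realigned.recal.bam \
  -C cancer \
  --forest germline.v0.7.4.forest.gz \
  --somatic-forest somatic.v0.7.4.forest.gz \
  --threads 4 \
  --debug \
  -o octopus.debug.vcf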

I tried to gunzip the forest.gz files (as suggested here: #163), but that is not allowed.
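For what it's worth, a quick way to tell whether a downloaded forest file is a real gzip archive or just a placeholder (a sketch using standard tools; the path assumes the file is in the working directory):

file germline.v0.7.4.forest.gz      # should say "gzip compressed data", not "ASCII text"
gzip -t germline.v0.7.4.forest.gz   # exits non-zero if the file is not valid gzip
head -c 120 germline.v0.7.4.forest.gz   # a Git LFS pointer starts with "version https://git-lfs.github.com/spec/v1"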

Any suggestions for this error?

I guess it's because of the somatic/germline forest .gz files; if I remove those parameters, it works.

I think there is something wrong with the gz files:

Downloading resources/forests/germline.v0.7.4.forest.gz (374 MB)
Error downloading object: resources/forests/germline.v0.7.4.forest.gz (926866d): Smudge error: Error downloading resources/forests/germline.v0.7.4.forest.gz (926866d922430204e527e685559373066801e1b288835480a44b0bc07ae8fe3d): batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.
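For context, the forest files are tracked with Git LFS, so the checkout only contains a small pointer until the LFS object itself is downloaded. A sketch of a clone that skips the LFS download (standard git-lfs behaviour; it avoids the smudge error above, but the forest files stay as pointers until the repository's bandwidth quota is restored):

# clone without attempting to download LFS objects
GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/luntergroup/octopus.git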


Hi
Any luck with getting the forest files? I see a similar issue with my data. Without the forest files it works, but when I use the forest files I get the same error.


not yet…

The forest files in the repo are not valid gzip files; they are just Git LFS pointer placeholders:

$ git clone https://github.com/luntergroup/octopus.git
Cloning into 'octopus'...
remote: Enumerating objects: 48448, done.
remote: Counting objects: 100% (3958/3958), done.
remote: Compressing objects: 100% (1421/1421), done.
remote: Total 48448 (delta 2403), reused 3653 (delta 2196), pack-reused 44490
Receiving objects: 100% (48448/48448), 139.82 MiB | 25.65 MiB/s, done.
Resolving deltas: 100% (37584/37584), done.
[00:12:05 ~]$ file octopus/resources/forests/germline.v0.7.4.forest.gz
octopus/resources/forests/germline.v0.7.4.forest.gz: ASCII text
[00:12:18 ~]$ cat octopus/resources/forests/germline.v0.7.4.forest.gz
version https://git-lfs.github.com/spec/v1
oid sha256:926866d922430204e527e685559373066801e1b288835480a44b0bc07ae8fe3d
size 373801411
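Once the repository's LFS bandwidth quota is restored, the real objects should be fetchable with git-lfs (a sketch using standard git-lfs commands, run from inside the clone):

git lfs install
git lfs pull --include="resources/forests/*"
# verify: this should now report gzip data rather than ASCII text
file resources/forests/germline.v0.7.4.forest.gz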

Please see #259 (comment). It could still be a problem because of the data quota.

Indeed, the data quota error message is what I'm seeing. Is there something wrong with using Zenodo? Or could someone mirror the files somewhere else temporarily?