Download of required starts but doesn't complete
wudustan opened this issue
I've been using the nf-core rnafusion workflow and the standalone fusion_report download --cosmic_usr '<username>' --cosmic_passwd '<password>' /path/to/db/ command.
In each case the download starts, then hangs around "Downloading FusionGDB2_id.xlsx" and does not proceed from there. No error messages are returned; the process just eventually times out when run on an HPC with time limits on processes.
Tested with version fusion-report 2.1.5
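Since the hang produces no error message, it may help to wrap the command in an explicit time limit so it fails fast instead of using up the HPC allocation. The following is a minimal sketch, not part of fusion-report: `run_with_timeout` is a hypothetical helper name, and the 600-second limit in the commented example is an arbitrary assumption.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it ran longer than timeout_s seconds."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired once the limit is hit.
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


# Example call (credentials and path are placeholders, as in the report above;
# the 600 s limit is an assumption):
# run_with_timeout(
#     ["fusion_report", "download",
#      "--cosmic_usr", "<username>", "--cosmic_passwd", "<password>",
#      "/path/to/db/"],
#     600,
# )
```

A return value of `None` then distinguishes "hung until the limit" from a normal non-zero exit code. Worker processes started by the command itself may need separate cleanup.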
Hi @wudustan, can you check if you can download the file on the server manually?
wget https://compbio.uth.edu/FusionGDB2/tables/FusionGDB2_id.xlsx -O FusionGDB2_id.xlsx
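The same manual check can be sketched in Python with an explicit timeout, so that a stalled server raises an exception rather than hanging silently. This is only an illustration using the standard library; `fetch` is a hypothetical helper and the 60-second default is an assumption, not what fusion-report does internally.

```python
import shutil
import urllib.request


def fetch(url, dest, timeout_s=60):
    """Download url to dest; a stalled connection raises instead of blocking forever."""
    # The timeout applies to each blocking socket operation, not to the whole transfer.
    with urllib.request.urlopen(url, timeout=timeout_s) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Example call with the URL from the comment above:
# fetch("https://compbio.uth.edu/FusionGDB2/tables/FusionGDB2_id.xlsx", "FusionGDB2_id.xlsx")
```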
wget https://compbio.uth.edu/FusionGDB2/tables/FusionGDB2_id.xlsx -O FusionGDB2_id.xlsx
--2022-06-09 11:05:03-- https://compbio.uth.edu/FusionGDB2/tables/FusionGDB2_id.xlsx
Resolving compbio.uth.edu (compbio.uth.edu)... 129.106.32.59
Connecting to compbio.uth.edu (compbio.uth.edu)|129.106.32.59|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4796829 (4.6M) [application/vnd.openxmlformats-officedocument.spreadsheetml.sheet]
Saving to: 'FusionGDB2_id.xlsx'
FusionGDB2_id.xlsx 100%[=====================================================================================================================>] 4.57M 578KB/s in 9.3s
2022-06-09 11:05:16 (502 KB/s) - 'FusionGDB2_id.xlsx' saved [4796829/4796829]
Yep, seems like it works OK. It also downloads fine as far as I can tell during the full function call, but then doesn't move on from there.
ls
AUTHREF.JSON.SCHEMA CYTOCONVERTED_LOG.TXT.DATA KBIT.TXT.DATA MCABNORM.JSON.SCHEMA RECABNUM.TXT.DATA fusiongdb.db
AUTHREF.TXT.DATA CYTOGEN.JSON.SCHEMA KBREAK.JSON.SCHEMA MCABNORM.TXT.DATA REF.JSON.SCHEMA fusiongdb2.db
CYTINV.JSON.SCHEMA CYTOGEN.TXT.DATA KBREAK.TXT.DATA MCBREAK.JSON.SCHEMA REF.TXT.DATA mitelman.db
CYTINV.TXT.DATA CYTOVAL.JSON.SCHEMA KCLONE.JSON.SCHEMA MCBREAK.TXT.DATA TCGA_ChiTaRS_combined_fusion_ORF_analyzed_gencode_h19v19.txt mitelman_db.zip
CYTOBANDS_HG38.JSON.SCHEMA CYTOVAL.TXT.DATA KCLONE.TXT.DATA MCGENE.JSON.SCHEMA TCGA_ChiTaRS_combined_fusion_information_on_hg19.txt uniprot_gsymbol.txt
CYTOBANDS_HG38.TXT.DATA FusionGDB2_id.xlsx KODER.JSON.SCHEMA MCGENE.TXT.DATA fgene_disease_associations.txt
CYTOCONVERTED.JSON.SCHEMA KABNORM.JSON.SCHEMA KODER.TXT.DATA RECAB.JSON.SCHEMA fusionGDB2.csv
CYTOCONVERTED.TXT.DATA KABNORM.TXT.DATA MBCA.JSON.SCHEMA RECAB.TXT.DATA fusion_ppi.txt
CYTOCONVERTED_LOG.JSON.SCHEMA KBIT.JSON.SCHEMA MBCA.TXT.DATA RECABNUM.JSON.SCHEMA fusion_uniprot_related_drugs.txt
ftp://ftp1.nci.nih.gov/pub/CGAP/mitelman.tar.gz
currently 404s also
Seems like you are using an older version of fusion-report, @wudustan. Update to the latest one: the Mitelman database has been moved to Google storage, which is why you have to use the latest release.
Hi, thanks for the heads up. I've updated using python3 setup.py install
I'm now getting a new error:
(fusion_report) [user] Fusion_Report % fusion_report download --cosmic_usr "usr" --cosmic_passwd "passwd" .
Downloading resources...
Downloading mitelman_db.zip
Traceback (most recent call last):
File "[user]miniconda3/envs/fusion_report/bin/fusion_report", line 4, in <module>
__import__('pkg_resources').run_script('fusion-report==2.1.5', 'fusion_report')
File "[user]miniconda3/envs/fusion_report/lib/python3.8/site-packages/pkg_resources/__init__.py", line 662, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "[user]miniconda3/envs/fusion_report/lib/python3.8/site-packages/pkg_resources/__init__.py", line 1459, in run_script
exec(code, namespace, namespace)
File "[user]miniconda3/envs/fusion_report/lib/python3.8/site-packages/fusion_report-2.1.5-py3.8.egg/EGG-INFO/scripts/fusion_report", line 13, in <module>
app.run()
File "[user]miniconda3/envs/fusion_report/lib/python3.8/site-packages/fusion_report-2.1.5-py3.8.egg/fusion_report/app.py", line 71, in run
Download(params)
File "[user]miniconda3/envs/fusion_report/lib/python3.8/site-packages/fusion_report-2.1.5-py3.8.egg/fusion_report/download.py", line 22, in __init__
self.download_all(params)
File "[user]miniconda3/envs/fusion_report/lib/python3.8/site-packages/fusion_report-2.1.5-py3.8.egg/fusion_report/download.py", line 41, in download_all
Net.get_fusiongdb(self, return_err)
File "[user]miniconda3/envs/fusion_report/lib/python3.8/site-packages/fusion_report-2.1.5-py3.8.egg/fusion_report/common/net.py", line 106, in get_fusiongdb
pool = Pool(Settings.THREAD_NUM)
File "[user]miniconda3/envs/fusion_report/lib/python3.8/multiprocessing/context.py", line 119, in Pool
return Pool(processes, initializer, initargs, maxtasksperchild,
File "[user]miniconda3/envs/fusion_report/lib/python3.8/multiprocessing/pool.py", line 212, in __init__
self._repopulate_pool()
File "[user]miniconda3/envs/fusion_report/lib/python3.8/multiprocessing/pool.py", line 303, in _repopulate_pool
return self._repopulate_pool_static(self._ctx, self.Process,
File "[user]miniconda3/envs/fusion_report/lib/python3.8/multiprocessing/pool.py", line 326, in _repopulate_pool_static
w.start()
File "[user]miniconda3/envs/fusion_report/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "[user]miniconda3/envs/fusion_report/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "[user]miniconda3/envs/fusion_report/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "[user]miniconda3/envs/fusion_report/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "[user]miniconda3/envs/fusion_report/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch
prep_data = spawn.get_preparation_data(process_obj._name)
File "[user]miniconda3/envs/fusion_report/lib/python3.8/multiprocessing/spawn.py", line 183, in get_preparation_data
main_mod_name = getattr(main_module.__spec__, "name", None)
AttributeError: module '__main__' has no attribute '__spec__'
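One possible reading of this traceback, stated as an assumption rather than a confirmed diagnosis: the `spawn` start method (note `popen_spawn_posix.py` above) reads `__main__.__spec__` in `spawn.get_preparation_data`, and the egg-style launcher (`pkg_resources.run_script`) appears to execute the script in a `__main__` namespace where that attribute is missing. Under that assumption, a workaround could be sketched as below; `ensure_main_spec` is a hypothetical helper, not part of fusion-report, and it would have to run before the `Pool` is created.

```python
import sys


def ensure_main_spec():
    """Give __main__ a __spec__ attribute if the launcher left it without one.

    Returns True if the attribute had to be added, False if it was already present.
    """
    main_module = sys.modules["__main__"]
    if not hasattr(main_module, "__spec__"):
        # None is the value a plainly executed script has, so spawn's
        # getattr(main_module.__spec__, "name", None) falls back to None.
        main_module.__spec__ = None
        return True
    return False
```

Installing with pip instead of `python3 setup.py install` may sidestep the issue as well, since pip generates a different launcher script; this is likewise untested here.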
Something doesn't add up here. Could you try installing the latest dev branch instead?
I've managed to get this to work on a local machine with a Docker instance instead of installing the required software from conda etc.
Anyone else having this issue on HPC: try pulling the fusion-report references via Docker on your local machine first.
OK to close issue.