Some TensorFlow warning/error messages when running Helixer via Singularity

Question

spoonbender76 opened this issue 4 months ago

Hi, I'm running Helixer v0.3.3 via Singularity v4.0.3:
singularity pull docker://gglyptodon/helixer-docker:helixer_v0.3.3_cuda_11.8.0-cudnn8
singularity run --nv helixer-docker_helixer_v0.3.3_cuda_11.8.0-cudnn8.sif Helixer.py --fasta-path Nm.softmasked.fa --lineage invertebrate --gff-output-path Nm_helixer.gff3 --batch-size 8

Can I safely ignore these TensorFlow warnings/error messages, or might they affect performance/results?

setting self.n_seqs to 4932, bc that is len of data/X
2024-04-11 11:40:25.627235: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:432] Loaded cuDNN version 8906
2024-04-11 11:40:28.342978: I tensorflow/tsl/platform/default/subprocess.cc:304] Start cannot spawn child process: Permission denied
2024-04-11 11:40:28.344534: I tensorflow/tsl/platform/default/subprocess.cc:304] Start cannot spawn child process: Permission denied
2024-04-11 11:40:28.344551: W tensorflow/compiler/xla/stream_executor/gpu/asm_compiler.cc:109] Couldn't get ptxas version : FAILED_PRECONDITION: Couldn't get ptxas/nvlink version string: INTERNAL: Couldn't invoke ptxas --version
2024-04-11 11:40:28.346127: I tensorflow/tsl/platform/default/subprocess.cc:304] Start cannot spawn child process: Permission denied
2024-04-11 11:40:28.346192: W tensorflow/compiler/xla/stream_executor/gpu/redzone_allocator.cc:318] INTERNAL: Failed to launch ptxas
Relying on driver to perform ptx compilation. 
Modify $PATH to customize ptxas location.
This message will be only logged once.

Will Haese-Hill · Answer 1 · Fri Apr 19 2024 22:55:57 GMT+0800 (China Standard Time)

I'm encountering similar messages when training a model with HybridModel.py using Apptainer v1.1.8 (the rebranded Singularity) and the latest Docker container (helixer-docker:helixer_v0.3.3_cuda_11.8.0-cudnn8). I'm now worried that my model training is running sub-optimally (i.e. slowly), so I would appreciate a response.

Jiangjiangzhang6 · Answer 2 · Tue May 21 2024 08:41:31 GMT+0800 (China Standard Time)

I ran into the same error. I don't have root access on the machine; I can only run Helixer through Singularity.

Alisandra Denton · Answer 3 · Sun Jun 02 2024 15:15:15 GMT+0800 (China Standard Time)

Hi, thanks for raising this. We will check these errors more closely for the next release. I strongly suspect you can ignore them.
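
The log above says "Modify $PATH to customize ptxas location", so one quick thing to look at is whether an executable named ptxas is visible on PATH from inside the container. The following is a minimal sketch using only the Python standard library; the helper name find_ptxas is hypothetical and not part of Helixer or TensorFlow. It does not diagnose the cause of the "Permission denied" messages, it only reports what PATH resolves to.

```python
import shutil


def find_ptxas(path=None):
    """Return the full path of an executable `ptxas` found on PATH, or None.

    `path` is an optional PATH-style string; when omitted, the PATH of the
    current environment is searched. This helper is illustrative only.
    """
    return shutil.which("ptxas", path=path)


if __name__ == "__main__":
    location = find_ptxas()
    if location is None:
        # TensorFlow reports that it falls back to the driver for PTX compilation.
        print("ptxas not found as an executable on PATH")
    else:
        print(f"ptxas found at: {location}")
```

To be meaningful, the script would need to run with the container's Python (for example through singularity exec with the same image), since PATH on the host can differ from PATH inside the container.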

Helixer should process on the order of 100 Mbp of genome per 30 min (or faster, depending on hardware, batch size, and gene density). If your run is much slower than that, please let us know; that would be unexpectedly slow and might mean it is running on the CPU instead of the GPU.
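
To compare a run against that rough figure, the arithmetic can be sketched as below. This is only an illustration: the function names, the example genome size and runtime, and the slowdown_factor threshold for "much slower" are assumptions made here, not values from Helixer.

```python
# Reference rate quoted in this thread: roughly 100 Mbp of genome per 30 minutes.
REFERENCE_MBP_PER_30_MIN = 100.0


def mbp_per_30_min(genome_bp, elapsed_seconds):
    """Return the observed throughput in Mbp per 30 minutes."""
    if genome_bp <= 0 or elapsed_seconds <= 0:
        raise ValueError("genome_bp and elapsed_seconds must be positive")
    mbp = genome_bp / 1_000_000
    return mbp * 1800.0 / elapsed_seconds


def looks_unexpectedly_slow(genome_bp, elapsed_seconds, slowdown_factor=10.0):
    """Flag runs far below the reference rate.

    slowdown_factor is an arbitrary illustrative threshold (assumption):
    a run counts as slow when it is more than this many times below
    the reference rate.
    """
    rate = mbp_per_30_min(genome_bp, elapsed_seconds)
    return rate < REFERENCE_MBP_PER_30_MIN / slowdown_factor


if __name__ == "__main__":
    # Hypothetical example: a 250 Mbp genome that took one hour.
    rate = mbp_per_30_min(250_000_000, 3600)
    print(f"{rate:.1f} Mbp per 30 min")
    print("unexpectedly slow:", looks_unexpectedly_slow(250_000_000, 3600))
```

With the hypothetical numbers above, the rate works out to 125 Mbp per 30 min, which is in line with the quoted figure. A rate far below it would be worth reporting, along with the hardware and batch size used.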