Division by zero if AllocCPUs is 0
tardigradus opened this issue
I seem to have managed to cancel a job before the CPU allocation had taken place, which left AllocCPUS equal to zero and thus led to a ZeroDivisionError:
$ reportseff --since 2023-10-16
Error processing entry: {'AdminComment': '', 'AllocCPUS': '0', 'Elapsed': '00:00:00', 'JobID': '15053872_1491', 'JobIDRaw': '15055386', 'MaxRSS': '', 'NNodes': '1', 'REQMEM': '4G', 'State': 'CANCELLED by 324062', 'Timelimit': '04:00:00', 'TotalCPU': '00:00:00'}
Traceback (most recent call last):
File "/home/loris/.local/bin/reportseff", line 8, in <module>
sys.exit(main())
File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/click/core.py", line 1128, in __call__
return self.main(*args, **kwargs)
File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/console.py", line 97, in main
output, entries = get_jobs(args)
File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/console.py", line 149, in get_jobs
raise error
File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/console.py", line 146, in get_jobs
job_collection.process_entry(entry, add_job=add_jobs)
File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/job_collection.py", line 175, in process_entry
self.jobs[job_id].update(entry)
File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/job.py", line 110, in update
self._update_main_job(entry)
File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/job.py", line 147, in _update_main_job
if "TotalCPU" in entry and "AllocCPUS" in entry
ZeroDivisionError: division by zero
Should be a quick fix, I'll try to get to it in a day or two.
Thanks for the report!
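For anyone stuck on an older release, the failure mode and a possible guard can be sketched roughly as below. This is an illustration only, not reportseff's actual code: the helper names parse_slurm_time and cpu_efficiency are hypothetical, and the formula (TotalCPU divided by AllocCPUS times Elapsed) is an assumption about how CPU efficiency is computed.

```python
from typing import Optional


def parse_slurm_time(value: str) -> int:
    """Convert a Slurm duration like '1-02:03:04' or '02:03:04' to whole seconds.

    Hypothetical helper for this sketch; not taken from reportseff.
    """
    days = 0
    if "-" in value:
        day_part, value = value.split("-", 1)
        days = int(day_part)
    parts = [float(p) for p in value.split(":")]
    # Pad missing leading fields so 'MM:SS' is read as '00:MM:SS'.
    while len(parts) < 3:
        parts.insert(0, 0.0)
    hours, minutes, seconds = parts
    return int(days * 86400 + hours * 3600 + minutes * 60 + seconds)


def cpu_efficiency(entry: dict) -> Optional[float]:
    """Return CPU efficiency in percent, or None when it is undefined.

    Assumed formula: TotalCPU / (AllocCPUS * Elapsed).
    """
    alloc_cpus = int(entry.get("AllocCPUS") or 0)
    elapsed = parse_slurm_time(entry.get("Elapsed") or "00:00:00")
    total_cpu = parse_slurm_time(entry.get("TotalCPU") or "00:00:00")

    denominator = alloc_cpus * elapsed
    if denominator == 0:
        # Job was cancelled before CPUs were allocated or before it ran:
        # report "no value" instead of dividing by zero.
        return None
    return round(100 * total_cpu / denominator, 1)


# The entry from the report above (trimmed to the relevant fields).
cancelled = {"AllocCPUS": "0", "Elapsed": "00:00:00", "TotalCPU": "00:00:00"}
print(cpu_efficiency(cancelled))
```

Returning None rather than 0 keeps "job never ran" distinguishable from "job ran but used no CPU", which a caller can then render as a blank or a placeholder.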
Actually it seems like this was fixed in 2.7.6. Can you update and try again?
OK, something seems messed up at my end:
$ pipx upgrade reportseff
reportseff is already at latest version 2.3.1 (location: /home/loris/.local/pipx/venvs/reportseff)
I'll try to find out why pipx is so reluctant to install the actual latest version.
EDIT: The default Python 3.6.8 on CentOS 7.9 seems to cause a problem, as does pipx. With Python 3.10.8 and pip, I could install 2.7.6 and can confirm that the division-by-zero issue has indeed been fixed.
It's because pipx is only finding Python 3.6. Version 2.3 was the last with Python 3.6 support. You can upgrade your base Python or, if you use pyenv, you can specify which Python to use. https://stackoverflow.com/a/69828751/12373791
I'll close this for now, but reopen if you have another issue.