troycomi / reportseff

Tabular seff

Division by zero if AllocCPUs is 0

tardigradus opened this issue

I seem to have managed to cancel a job before the CPU allocation had taken place, which left AllocCPUS equal to zero and led to a ZeroDivisionError:

$ reportseff --since 2023-10-16
Error processing entry: {'AdminComment': '', 'AllocCPUS': '0', 'Elapsed': '00:00:00', 'JobID': '15053872_1491', 'JobIDRaw': '15055386', 'MaxRSS': '', 'NNodes': '1', 'REQMEM': '4G', 'State': 'CANCELLED by 324062', 'Timelimit': '04:00:00', 'TotalCPU': '00:00:00'}
Traceback (most recent call last):
  File "/home/loris/.local/bin/reportseff", line 8, in <module>
    sys.exit(main())
  File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/console.py", line 97, in main
    output, entries = get_jobs(args)
  File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/console.py", line 149, in get_jobs
    raise error
  File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/console.py", line 146, in get_jobs
    job_collection.process_entry(entry, add_job=add_jobs)
  File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/job_collection.py", line 175, in process_entry
    self.jobs[job_id].update(entry)
  File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/job.py", line 110, in update
    self._update_main_job(entry)
  File "/home/loris/.local/pipx/venvs/reportseff/lib64/python3.6/site-packages/reportseff/job.py", line 147, in _update_main_job
    if "TotalCPU" in entry and "AllocCPUS" in entry
ZeroDivisionError: division by zero
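
For anyone hitting the same traceback, the failure mode can be sketched roughly as follows. This is not reportseff's actual code: `parse_seconds` and `cpu_efficiency` are hypothetical helpers, and the efficiency formula (TotalCPU divided by AllocCPUS times Elapsed) is an assumption based on the fields in the entry above. The point is only that a job cancelled before allocation has AllocCPUS and Elapsed both at zero, so the divisor needs a guard.

```python
from typing import Optional


def parse_seconds(value: str) -> float:
    """Convert a '[D-]HH:MM:SS' string to seconds (hypothetical helper)."""
    days = 0
    if "-" in value:
        day_part, value = value.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = value.split(":")
    return days * 86400 + int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def cpu_efficiency(entry: dict) -> Optional[float]:
    """Return CPU efficiency in percent, or None when it is undefined.

    Assumed formula: TotalCPU / (AllocCPUS * Elapsed) * 100.
    """
    required = ("TotalCPU", "AllocCPUS", "Elapsed")
    if any(key not in entry for key in required):
        return None

    alloc_cpus = int(entry["AllocCPUS"])
    elapsed = parse_seconds(entry["Elapsed"])

    # A job cancelled before allocation reports zero CPUs and zero elapsed
    # time, so the divisor would be zero. Report "no value" instead.
    if alloc_cpus == 0 or elapsed == 0:
        return None

    total_cpu = parse_seconds(entry["TotalCPU"])
    return round(total_cpu / (alloc_cpus * elapsed) * 100, 1)


# Fields copied from the failing entry in this report.
cancelled = {"AllocCPUS": "0", "Elapsed": "00:00:00", "TotalCPU": "00:00:00"}
print(cpu_efficiency(cancelled))
```

Returning None rather than 0 keeps "never ran" distinguishable from "ran but used no CPU"; how the value is displayed is a separate choice.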

Should be a quick fix, I'll try to get to it in a day or two.

Thanks for the report!

Actually it seems like this was fixed in 2.7.6. Can you update and try again?

OK, something seems messed up at my end:

$ pipx upgrade reportseff
reportseff is already at latest version 2.3.1 (location: /home/loris/.local/pipx/venvs/reportseff)

I'll try to find out why pipx is so reluctant to install the actual latest version.

EDIT: The default Python 3.6.8 on CentOS 7.9 seems to cause a problem, as does pipx. With Python 3.10.8 and pip, I could install 2.7.6 and can confirm that the division-by-zero issue has indeed been fixed.

That's because pipx is only finding Python 3.6, and version 2.3 was the last release with Python 3.6 support. You can upgrade your base Python or, if you use pyenv, specify which Python pipx should use: https://stackoverflow.com/a/69828751/12373791

I'll close this for now, but reopen if you have another issue.