Parsl / parsl

Parsl - a Python parallel scripting library

Home Page: http://parsl-project.org

monitoring_filesystem_radio log fills rapidly when using MonitoringHub with HTEX executor

raymondEhlers opened this issue

Describe the bug
When I enable MonitoringHub to monitor tasks, it appears to start the filesystem radio receiver, which then logs a message every second saying that its loop has started. The repeated log message comes from this loop:

while True:  # this loop will end on process termination
    logger.info("Start filesystem radio receiver loop")
    # iterate over files in new_dir
    for filename in os.listdir(new_dir):
        try:
            logger.info(f"Processing filesystem radio file {filename}")
            full_path_filename = f"{new_dir}/{filename}"
            with open(full_path_filename, "rb") as f:
                message = deserialize(f.read())
            logger.info(f"Message received is: {message}")
            assert isinstance(message, tuple)
            q.put(cast(AddressedMonitoringMessage, message))
            os.remove(full_path_filename)
        except Exception:
            logger.exception(f"Exception processing {filename} - probably will be retried next iteration")
    time.sleep(1)  # whats a good time for this poll?

which seems to be called unconditionally when the MonitoringHub is started:

self.filesystem_proc = Process(target=filesystem_receiver,
                               args=(self.logdir, self.resource_msgs, run_dir),
                               name="Monitoring-Filesystem-Process",
                               daemon=True
                               )
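
To illustrate the kind of change that would stop the log from growing while idle, here is a minimal sketch of a single polling pass that keeps the per-iteration message at DEBUG level and only logs at INFO when something was actually processed. This is not Parsl's code: `poll_once` and its `handle` callback are hypothetical names, and `pickle.loads` is only a stand-in for Parsl's `deserialize`.

```python
import logging
import os
import pickle

logger = logging.getLogger("filesystem_radio_sketch")


def poll_once(new_dir, handle):
    """Process every file currently in new_dir; return how many were handled.

    Hypothetical helper: `handle` is called with each deserialized message.
    """
    filenames = os.listdir(new_dir)
    # Per-iteration chatter is DEBUG, so an idle loop writes nothing at INFO.
    logger.debug("Polling %s: %d file(s) found", new_dir, len(filenames))
    processed = 0
    for filename in filenames:
        full_path = os.path.join(new_dir, filename)
        try:
            with open(full_path, "rb") as f:
                message = pickle.loads(f.read())  # stand-in for deserialize()
            if not isinstance(message, tuple):
                raise TypeError(f"expected tuple, got {type(message).__name__}")
            handle(message)
            os.remove(full_path)
            processed += 1
        except Exception:
            # The file is left in place, so it is retried on the next pass.
            logger.exception("Exception processing %s - will be retried next poll", filename)
    if processed:
        logger.info("Processed %d filesystem radio message(s)", processed)
    return processed
```

A receiver would call `poll_once(new_dir, q.put)` inside its `while True` loop and sleep between passes, so the log grows with the number of messages rather than with elapsed time.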

To Reproduce
Steps to reproduce the behavior:

  1. Set up Parsl 2023.07.24 with Python 3.11 on a cluster
  2. Run a test script with the HTEX executor and MonitoringHub enabled
  3. Wait a few minutes
  4. Check the size of runinfo/__run_number__/monitoring_filesystem_radio.log, and see it increase every second
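
The size check in the last step can be scripted. This is a minimal sketch using only the standard library; `log_growth` is a hypothetical helper name, and the path passed to it should be the real log file for your run (with `__run_number__` replaced by the actual run directory).

```python
import os
import time


def log_growth(path, interval=5.0):
    """Return how many bytes `path` grew by over `interval` seconds."""
    before = os.path.getsize(path)
    time.sleep(interval)
    return os.path.getsize(path) - before
```

For example, `log_growth("runinfo/__run_number__/monitoring_filesystem_radio.log", interval=10)` should return a positive number while the workflow is running if the receiver is logging on every iteration.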

It's not a huge problem, especially if runs are short, but for long runs it means that I have to manually clean up a log full of a message that doesn't seem to be critical.

Expected behavior
Remove the log message. More generally, I'm not sure whether the filesystem receiver is supposed to be started unconditionally: if I follow correctly, the HTEX executor uses the HTEX radio, not the filesystem radio. I'm not very familiar with the logging system in Parsl, though. If you plan to just remove the message, I can open a trivial PR if that would help.

Thanks!

Environment

  • OS: SL 7
  • Python version: 3.11
  • Parsl version: 2023.07.24

Distributed Environment

  • Where are you running the Parsl script from? [e.g. Laptop/Workstation, Login node, Compute node]
  • Where do you need the workers to run? [e.g. Same as Parsl script, Compute nodes, Cloud nodes]