error in parallel documentation
jkitchin opened this issue · comments
This example does not work as written:
parsl/docs/userguide/workflow.rst
Line 142 in 11509cb
I think it should be either:
with ParslPoolExecutor(config) as exec:
    outputs = exec.map(generate, range(1, 5))

or

with ParslPoolExecutor(config) as pool:
    outputs = pool.map(generate, range(1, 5))
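For context, ParslPoolExecutor follows the same context-manager/map pattern as the standard library's concurrent.futures executors. A runnable stdlib sketch of the corrected shape (the generate function here is a simple stand-in for real work, not the docs' app):

```python
from concurrent.futures import ThreadPoolExecutor

def generate(limit):
    """Stand-in for real work: return the square of limit."""
    return limit * limit

# Same with/.map pattern that ParslPoolExecutor exposes:
with ThreadPoolExecutor(max_workers=2) as pool:
    outputs = list(pool.map(generate, range(1, 5)))

print(outputs)  # [1, 4, 9, 16]
```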
The example also does not work for me on a Mac unless I change the import to from parsl.configs.local_threads import config. With the given import, from parsl.configs.htex_local import config, I get this error:
3 from parsl.configs.htex_local import config
----> 5 with ParslPoolExecutor(config) as exec:
6 outputs = exec.map(generate, range(1, 5))
8 for result in outputs:
File ~/anaconda3/lib/python3.10/site-packages/parsl/concurrent/__init__.py:31, in ParslPoolExecutor.__init__(self, config)
25 """Create the executor
26
27 Args:
28 config: Configuration for the Parsl Data Flow Kernel (DFK)
29 """
30 self._config = config
---> 31 self.dfk = DataFlowKernel(self._config)
32 self._app_cache: Dict[Callable, PythonApp] = {}
File ~/anaconda3/lib/python3.10/site-packages/typeguard/__init__.py:1033, in typechecked.<locals>.wrapper(*args, **kwargs)
1031 memo = _CallMemo(python_func, _localns, args=args, kwargs=kwargs)
1032 check_argument_types(memo)
-> 1033 retval = func(*args, **kwargs)
1034 try:
1035 check_return_type(retval, memo)
File ~/anaconda3/lib/python3.10/site-packages/parsl/dataflow/dflow.py:185, in DataFlowKernel.__init__(self, config)
183 self.data_manager = DataManager(self)
184 parsl_internal_executor = ThreadPoolExecutor(max_threads=config.internal_tasks_max_threads, label='_parsl_internal')
--> 185 self.add_executors(config.executors)
186 self.add_executors([parsl_internal_executor])
188 if self.checkpoint_mode == "periodic":
File ~/anaconda3/lib/python3.10/site-packages/parsl/dataflow/dflow.py:1122, in DataFlowKernel.add_executors(self, executors)
1119 self._create_remote_dirs_over_channel(executor.provider, executor.provider.channel)
1121 self.executors[executor.label] = executor
-> 1122 block_ids = executor.start()
1123 if self.monitoring and block_ids:
1124 new_status = {}
File ~/anaconda3/lib/python3.10/site-packages/parsl/executors/high_throughput/executor.py:365, in HighThroughputExecutor.start(self)
361 self._start_local_interchange_process()
363 logger.debug("Created management thread: {}".format(self._queue_management_thread))
--> 365 block_ids = self.initialize_scaling()
366 return block_ids
File ~/anaconda3/lib/python3.10/site-packages/parsl/executors/high_throughput/executor.py:320, in HighThroughputExecutor.initialize_scaling(self)
317 if self.worker_logdir_root is not None:
318 worker_logdir = "{}/{}".format(self.worker_logdir_root, self.label)
--> 320 l_cmd = self.launch_cmd.format(debug=debug_opts,
321 prefetch_capacity=self.prefetch_capacity,
322 address_probe_timeout_string=address_probe_timeout_string,
323 addresses=self.all_addresses,
324 task_port=self.worker_task_port,
325 result_port=self.worker_result_port,
326 cores_per_worker=self.cores_per_worker,
327 mem_per_worker=self.mem_per_worker,
328 max_workers=max_workers,
329 nodes_per_block=self.provider.nodes_per_block,
330 heartbeat_period=self.heartbeat_period,
331 heartbeat_threshold=self.heartbeat_threshold,
332 poll_period=self.poll_period,
333 logdir=worker_logdir,
334 cpu_affinity=self.cpu_affinity,
335 accelerators=" ".join(self.available_accelerators),
336 start_method=self.start_method)
337 self.launch_cmd = l_cmd
338 logger.debug("Launch command: {}".format(self.launch_cmd))
KeyError: 'block_id'
For the first part, that's probably something for @WardLT
For the block_id key error, that's surprising - can you tell me the exact version of parsl that you have installed?
pip show parsl
Name: parsl
Version: 2023.7.10
Summary: Simple data dependent workflows in Python
Home-page: https://github.com/Parsl/parsl
Author: The Parsl Team
Author-email: parsl@googlegroups.com
License: Apache 2.0
Location: /Users/jkitchin/anaconda3/lib/python3.10/site-packages
Requires: dill, globus-sdk, paramiko, psutil, pyzmq, requests, setproctitle, six, tblib, typeguard, types-paramiko, types-requests, types-six, typing-extensions
Required-by:
For the block id key error:
That parameter should not be substituted at that point of execution, because block_id in the default template is escaped with double curly brackets ({{block_id}}), which a single format() pass leaves as a literal {block_id}.
What that error says is that, for some reason, the substitution code is treating it as something to substitute.
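The double-brace escaping behaviour can be reproduced with plain str.format (a minimal sketch; the template string below is illustrative, not Parsl's actual launch_cmd):

```python
# Doubled braces escape a placeholder, so one .format() pass leaves a
# literal {block_id} behind for later substitution.
# (Illustrative template, not Parsl's real launch command.)
template = "process_worker_pool.py --block-id={{block_id}} --debug={debug}"

first_pass = template.format(debug=True)
print(first_pass)  # ... --block-id={block_id} --debug=True

# If the formatted result is stored back and formatted a second time
# (e.g. an executor whose start() runs twice), {block_id} is now a real
# placeholder with no matching keyword, and .format() raises KeyError:
try:
    first_pass.format(debug=True)
except KeyError as err:
    print(err)  # 'block_id'
```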
Two questions:
- Are you setting your own launch command in the htex constructor?
- Are you reusing a HighThroughputExecutor in two parsl.load calls? I have a strong suspicion that isn't detected, even though it is behaviour that isn't allowed.
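One way such double use could be detected is a start-once guard on the executor. This is a hypothetical sketch of the idea only; Parsl's actual classes and attribute names differ, and (per this thread) no such guard existed at the time:

```python
class OneShotExecutor:
    """Hypothetical executor that refuses to be started twice."""

    def __init__(self):
        self._started = False

    def start(self):
        # Fail loudly on reuse instead of silently re-formatting
        # internal state such as a launch-command template.
        if self._started:
            raise RuntimeError(
                "This executor was already started; "
                "construct a fresh config/executor for each load.")
        self._started = True
```

With this in place, the second load would raise a clear RuntimeError rather than the confusing KeyError: 'block_id'.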
I see what you mean. Yes, a double load does appear to be the issue there. In the script I had both parsl.load(config) and with ParslPoolExecutor(config) as pool:, which seems to cause that issue. It seems like something else is awry though.
This code:
import parsl, time
from parsl import python_app
from parsl.concurrent import ParslPoolExecutor
from parsl.configs.local_threads import config

parsl.load(config)

@python_app
def generate(limit):
    """Generate a random integer and return it"""
    from random import randint
    time.sleep(5)
    return randint(1, limit)

with ParslPoolExecutor(config) as pool:
    outputs = pool.map(generate, range(1, 11))

print([x.result() for x in outputs])
does not work unless I load the config first; without parsl.load(config) it raises RuntimeError: Must first load config.
This code:
import parsl, time
from parsl import python_app
from parsl.concurrent import ParslPoolExecutor
from parsl.configs.htex_local import config

parsl.load(config)

@python_app
def generate(limit):
    """Generate a random integer and return it"""
    from random import randint
    time.sleep(5)
    return randint(1, limit)

with ParslPoolExecutor(config) as pool:
    outputs = pool.map(generate, range(1, 11))

print([x.result() for x in outputs])
raises KeyError: 'block_id' if I have parsl.load(config), but fails differently with TypeError: Cannot subclass special typing classes if I comment that out.
It seems like there are two issues here, the original documentation issue, and this one. Should I split this into a second issue?
That second new error is a different one I've encountered before, and it should be fixed elsewhere - let me go find it. Something to do with package versions...
Here is the whole traceback:
Traceback (most recent call last):
File "<stdin>", line 19, in <module>
File "<stdin>", line 19, in <listcomp>
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/parsl/concurrent/__init__.py", line 74, in result_iterator
yield fs.pop().result()
File "/Users/jkitchin/anaconda3/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/Users/jkitchin/anaconda3/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/parsl/dataflow/dflow.py", line 300, in handle_exec_update
res = self._unwrap_remote_exception_wrapper(future)
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/parsl/dataflow/dflow.py", line 570, in _unwrap_remote_exception_wrapper
result = future.result()
File "/Users/jkitchin/anaconda3/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/Users/jkitchin/anaconda3/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/parsl/executors/high_throughput/executor.py", line 443, in _queue_management_worker
s.reraise()
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/parsl/app/errors.py", line 123, in reraise
reraise(t, v, v.__traceback__)
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/six.py", line 719, in reraise
raise value
File "/Users/jkitchin/anaconda3/bin/process_worker_pool.py", line 596, in worker
result = execute_task(req['buffer'])
File "/Users/jkitchin/anaconda3/bin/process_worker_pool.py", line 489, in execute_task
f, args, kwargs = unpack_apply_message(bufs, user_ns, copy=False)
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/parsl/serialize/facade.py", line 52, in unpack_apply_message
return [deserialize(buf) for buf in unpack_buffers(packed_buffer)]
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/parsl/serialize/facade.py", line 52, in <listcomp>
return [deserialize(buf) for buf in unpack_buffers(packed_buffer)]
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/parsl/serialize/facade.py", line 95, in deserialize
result = methods_for_code[header].deserialize(body)
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/parsl/serialize/concretes.py", line 89, in deserialize
return dill.loads(body)
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/dill/_dill.py", line 286, in loads
return load(file, ignore, **kwds)
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/dill/_dill.py", line 272, in load
return Unpickler(file, ignore=ignore, **kwds).load()
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/dill/_dill.py", line 419, in load
obj = StockUnpickler.load(self)
File "/Users/jkitchin/anaconda3/lib/python3.10/site-packages/dill/_dill.py", line 568, in _create_type
return typeobj(*args)
File "/Users/jkitchin/anaconda3/lib/python3.10/typing.py", line 348, in __init_subclass__
raise TypeError("Cannot subclass special typing classes")
TypeError: Cannot subclass special typing classes
PRs around the time of the parsl version you are using deal with it, so you might have a version that is right in the window of being broken. If possible, try a more recent parsl, which should specify tighter version constraints for the troublesome packages.
Part of this issue highlights the need for parsl to be more aggressive about detecting multiple configuration uses - I'll open a feature request for that
I updated to parsl-2023.10.9. Now I do have to load the config to avoid parsl.errors.NoDataFlowKernelError: Must first load config, but when I do load it, I am back to KeyError: 'block_id' (with from parsl.configs.htex_local import config).
from parsl.configs.local_threads import config still works fine.
Here's the automated test case that runs for the ParslExecutor feature:
34fe75f#diff-534bab1124fd7d79ea9fbe4cdcfe7321d646f99e644336e050f8189ac038e049
Are you decorating the function you are trying to pass into ParslExecutor with the python_app decorator? If so, don't do that.
I see. This works.
from parsl.concurrent import ParslPoolExecutor
from parsl.configs.htex_local import config

def generate(limit):
    """Generate a random integer and return it"""
    from random import randint
    return randint(1, limit)

with ParslPoolExecutor(config) as pool:
    outputs = pool.map(generate, range(1, 11))

print(list(outputs))
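The decorated-versus-plain distinction can be illustrated with the stdlib concurrent.futures API (a sketch by analogy; plain and app_like are hypothetical stand-ins, not Parsl code). A pool's map expects a plain function that returns values; a function that already returns futures, like a decorated app, layers futures on top of the pool's own:

```python
from concurrent.futures import Future, ThreadPoolExecutor

def plain(x):
    """A plain function: pool.map calls it and yields plain values."""
    return x + 1

def app_like(x):
    """Hypothetical stand-in for a @python_app-decorated function:
    each call hands back a Future rather than a finished value."""
    f = Future()
    f.set_result(x + 1)
    return f

with ThreadPoolExecutor() as pool:
    values = list(pool.map(plain, range(3)))      # [1, 2, 3]
    futures = list(pool.map(app_like, range(3)))  # Future objects, not ints

print(values)  # [1, 2, 3]
print(all(isinstance(f, Future) for f in futures))  # True
```

This is why the function passed to ParslPoolExecutor should not be decorated: the executor does its own wrapping.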
It is still confusing in the documentation. Here:
parsl/docs/userguide/workflow.rst, Line 121 in 11509cb
(technically, here:
parsl/docs/userguide/workflow.rst, Line 120 in 11509cb)
Ah ha, that is a problem in the docs. Sorry I missed that earlier.
Yes, generate must be decorated when using it as an app which produces its own futures, as in
parsl/docs/userguide/workflow.rst, Line 127 in 11509cb
but not when using it as part of the ParslPoolExecutor.
I'll make a quick PR that clears up when to decorate and not in that part of the documentation.
I updated to parsl-2023.10.9. Now, I do have to load that config to avoid parsl.errors.NoDataFlowKernelError: Must first load config, but when I do load it, I am back to KeyError: 'block_id'. (with from parsl.configs.htex_local import config).
I bet this is a result of re-using a configuration between multiple executors. Check out #2871. I'm going to open an issue with some ideas around how to make the error more informative.