ipython / ipyparallel

IPython Parallel: Interactive Parallel Computing in Python

Home Page: https://ipyparallel.readthedocs.io/

'.ipython/profile_ssh' not found

xiedidan opened this issue

I'm trying to start a cluster over SSH. The following error message appears in the engine log:

2021-12-08 00:42:19.562 [IPEngine] Config changed: {'ProfileDir': {'location': '.ipython/profile_ssh'}, 'IPEngine': {'work_dir': '/home/xd/project/Finance/quant_v1', 'profile': 'ssh'}, 'Session': {'key': b'8dc49daa-41cace1936b400470864d3d2', 'signature_scheme': 'hmac-sha256', 'packer': 'json', 'unpacker': 'json'}, 'IPKernelApp': {'exec_lines': [], 'exec_files': []}, 'HistoryManager': {'hist_file': ':memory:'}}
2021-12-08 00:42:19.562 [IPEngine] CRITICAL | Profile directory '.ipython/profile_ssh' not found.

It looks like I should set an absolute path for 'ProfileDir': {'location': '.ipython/profile_ssh'}, but I don't know how to do that.
I've tried setting c.ProfileDir.location = '/home/xd/.ipython/profile_ssh' in ipcluster_config.py, with no luck.
Thanks.

Can you include more code to reproduce the issue? Have you created the profile already (ipython profile create ssh)?

profile_ssh.tar.gz

profile_ssh_218.tar.gz

Yes, profile_ssh has been created (see profile_ssh.tar.gz). I'm trying to run the controller on 192.168.5.71 and engines on both 192.168.5.71 and 192.168.5.218. The profile dir was copied to 218 automatically (see profile_ssh_218.tar.gz). The controller tried to start engines on 218, but all of them failed with the '.ipython/profile_ssh' not found error.

In JupyterLab I use the following code to start the cluster:

import ipyparallel as ipp
cluster = ipp.Cluster(profile_dir='/home/xd/.ipython/profile_ssh', cluster_id='')
client = cluster.start_and_connect_sync()

The cell output is:

Starting 8 engines with <class 'ipyparallel.cluster.launcher.SSHEngineSetLauncher'>
[ProfileCreate] Generating default config file: '.ipython/profile_ssh/ipython_config.py'
[ProfileCreate] Generating default config file: '.ipython/profile_ssh/ipython_kernel_config.py'
ensuring remote 192.168.5.218:.ipython/profile_ssh/security/ exists
sending /home/xd/.ipython/profile_ssh/security/ipcontroller-client.json to 192.168.5.218:.ipython/profile_ssh/security/ipcontroller-client.json
ensuring remote 192.168.5.218:.ipython/profile_ssh/security/ exists
sending /home/xd/.ipython/profile_ssh/security/ipcontroller-engine.json to 192.168.5.218:.ipython/profile_ssh/security/ipcontroller-engine.json
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
fetching /tmp/tmpv0virpoo/ipengine-1638895319.8639.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895319.8639.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895319.8639.out
fetching /tmp/tmpcvb5rpr7/ipengine-1638895323.9537.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895323.9537.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895323.9537.out
fetching /tmp/tmpado3ikz7/ipengine-1638895327.8639.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895327.8639.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895327.8639.out
fetching /tmp/tmp_fmgo0tg/ipengine-1638895331.7824.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895331.7824.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895331.7824.out
fetching /tmp/tmpwr984gkf/ipengine-1638895336.0938.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895336.0938.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895336.0938.out
fetching /tmp/tmpv1iuy8zz/ipengine-1638895340.0645.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895340.0645.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895340.0645.out
fetching /tmp/tmpbxqoopnj/ipengine-1638895344.0524.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895344.0524.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895344.0524.out
fetching /tmp/tmpcpimll00/ipengine-1638895348.0458.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895348.0458.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895348.0458.out
engine set stopped 1638895313: {'engines': {'192.168.5.218/0': {'exit_code': -1, 'pid': 26085, 'identifier': '192.168.5.218/0'}, '192.168.5.218/1': {'exit_code': -1, 'pid': 26305, 'identifier': '192.168.5.218/1'}, '192.168.5.218/2': {'exit_code': -1, 'pid': 26528, 'identifier': '192.168.5.218/2'}, '192.168.5.218/3': {'exit_code': -1, 'pid': 26750, 'identifier': '192.168.5.218/3'}, '192.168.5.218/4': {'exit_code': -1, 'pid': 26973, 'identifier': '192.168.5.218/4'}, '192.168.5.218/5': {'exit_code': -1, 'pid': 27193, 'identifier': '192.168.5.218/5'}, '192.168.5.218/6': {'exit_code': -1, 'pid': 27415, 'identifier': '192.168.5.218/6'}, '192.168.5.218/7': {'exit_code': -1, 'pid': 27637, 'identifier': '192.168.5.218/7'}}, 'exit_code': -1}

I have to quickly cat ipengine-xxx.xxx.out files since they got removed right after process exits.
I tried running /home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh manually on 218, and it seems to work fine that way:

(base) xd@xd-supermicro:~$ /home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh
2021-12-08 22:07:52.112 [IPEngine] IPYTHONDIR set to: /home/xd/.ipython
2021-12-08 22:07:52.114 [IPEngine] Using existing profile dir: '/home/xd/.ipython/profile_ssh'
2021-12-08 22:07:52.115 [IPEngine] Searching path ['/home/xd', '/home/xd/.ipython/profile_ssh', '/home/xd/anaconda3/envs/finance/etc/ipython', '/usr/local/etc/ipython', '/etc/ipython'] for config files
2021-12-08 22:07:52.116 [IPEngine] Attempting to load config file: ipython_config.py
2021-12-08 22:07:52.116 [IPEngine] Looking for ipython_config in /etc/ipython
2021-12-08 22:07:52.116 [IPEngine] Looking for ipython_config in /usr/local/etc/ipython
2021-12-08 22:07:52.116 [IPEngine] Looking for ipython_config in /home/xd/anaconda3/envs/finance/etc/ipython
2021-12-08 22:07:52.116 [IPEngine] Looking for ipython_config in /home/xd/.ipython/profile_ssh
2021-12-08 22:07:52.118 [IPEngine] Loaded config file: /home/xd/.ipython/profile_ssh/ipython_config.py
2021-12-08 22:07:52.118 [IPEngine] Looking for ipython_config in /home/xd
2021-12-08 22:07:52.119 [IPEngine] Attempting to load config file: ipengine_config.py
2021-12-08 22:07:52.120 [IPEngine] Looking for ipengine_config in /etc/ipython
2021-12-08 22:07:52.120 [IPEngine] Looking for ipengine_config in /usr/local/etc/ipython
2021-12-08 22:07:52.120 [IPEngine] Looking for ipengine_config in /home/xd/anaconda3/envs/finance/etc/ipython
2021-12-08 22:07:52.120 [IPEngine] Looking for ipengine_config in /home/xd/.ipython/profile_ssh
2021-12-08 22:07:52.120 [IPEngine] Looking for ipengine_config in /home/xd
2021-12-08 22:07:52.125 [IPEngine] Changing to working dir: /home/xd/project/Finance/quant_v1
2021-12-08 22:07:52.126 [IPEngine] Loading connection file '/home/xd/.ipython/profile_ssh/security/ipcontroller-engine.json'
2021-12-08 22:07:52.138 [IPEngine] WARNING | Not using CurveZMQ security
2021-12-08 22:07:52.141 [IPEngine] Config changed:
2021-12-08 22:07:52.141 [IPEngine] {'IPEngine': {'work_dir': '/home/xd/project/Finance/quant_v1', 'profile': 'ssh'}, 'Session': {'key': b'8dc49daa-41cace1936b400470864d3d2', 'signature_scheme': 'hmac-sha256', 'packer': 'json', 'unpacker': 'json'}}
2021-12-08 22:07:52.143 [IPEngine] Registering with controller at tcp://192.168.5.71:50813
2021-12-08 22:07:52.149 [IPEngine] Shell_addrs: ['tcp://192.168.5.71:33927', 'tcp://192.168.5.71:40413', 'tcp://192.168.5.71:41163']
2021-12-08 22:07:52.150 [IPEngine] Setting shell identity b'0a2083d1-cf141668cbaa7ca3c7048532'
2021-12-08 22:07:52.150 [IPEngine] Connecting shell to tcp://192.168.5.71:33927
2021-12-08 22:07:52.150 [IPEngine] Connecting shell to tcp://192.168.5.71:40413
2021-12-08 22:07:52.151 [IPEngine] Connecting shell to tcp://192.168.5.71:41163
2021-12-08 22:07:52.151 [IPEngine] Starting nanny
2021-12-08 22:07:53.082 [KernelNanny.8] Starting kernel nanny for engine 8, pid=7245, nanny pid=7250
2021-12-08 22:07:53.087 [KernelNanny.8] Nanny watching parent pid 7245.
2021-12-08 22:07:53.098 [IPEngine] Seeing logger to stderr, rerouting to raw filedescriptor.
2021-12-08 22:07:53.172 [IPEngine] Config changed: {'IPEngine': {'work_dir': '/home/xd/project/Finance/quant_v1', 'profile': 'ssh'}, 'Session': {'key': b'8dc49daa-41cace1936b400470864d3d2', 'signature_scheme': 'hmac-sha256', 'packer': 'json', 'unpacker': 'json'}, 'IPKernelApp': {'exec_lines': [], 'exec_files': []}, 'HistoryManager': {'hist_file': ':memory:'}}
2021-12-08 22:07:53.173 [IPEngine] IPYTHONDIR set to: /home/xd/.ipython
2021-12-08 22:07:53.175 [IPEngine] Using existing profile dir: '/home/xd/.ipython/profile_default'
2021-12-08 22:07:53.179 [IPEngine] WARNING | debugpy_stream undefined, debugging will not be enabled
2021-12-08 22:07:53.183 [IPEngine] Starting to monitor the heartbeat signal from the hub every 3500 ms.
2021-12-08 22:07:53.184 [IPEngine] Completed registration with id 8

Now I'm stuck. As the engine logs show, an engine started by the controller automatically sets 'ProfileDir': {'location': '.ipython/profile_ssh'}, while a manually started engine does not. I think that might be the problem, but I don't know how to solve it.

I'm sorry, I was sure I wrote a reply to this some time ago.

I believe the crux is the combination of a relative profile directory and a work directory,
so the engine is probably looking for the profile in /home/xd/project/Finance/quant_v1/.ipython/profile_ssh instead of $HOME/.ipython/profile_ssh.

If you can get away with not specifying work_dir (e.g. os.chdir at the beginning of your code), I bet it will work while we figure out what to fix.
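
To illustrate the suspected cause, here is a minimal sketch using only the standard library. It does not call ipyparallel; the temporary directories are stand-ins for $HOME and work_dir, and the helper name profile_visible_from is made up for this example. It assumes the engine changes into work_dir before the relative profile location is checked, which is the hypothesis above rather than confirmed behavior.

```python
import os
import tempfile


def profile_visible_from(cwd, profile_path):
    """Return True if profile_path is an existing directory when resolved from cwd."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        return os.path.isdir(profile_path)
    finally:
        # Always restore the original working directory.
        os.chdir(previous)


with tempfile.TemporaryDirectory() as root:
    # Stand-ins for $HOME and the engine's work_dir (hypothetical layout).
    home = os.path.join(root, "home")
    work_dir = os.path.join(root, "project")
    os.makedirs(os.path.join(home, ".ipython", "profile_ssh"))
    os.makedirs(work_dir)

    relative_profile = os.path.join(".ipython", "profile_ssh")
    absolute_profile = os.path.join(home, relative_profile)

    # The relative path resolves only while the process is still in "home".
    found_from_home = profile_visible_from(home, relative_profile)
    found_from_work_dir = profile_visible_from(work_dir, relative_profile)
    # An absolute path resolves regardless of the current directory.
    found_absolute = profile_visible_from(work_dir, absolute_profile)

print(found_from_home, found_from_work_dir, found_absolute)
# prints: True False True
```

If that is what happens on the remote host, then either skipping work_dir or having the launcher pass an absolute profile location should avoid the error.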

I have to quickly cat ipengine-xxx.xxx.out files since they got removed right after process exits.

These files are removed because the output is already retrieved. You can view it with cluster.engine_set.get_output(). This should be easier to discover!