allenai / tango

Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.

Home Page: https://ai2-tango.readthedocs.io/

Beaker Executor dies when pip temporarily loses connection

dirkgr opened this issue

πŸ› Describe the bug

https://beaker.org/ex/01GCDPHK27WET95H1CEY9DHQVV/tasks/01GCDPHK496C515GYCNVWAK3QA/job/01GCDPHPNYC4FGD6B703S0JDPV

2022-09-08T04:45:49.637609540Z Traceback (most recent call last):
2022-09-08T04:45:49.637614278Z   File "/opt/conda/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 438, in _error_catcher
2022-09-08T04:45:49.637618334Z     yield
2022-09-08T04:45:49.637621543Z   File "/opt/conda/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 519, in read
2022-09-08T04:45:49.637625659Z     data = self._fp.read(amt) if not fp_closed else b""
2022-09-08T04:45:49.637628946Z   File "/opt/conda/lib/python3.9/site-packages/pip/_vendor/cachecontrol/filewrapper.py", line 62, in read
2022-09-08T04:45:49.637632552Z     data = self.__fp.read(amt)
2022-09-08T04:45:49.637635690Z   File "/opt/conda/lib/python3.9/http/client.py", line 463, in read
2022-09-08T04:45:49.637639449Z     n = self.readinto(b)
2022-09-08T04:45:49.637642727Z   File "/opt/conda/lib/python3.9/http/client.py", line 507, in readinto
2022-09-08T04:45:49.637659807Z     n = self.fp.readinto(b)
2022-09-08T04:45:49.637663204Z   File "/opt/conda/lib/python3.9/socket.py", line 704, in readinto
2022-09-08T04:45:49.637666941Z     return self._sock.recv_into(b)
2022-09-08T04:45:49.637670118Z   File "/opt/conda/lib/python3.9/ssl.py", line 1241, in recv_into
2022-09-08T04:45:49.637673963Z     return self.read(nbytes, buffer)
2022-09-08T04:45:49.637677371Z   File "/opt/conda/lib/python3.9/ssl.py", line 1099, in read
2022-09-08T04:45:49.637680768Z     return self._sslobj.read(len, buffer)
2022-09-08T04:45:49.637684186Z ConnectionResetError: [Errno 104] Connection reset by peer

Does pip have retry options?
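On the retry question: pip does accept `--retries` and `--timeout` as general options. As far as I can tell, though, `--retries` mainly covers establishing the connection, so a reset in the middle of a download like the one in the traceback above may still kill the install. One workaround is to retry the whole pip invocation from the outside. The following is a minimal Python sketch of that idea, not how Tango's Beaker executor actually installs packages; `run_with_retries` and `pip_install_cmd` are hypothetical helper names, and the attempt count and delays are arbitrary.

```python
import subprocess
import sys
import time
from typing import Callable, List, Sequence


def pip_install_cmd(packages: Sequence[str], retries: int = 5, timeout: int = 60) -> List[str]:
    """Build a `pip install` command that also uses pip's own retry/timeout options."""
    return [
        sys.executable, "-m", "pip", "install",
        "--retries", str(retries),   # pip's built-in connection retries
        "--timeout", str(timeout),   # socket timeout in seconds
        *packages,
    ]


def run_with_retries(
    cmd: Sequence[str],
    attempts: int = 3,
    base_delay: float = 2.0,
    run: Callable[[Sequence[str]], int] = lambda c: subprocess.run(c).returncode,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `cmd` until it exits with 0 or `attempts` is exhausted.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    Returns the exit code of the last attempt.
    """
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = run(cmd)
        if returncode == 0:
            return 0
        if attempt < attempts:
            # Exponential backoff before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    return returncode


# Example (hypothetical package list):
#   exit_code = run_with_retries(pip_install_cmd(["ai2-tango"]))
```

The `run` and `sleep` parameters exist only so the loop can be exercised without touching the network. Because pip skips packages that are already installed, a repeated attempt should mostly redo only the part that failed.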

Versions

asd

You can build your own image with your big dependencies already installed to avoid so many pip downloads at runtime. Or mount an NFS directory to /root/.cache on your image so that pip can reuse the same cache across Beaker jobs.

Yes, reusable cache is in my future. But also, these workflows need to be easy for someone who isn't me.

Does mounting an NFS directory work correctly with permissions?

Yeup. Just open the permissions on the directory all the way to be sure. That's what I've been doing and it's working great.
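For reference, the shared-cache suggestion can be sketched in Python roughly as follows. This is only an illustration under assumptions: `prepare_shared_pip_cache` is a hypothetical helper, the example path is made up, and the `chmod` call can fail on an NFS mount if the job's user does not own the directory. Instead of relying on the mount at /root/.cache, it points pip at the directory explicitly through the `PIP_CACHE_DIR` environment variable.

```python
import os
from pathlib import Path
from typing import Dict


def prepare_shared_pip_cache(cache_dir: str) -> Dict[str, str]:
    """Create `cache_dir`, open its permissions, and return an environment
    mapping with PIP_CACHE_DIR pointing at it."""
    path = Path(cache_dir)
    path.mkdir(parents=True, exist_ok=True)

    # "Open the permissions all the way": read/write/execute for everyone.
    path.chmod(0o777)

    # Fail early if this process still cannot write into the directory.
    if not os.access(path, os.W_OK | os.X_OK):
        raise PermissionError(f"cannot write to pip cache directory: {path}")

    env = dict(os.environ)
    env["PIP_CACHE_DIR"] = str(path)  # pip reads its --cache-dir option from this
    return env


# Example (hypothetical mount point):
#   env = prepare_shared_pip_cache("/mnt/shared/pip-cache")
#   subprocess.run([sys.executable, "-m", "pip", "install", "ai2-tango"], env=env)
```

Passing the returned mapping as `env=` to the pip subprocess keeps the setting local to that one call rather than changing the environment of the whole job.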

This did not work, but it had nothing to do with Tango, so I'm closing this.