beacon-biosignals / Ray.jl

Julia API for Ray

Connecting to GCSClient without a local raylet hangs

glennmoy opened this issue

Follow up to comment thread:

The issue is that the Connect(client) call returns Status::OK regardless of whether the GCS server has been started.

The client first reports after 5 seconds that it can't connect, then after a minute kills the session with EXIT_FAILURE.
Both timeouts are set by RayConfig parameters.
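
To make the two-stage timeout concrete, here is a minimal Python sketch of the pattern described above: poll a readiness check, warn once after a short delay, and give up after a longer one. This is not Ray's implementation. The function name `wait_until_ready`, the `probe` argument, and the parameter names are all hypothetical; only the 5-second and one-minute defaults come from the behaviour reported in this issue.

```python
import time


def wait_until_ready(probe, warn_after=5.0, fail_after=60.0,
                     poll_interval=0.5, warn=print):
    """Poll `probe()` until it returns True.

    Hypothetical helper: warns once after `warn_after` seconds and raises
    TimeoutError after `fail_after` seconds. Returns the elapsed seconds
    on success.
    """
    start = time.monotonic()
    warned = False
    while True:
        if probe():
            return time.monotonic() - start
        elapsed = time.monotonic() - start
        # First stage: report the problem once, but keep trying.
        if not warned and elapsed >= warn_after:
            warn(f"still not connected after {elapsed:.1f}s, retrying")
            warned = True
        # Second stage: give up with an error the caller can catch.
        if elapsed >= fail_after:
            raise TimeoutError(f"gave up after {elapsed:.1f}s")
        time.sleep(poll_interval)
```

The difference from the behaviour reported here is that the failure surfaces as an exception in the calling thread, so the caller can catch it, rather than as a process exit.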

If the client does not exist, then the thread executing the server (I think) throws the error, which gets reported but not caught in the Julia REPL.

https://github.com/ray-project/ray/blob/cde6e887cbb21a9cae2632e3e4b883d913d38a05/src/ray/rpc/gcs_server/gcs_rpc_client.h#L212-L216

Unfortunately the gcs_is_down_ field is private; however, there is a way to check whether the server is alive that uses a callback.
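
For illustration, a callback-style liveness check can be turned into a blocking call with a timeout roughly as follows. This is a generic Python sketch, not the Ray API: `check_alive_async` stands in for any function that takes a callback and later invokes it with a boolean, and both that name and `check_alive_blocking` are hypothetical.

```python
import threading


def check_alive_blocking(check_alive_async, timeout_s=5.0):
    """Wrap a callback-based liveness check in a blocking call.

    `check_alive_async(callback)` is a hypothetical function that eventually
    calls `callback(alive: bool)`, possibly from another thread.
    Returns False if no reply arrives within `timeout_s` seconds.
    """
    done = threading.Event()
    result = {"alive": False}

    def on_reply(alive):
        # Record the answer, then wake up the waiting caller.
        result["alive"] = bool(alive)
        done.set()

    check_alive_async(on_reply)
    if not done.wait(timeout_s):
        return False  # no reply in time: treat the server as down
    return result["alive"]
```

Treating a missing reply as "not alive" keeps the caller from hanging when the server was never started, which is the situation this issue describes.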

However, I don't think it's worth implementing this directly. The timeout should take care of things; the error just won't be nicely caught or reported in Julia, but we can add that as a follow-up.

Originally posted by @glennmoy in #211 (comment)