dtube / avalon

Blockchain for social distribution

Websockets hanging

skzap opened this issue

Some people are seeing the node's consensus hang, rarely and at random. According to the logs, this happens because the node never reaches the 2/3+ threshold needed to pass consensus. I suspect that some websockets are left hanging (as if the internet cord were broken) instead of being properly terminated and re-opened, which is crucial for consensus. If a node thinks an active leader is online when that leader isn't actually validating blocks, we end up in a situation where an observer node just gets stuck and an active leader node forks onto its own chain.
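To make the threshold problem concrete, here is a minimal sketch in Python (not Avalon's actual code). It assumes "2/3+" means strictly more than two thirds of the active leaders; the function name and the leader counts are hypothetical.

```python
def has_consensus(confirmations: int, active_leaders: int) -> bool:
    """Return True when strictly more than 2/3 of the leaders confirmed.

    Assumption: "2/3+" is read as "strictly more than two thirds".
    Integer arithmetic avoids floating point rounding at the boundary.
    """
    return 3 * confirmations > 2 * active_leaders


# Hypothetical round with 9 active leaders.
# If 3 of them sit behind hung websockets, only 6 confirmations arrive.
print(has_consensus(6, 9))  # 6 is exactly 2/3 of 9, so the round does not pass
print(has_consensus(7, 9))  # one more confirmation is enough
```

With these numbers, a few silent connections are enough to keep a round from ever passing, which would match the "stuck" behaviour seen in the logs.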

I'm creating a websocket-terminate branch that already tracks the health of each websocket. The data is visible in /peers from the API, e.g. curl http://localhost:3001/peers | jq ".[].lastMessageTime"

I will try to verify that the theory is correct (hung websockets should show up in /peers with a very long lastMessageTime).
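A rough way to check the /peers output for hung connections could look like the sketch below. It is Python rather than the node's own code, and it rests on assumptions: that lastMessageTime is a Unix timestamp in milliseconds for the last message received, that a peer with no lastMessageTime should be treated as stale, and that 60 seconds is a reasonable cutoff. The names stale_peers and STALE_AFTER_MS are hypothetical. If lastMessageTime turns out to be an elapsed duration instead, it can be compared against the cutoff directly.

```python
import json

# Hypothetical cutoff: flag a peer after 60 seconds of silence.
STALE_AFTER_MS = 60_000


def stale_peers(peers, now_ms, stale_after_ms=STALE_AFTER_MS):
    """Return the indices of peers that have been silent for too long.

    Assumption: each peer is a dict whose "lastMessageTime" is a Unix
    timestamp in milliseconds. Peers without the field count as stale.
    """
    stale = []
    for index, peer in enumerate(peers):
        last = peer.get("lastMessageTime")
        if last is None or now_ms - last > stale_after_ms:
            stale.append(index)
    return stale


# Made-up sample shaped like a /peers response, for illustration only.
sample = json.loads(
    '[{"lastMessageTime": 1000000}, {"lastMessageTime": 880000}, {}]'
)
print(stale_peers(sample, now_ms=1_005_000))  # prints [1, 2]
```

In a node, the flagged connections would be the ones to terminate and re-open; here the function only reports them.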

Seems like it got better after the memory-fix branch was merged into master. Closing for now, but this might become an issue again in the future. Maybe it was related to a bad node that's now gone from consensus.