bmoscon / cryptostore

A scalable storage service for cryptocurrency data

How to restart container when one of the processes fails? How to track service health?

anovv opened this issue

I'm encountering an issue where, if one of the cryptostore processes fails (e.g. the aggregator or the collector), the container keeps running and is never restarted. What is the best practice for restarting the container when one of its processes fails (e.g. when running on Kubernetes)?
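One way to get the restart behaviour described above is to have the container's main process exit as soon as any child dies, so that the runtime's restart policy takes over. The following is a minimal sketch under assumptions, not cryptostore's actual code: `supervise` is a hypothetical helper, and the only thing it relies on is that each child exposes `name`, `is_alive()`, `terminate()` and `join()`, as `multiprocessing.Process` does.

```python
import time


def supervise(procs, poll_interval=1.0):
    """Block until any child process has exited.

    Terminates the remaining children and returns the name of the first
    process found dead, so the caller can log it and exit non-zero.
    """
    while True:
        for proc in procs:
            if not proc.is_alive():
                # One child is gone: stop the others so the container exits.
                for other in procs:
                    if other.is_alive():
                        other.terminate()
                for other in procs:
                    other.join()
                return proc.name
        time.sleep(poll_interval)
```

In an entry point, one would start the collector and aggregator as `multiprocessing.Process` objects, call `supervise(...)`, and then call `sys.exit(1)`. With the main process gone, Kubernetes restarts the container according to the pod's `restartPolicy`.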

For example, Kubernetes has a mechanism called a liveness probe, which calls a script or endpoint in the container to check its health and decide whether it should be restarted. Should there be something similar for cryptostore (i.e. a small server reporting the readiness/liveness of the service)?
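To make the idea of a small health server concrete, here is a sketch using only the Python standard library. Everything in it is an assumption for illustration: the `/healthz` path, the default port, and the `is_healthy` callback are hypothetical names, and cryptostore does not necessarily ship anything like this.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_health_server(is_healthy, host="0.0.0.0", port=8080):
    """Build an HTTP server that answers GET /healthz.

    is_healthy is a zero-argument callable supplied by the service; the
    endpoint returns 200 when it is truthy and 503 otherwise.
    """

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/healthz":
                self.send_error(404)
                return
            ok = bool(is_healthy())
            body = b"ok" if ok else b"unhealthy"
            self.send_response(200 if ok else 503)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep probe requests out of the service log

    return ThreadingHTTPServer((host, port), Handler)


def serve_in_background(server):
    """Run the server in a daemon thread and return the thread."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread
```

A Kubernetes `httpGet` liveness probe pointed at this path treats any status from 200 to 399 as success, so the 503 response is what would trigger a restart once the failure threshold is reached.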

Another scenario I encountered: the collector and the other processes were running fine, but Redis was not receiving any messages because of an exception in the feed handler. Should there be a check on the message arrival rate for each key in Redis?
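A per-key freshness check along these lines could look like the sketch below. It rests on assumptions that the issue does not confirm: that the data is stored in Redis streams with auto-generated IDs (whose first part is a millisecond timestamp), and that `client` is a redis-py style object with an `xrevrange` method. The function names are hypothetical.

```python
import time


def last_entry_age(client, key, now=None):
    """Seconds since the newest entry of a Redis stream, or None if empty.

    Assumes auto-generated stream IDs of the form "<milliseconds>-<seq>".
    """
    entries = client.xrevrange(key, count=1)
    if not entries:
        return None
    entry_id = entries[0][0]
    if isinstance(entry_id, bytes):
        entry_id = entry_id.decode()
    millis = int(entry_id.split("-")[0])
    now = time.time() if now is None else now
    return now - millis / 1000.0


def stale_keys(client, keys, max_age, now=None):
    """Return the keys whose newest entry is missing or older than max_age."""
    stale = []
    for key in keys:
        age = last_entry_age(client, key, now=now)
        if age is None or age > max_age:
            stale.append(key)
    return stale
```

A liveness check could then report unhealthy whenever `stale_keys(...)` is non-empty. Note that an empty stream is counted as stale here, which may produce false positives if a consumer trims or deletes entries after reading them, and that quiet markets may need a larger `max_age`.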

In general, what should the guidelines be for tracking service health?