allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

Home Page: https://clear.ml/docs

AWS Self Hosted - clearml-elastic not working

hardikdava opened this issue · comments

I have self-hosted a ClearML server on AWS using one of the available EC2 AMIs (clearml-server-1.12.0-393-110). I exposed the necessary ports, but the web app is not working. It reports the following:

SERVER UNAVAILABLE
The ClearML server is currently unavailable.
Please try to reload this page in a little while.
If the problem persists, verify your network connection is working and check the ClearML server logs for possible errors

Then I checked whether Docker was running properly and found that the clearml-elastic container keeps restarting and never comes up.

Can anyone help me out with this issue?

Hi @hardikdava , can you share the clearml-elastic container logs? (use sudo docker logs clearml-elastic)

@jkhenning Thanks for the reply. I dug into the logs and found that it was a memory issue. I had chosen a t2 instance, which has less memory than the required specifications. I am switching to a t3.large and will try hosting it again.
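
For anyone hitting the same restart loop, a quick way to rule out an undersized instance before redeploying is to compare the host's total memory against a threshold. The sketch below is only illustrative: the function names `mem_total_kb` and `has_enough_memory` are hypothetical, and the 8 GiB default for `REQUIRED_KB` is an assumption (it matches the memory of a t3.large), not a figure taken from the ClearML documentation. It reads the standard `MemTotal` field from `/proc/meminfo` on Linux.

```shell
#!/bin/sh
# Assumed threshold in kB (8 GiB, the size of a t3.large). This is a
# placeholder; adjust it to the requirements documented for your ClearML
# Server version.
REQUIRED_KB=${REQUIRED_KB:-8388608}

# Print the MemTotal value (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo when no file is given.
mem_total_kb() {
    awk '/^MemTotal:/ {print $2}' "${1:-/proc/meminfo}"
}

# Succeed (exit status 0) if the total memory meets the threshold,
# fail (non-zero status) otherwise or if MemTotal could not be read.
has_enough_memory() {
    total=$(mem_total_kb "${1:-}")
    [ -n "$total" ] && [ "$total" -ge "$REQUIRED_KB" ]
}

# Example usage on the EC2 host:
#   if has_enough_memory; then echo "memory OK"; else echo "memory below threshold"; fi
```

If the check fails, the `clearml-elastic` logs (`sudo docker logs clearml-elastic`) are still the place to confirm that memory is what is actually stopping the container.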