Log reason for unready model mesh container
Legion2 opened this issue
Leon Kiefer commented
I'm currently debugging model mesh containers that are constantly unready, and the logs do not directly indicate the problem. Only by reading the code did I discover `abortStartup`, which fails the readiness probe without logging the reason:
modelmesh/src/main/java/com/ibm/watson/modelmesh/ModelMesh.java
Lines 1311 to 1313 in c4fc041
`ModelMesh.abortStartup` indicates an unrecoverable failure, which can only be resolved by restarting the model mesh container. If the readiness probe fails because of `abortStartup`, the reason should be logged so that the root cause of the issue can be debugged.
Alternatively, the liveness probe of the container should fail, to indicate an unrecoverable application failure.
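To illustrate both options, here is a minimal sketch of what I have in mind. It is not the actual ModelMesh code: the class `StartupHealth`, its methods, and the use of `java.util.logging` are all assumptions made for the example. The idea is to record the abort reason once, log it when the abort happens, mention it again whenever the readiness check fails because of it, and optionally let the liveness check fail as well.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch; names and structure are NOT taken from ModelMesh. */
public class StartupHealth {
    private static final Logger logger = Logger.getLogger(StartupHealth.class.getName());

    // null while startup has not been aborted; otherwise the recorded reason
    private final AtomicReference<String> abortReason = new AtomicReference<>();

    /** Records the first abort reason and logs it exactly once. */
    public void abortStartup(String reason, Throwable cause) {
        if (abortReason.compareAndSet(null, reason)) {
            logger.log(Level.SEVERE, "Startup aborted, restart required: " + reason, cause);
        }
    }

    /** Readiness check: fails after an abort and logs why. */
    public boolean isReady() {
        String reason = abortReason.get();
        if (reason != null) {
            logger.warning("Readiness check failed because startup was aborted: " + reason);
            return false;
        }
        return true;
    }

    /** Liveness check: fails after an abort so the container gets restarted. */
    public boolean isLive() {
        return abortReason.get() == null;
    }
}
```

With something like this, the log would show the root cause on every failed readiness check, and a failing liveness check would let Kubernetes restart the container instead of leaving it unready indefinitely.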
Christian Kadner commented
@Legion2 -- would you like to propose a code change in a PR?
Leon Kiefer commented