Migration failures are ignored
dalazx opened this issue
Describe the bug
Migration failures are ignored: the service keeps running even after a migration fails. Kubernetes reports the corresponding pods as Ready, but they are not.
{"timestamp":"2021-09-09T10:40:00.476Z","@version":"1","message":"migration failed","logger_name":"com.namely.chiefofstate.ServiceMigrationRunner$","thread_name":"ChiefOfStateSystem-akka.actor.default-dispatcher-15","level":"ERROR","level_value":40000,"stack_trace":"org.postgresql.util.PSQLException: ERROR: relation \"public.journal\" does not exist\n Position: 91\n\tat org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2553)\n\tat org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2285)\n\tat org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:323)\n\tat org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:481)\n\tat org.postgresql.jdbc.PgStatement.execute(PgStatement.java:401)\n\tat org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:164)\n\tat org.postgresql.jdbc.PgPreparedStatement.execute(PgPreparedStatement.java:153)\n\tat com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44)\n\tat com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java)\n\tat slick.jdbc.StatementInvoker.results(StatementInvoker.scala:39)\n\tat slick.jdbc.StatementInvoker.iteratorTo(StatementInvoker.scala:22)\n\tat slick.jdbc.StreamingInvokerAction.emitStream(StreamingInvokerAction.scala:28)\n\tat slick.jdbc.StreamingInvokerAction.emitStream$(StreamingInvokerAction.scala:26)\n\tat slick.jdbc.JdbcActionComponent$QueryActionExtensionMethodsImpl$$anon$2.emitStream(JdbcActionComponent.scala:214)\n\tat slick.jdbc.JdbcActionComponent$QueryActionExtensionMethodsImpl$$anon$2.emitStream(JdbcActionComponent.scala:214)\n\tat slick.basic.BasicBackend$DatabaseDef$$anon$4.run(BasicBackend.scala:342)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)\n\tat java.base/java.lang.Thread.run(Unknown 
Source)\n","service":"payments-cos","powered_by":"chiefofstate"}
The migration actor was then terminated because of the failure above:
{"timestamp":"2021-09-09T10:40:00.479Z","@version":"1","message":"Singleton actor [akka://ChiefOfStateSystem/system/singletonManagerCosServiceMigrationRunner/CosServiceMigrationRunner] was terminated","logger_name":"akka.cluster.singleton.ClusterSingletonManager","thread_name":"ChiefOfStateSystem-akka.actor.default-dispatcher-7","level":"INFO","level_value":20000,"akkaAddress":"akka://ChiefOfStateSystem@10.3.140.126:25520","sourceThread":"ChiefOfStateSystem-akka.actor.internal-dispatcher-5","akkaSource":"akka://ChiefOfStateSystem@10.3.140.126:25520/system/singletonManagerCosServiceMigrationRunner","sourceActorSystem":"ChiefOfStateSystem","akkaTimestamp":"10:40:00.479UTC","tags":["akkaClusterSingletonTerminated"],"service":"payments-cos","powered_by":"chiefofstate"}
kubectl describe po ...
...
Status: Running
...
Ready: True
Restart Count: 0
...
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
To Reproduce
Steps to reproduce the behavior:
- Deploy CoS from scratch with COS_MIGRATIONS_INITIAL_VERSION set to 0 to cause the failure;
- Wait until all the corresponding pods are running;
- Check the pod states.
Expected behavior
The service should fail fast with a non-zero exit code.
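To make the expected behavior concrete, here is a minimal sketch of the fail-fast pattern in plain Java. It is not CoS code: the class `FailFastMigration`, the method `runMigration`, the `--fail` flag, and the exit code `1` are all hypothetical names chosen for illustration. The idea is only that a migration error becomes a non-zero process exit instead of a log line.

```java
import java.util.concurrent.Callable;

public final class FailFastMigration {
    // Hypothetical exit code for a failed migration; any non-zero value would do.
    static final int MIGRATION_FAILED_EXIT_CODE = 1;

    // Runs the migration and returns the exit code the process should use:
    // 0 on success, non-zero if the migration threw.
    static int runMigration(Callable<Void> migration) {
        try {
            migration.call();
            return 0;
        } catch (Exception e) {
            System.err.println("migration failed: " + e.getMessage());
            return MIGRATION_FAILED_EXIT_CODE;
        }
    }

    public static void main(String[] args) {
        // "--fail" is an illustrative flag that simulates a failing migration.
        boolean simulateFailure = args.length > 0 && args[0].equals("--fail");

        int exitCode = runMigration(() -> {
            if (simulateFailure) {
                throw new IllegalStateException("relation \"public.journal\" does not exist");
            }
            return null;
        });

        if (exitCode != 0) {
            // Stop before serving traffic so the container exits and the
            // failure becomes visible to the orchestrator.
            System.exit(exitCode);
        }
        System.out.println("migration succeeded, starting service");
    }
}
```

With a non-zero exit the container terminates, so Kubernetes would show a restart or a failed container rather than a Ready pod. In an actor-based service the actor system would presumably also need to be shut down before exiting; that part is not shown here.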