Workflow logs not available due to Elasticsearch sharding problem

Question


iakkus opened this issue 3 years ago

In long-running installations, the logs of newly created workflows become unavailable because the management service cannot create their index in Elasticsearch.

[1616660753006] [1616660752778628] [2021-03-25 08:25:52.778] [INFO] [admin@management] [Management] [b66afa618d4311ebb8540242ac110003] [Management] [addWorkflow] Creating workflow index: mfnwf-0771dd9c37f76bda93cd25b328c6203e

[1616660753006] [1616660752783148] [2021-03-25 08:25:52.783] [INFO] [admin@management] [Management] [b66afa618d4311ebb8540242ac110003] [Management] [addWorkflow] {'error': {'root_cause': [{'type': 'index_creation_exception', 'reason': 'failed to create index [mfnwf-0771dd9c37f76bda93cd25b328c6203e]'}], 'type': 'validation_exception', 'reason': 'Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [1000]/[1000] maximum shards open;'}, 'status': 400}

As a result, retrieving the logs also fails, with an "index_not_found" exception.
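For anyone who wants to detect this condition programmatically, here is a minimal sketch that pulls the shard counts out of an error response shaped like the one in the log above. The function name `parse_shard_limit` is hypothetical, and the regular expression is only matched against the message text seen in this issue; other Elasticsearch versions may word the message differently.

```python
import re

# Matches the reason text of the shard-limit validation error from the log above.
_SHARD_LIMIT_RE = re.compile(
    r"would add \[(\d+)\] total shards, but this cluster currently has "
    r"\[(\d+)\]/\[(\d+)\] maximum shards open"
)


def parse_shard_limit(response):
    """Return (requested, open_shards, max_shards), or None if this is
    not a shard-limit error. Hypothetical helper, not part of any library."""
    reason = response.get("error", {}).get("reason", "")
    match = _SHARD_LIMIT_RE.search(reason)
    if match is None:
        return None
    requested, open_shards, max_shards = (int(g) for g in match.groups())
    return requested, open_shards, max_shards


# Error body copied from the log line in this issue (root_cause omitted).
response = {
    "error": {
        "type": "validation_exception",
        "reason": "Validation Failed: 1: this action would add [2] total shards, "
                  "but this cluster currently has [1000]/[1000] maximum shards open;",
    },
    "status": 400,
}

print(parse_shard_limit(response))  # (2, 1000, 1000)
```

A caller could use the returned tuple to log a clearer message, for example that the cluster is at its shard limit, instead of letting the later "index_not_found" error be the first visible symptom.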

Istemi Ekin Akkus · Answer 1 · Tue May 04 2021 23:08:17 GMT+0800 (China Standard Time)

I am not sure whether this is in our scope or a general Elasticsearch problem. I have seen workarounds in which the maximum number of shards was increased and/or old logs were deleted from the system.
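As a rough illustration of the first workaround, the sketch below builds (but does not send) a request that raises the `cluster.max_shards_per_node` setting through the cluster settings API. The base URL, the helper name `build_raise_shard_limit_request`, and the value 2000 are assumptions for illustration; the 1000 in the error message is consistent with the default limit on a single data node, but that has not been verified for this setup.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: adjust to your deployment


def build_raise_shard_limit_request(max_shards_per_node, base_url=ES_URL):
    """Build a PUT _cluster/settings request that raises the per-node
    shard limit. Hypothetical helper; it only constructs the request."""
    body = {"persistent": {"cluster.max_shards_per_node": max_shards_per_node}}
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_raise_shard_limit_request(2000)  # 2000 is an arbitrary example value
print(req.get_method(), req.full_url)
# To apply it against a live cluster: urllib.request.urlopen(req)
```

Raising the limit is probably best treated as a stopgap, since every open shard consumes resources on the node; removing indices that are no longer needed addresses the cause.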

We have started deleting workflow logs when the workflow is removed. I am not sure whether this problem was specific to my setup (i.e., a long-running installation that predates the workflow log removal) or whether it can still be reproduced easily.
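The cleanup on workflow removal could look roughly like the sketch below, which builds (but does not send) a delete-index request. This is not the project's actual implementation: the helper name and base URL are hypothetical, and treating the part after the `mfnwf-` prefix from the log as the workflow id is an assumption.

```python
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: adjust to your deployment
INDEX_PREFIX = "mfnwf-"           # prefix seen in the log line in this issue


def build_delete_workflow_index_request(workflow_id, base_url=ES_URL):
    """Build a DELETE request for one workflow's log index.
    Hypothetical helper; it only constructs the request."""
    # Reject anything that is not a plain alphanumeric id, so that a
    # wildcard such as "*" can never expand to several indices.
    if not workflow_id.isalnum():
        raise ValueError(f"unexpected workflow id: {workflow_id!r}")
    return urllib.request.Request(
        url=f"{base_url}/{INDEX_PREFIX}{workflow_id}",
        method="DELETE",
    )


req = build_delete_workflow_index_request("0771dd9c37f76bda93cd25b328c6203e")
print(req.get_method(), req.full_url)
# To apply it against a live cluster: urllib.request.urlopen(req)
```

Deleting the index frees its shards, which is what keeps a long-running installation from creeping toward the shard limit.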

Closing as non-issue. If somebody else runs into a similar problem, please reopen.

At this point, it is