Metrics Scale Up Details?

lpsantil opened this issue 8 years ago · comments

One of my former customers is having a hard time enabling HA in metrics by following the guidance in the OCP docs and some other things we found here in the issues list.

Specifically, the scale-up command given in the OCP docs does not seem accurate. It doesn't scale up the Cassandra pod, even though the docs imply it should by mentioning the storage requirements. Manually scaling up the Cassandra pods is not effective either. Deploying with CASSANDRA_NODES=3 from the template doesn't bring up a running metrics instance. The one thing we haven't tried yet is the Ansible installer method. Maybe the installer has some magic in there? Executing the following after scaling up to 3 Cassandra nodes fails:

oc exec <cassandra_pod> csql -e "ALTER KEYSPACE hawkular_metrics WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '2'} AND durable_writes = true;"
oc exec <cassandra_pod> nodetool repair -full

There are also concerns that the oc exec commands have only an ephemeral effect and would need to be repeated in a recovery scenario.
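If the two commands above do have to be repeated after a recovery, a small helper that regenerates them might make that less error-prone. The following is only a sketch under stated assumptions: the function names are hypothetical, the CQL shell inside the pod is assumed to be called cqlsh (the command above is written with csql), and the `--` separator is the usual way to pass a command to `oc exec`. It prints the commands rather than running them, so nothing here has been verified against a live cluster.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands instead of executing them.

# Build the ALTER KEYSPACE statement for a keyspace and replication factor.
build_alter_keyspace() {
  local keyspace="$1" rf="$2"
  printf "ALTER KEYSPACE %s WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '%s'} AND durable_writes = true;" "$keyspace" "$rf"
}

# Print (not execute) the two oc exec commands for one Cassandra pod.
print_scale_up_commands() {
  local pod="$1" rf="$2"
  # Assumption: the CQL shell binary in the pod is named cqlsh.
  printf 'oc exec %s -- cqlsh -e "%s"\n' "$pod" "$(build_alter_keyspace hawkular_metrics "$rf")"
  printf 'oc exec %s -- nodetool repair -full\n' "$pod"
}
```

Calling `print_scale_up_commands <cassandra_pod> 2` prints both commands for review; they could then be pasted or piped to a shell once they look right.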

Any ideas? I have a Portal ticket number with details for those with access.

lpsantil commented 7 years ago

Ping?

OpenShift Bot · Answer 1 · Tue Aug 11 2020 10:37:48 GMT+0800 (China Standard Time)

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

OpenShift Bot · Answer 2 · Thu Sep 10 2020 12:28:01 GMT+0800 (China Standard Time)

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

OpenShift Bot · Answer 3 · Sat Oct 10 2020 14:16:31 GMT+0800 (China Standard Time)

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

OpenShift CI Robot · Answer 4 · Sat Oct 10 2020 14:16:47 GMT+0800 (China Standard Time)

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.