Can't disable st2timersengine
DesireWithin opened this issue
SUMMARY
I followed the documentation (https://docs.stackstorm.com/reference/ha.html#blueprint-box) to install a highly available st2 setup,
but I can't disable st2timersengine after adding the following to /etc/st2/st2.conf:
[timer]
enable = False
STACKSTORM VERSION
st2 3.8.0, on Python 3.6.9
OS, environment, install method
Ubuntu 18.04.6, installed via apt.
Steps to reproduce the problem
Add the configuration, then restart st2:
root@prod-stackstorm-03:/etc/apt/sources.list.d# tail -n 10 /etc/st2/st2.conf
...
db_name = st2
username = stackstorm
password = XXXX
compressors = zstd
[coordination]
url = redis://:Redis_XXXXX@10.XX.XX.XXX:6379
[timer]
enable = False
root@prod-stackstorm-03:/etc/st2# st2ctl restart
Failed to stop st2chatops.service: Unit st2chatops.service not loaded.
Failed to start st2chatops.service: Unit st2chatops.service not found.
##### st2 components status #####
st2actionrunner PID: 102513
st2actionrunner PID: 102515
st2actionrunner PID: 102517
st2actionrunner PID: 102519
st2actionrunner PID: 102521
st2actionrunner PID: 102523
st2actionrunner PID: 102525
st2actionrunner PID: 102527
st2actionrunner PID: 102529
st2actionrunner PID: 102531
st2api PID: 102539
st2stream PID: 102549
st2auth PID: 102559
st2garbagecollector PID: 102562
st2notifier PID: 102565
st2rulesengine PID: 102569
st2sensorcontainer PID: 102572
st2chatops is not running.
st2timersengine PID: 102577
st2workflowengine PID: 102580
st2scheduler PID: 102583
Expected Results
I expect st2timersengine not to be running.
Actual Results
st2timersengine is still running (PID 102577 in the output above), and I now get duplicate rule evaluations.
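To confirm which value st2 would actually find in the file, it may help to read the setting back per section. The snippet below is only a minimal sketch: `ini_get` is a hypothetical helper, not an st2 command, and it only handles plain `key = value` lines. Checking `[timersengine]` as well as `[timer]` is an assumption on my part that the option may live under a differently named section in this release; compare against the sample config shipped with the installed version.

```shell
#!/bin/bash
# ini_get FILE SECTION KEY: print the value of KEY inside [SECTION].
# Hypothetical helper for illustration; not part of st2.
ini_get() {
    awk -v section="$2" -v key="$3" '
        /^[ \t]*\[/ {                          # section header such as [timer]
            gsub(/^[ \t]*\[|\][ \t]*$/, "")    # strip the brackets
            current = $0
            next
        }
        current == section && index($0, "=") > 0 {
            name  = substr($0, 1, index($0, "=") - 1)
            value = substr($0, index($0, "=") + 1)
            gsub(/[ \t]/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) print value
        }
    ' "$1"
}

# Usage sketch (commented out so the block has no side effects):
# for s in timer timersengine; do
#     echo "[$s] enable = $(ini_get /etc/st2/st2.conf "$s" enable)"
# done
```

An empty value for a section means the key is not set there, so whatever default the service uses for that section would apply.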
What coordination backend are you using? I've observed this behaviour in an HA setup that uses a Redis cluster as the coordination backend. Until a fix is found and released, I'm working around it with a simple lock in the workflow, built on the st2 kv store. (This could be adapted into an action that any workflow can call.)
version: 1.0

vars:
  - check_lock_delay: 2

tasks:
  write_execution_id:
    action: st2.kv.set
    input:
      key: <% ctx(st2).action %>_exec_id
      value: <% ctx(st2).action_execution_id %>
    next:
      - when: <% succeeded() %>
        do: wait_to_check_lock

  # Delay to allow all nodes to write to the kv store. (Adjust if nodes are heavily loaded and exceed delay)
  wait_to_check_lock:
    action: core.local
    input:
      cmd: sleep <% ctx(check_lock_delay) %>
    next:
      - when: <% succeeded() %>
        do: read_execution_id

  read_execution_id:
    action: st2.kv.get
    input:
      key: <% ctx(st2).action %>_exec_id
    next:
      - when: <% succeeded() and result().result = ctx().st2.action_execution_id %>
        do: proceed

  proceed:
    action: core.local
    input:
      cmd: echo "ONLY A SINGLE WORKFLOW SHOULD REACH HERE"
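The lock in this workflow boils down to "last writer wins": every duplicate execution writes its own id under the same key, waits, and then proceeds only if the stored id is still its own. Here is a rough shell sketch of that logic, with a plain file standing in for the st2 kv store. `LOCK_FILE`, `write_execution_id` and `holds_lock` are made-up names for illustration, and the delay between the two steps is left out.

```shell
#!/bin/bash
# Last-writer-wins lock; a plain file stands in for the st2 kv store.
# LOCK_FILE and both function names are hypothetical, for illustration only.
LOCK_FILE="${LOCK_FILE:-/tmp/demo_exec_id}"

# Step 1: every duplicate execution records its own id under the shared key,
# overwriting whatever an earlier duplicate stored.
write_execution_id() {
    printf '%s\n' "$1" > "$LOCK_FILE"
}

# Step 3 (after the delay): succeed only for the execution whose id survived.
holds_lock() {
    [ "$(cat "$LOCK_FILE" 2>/dev/null)" = "$1" ]
}
```

If `exec-a` and then `exec-b` both call `write_execution_id`, only `holds_lock exec-b` succeeds. As in the workflow, this relies on the delay being long enough for every duplicate to have written before any of them reads.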
Yes, I'm using Redis as the coordination backend. I am looking for a solution that uses haproxy to monitor the st2timersengine process.
I use keepalived to make sure only one st2timersengine is running.
MASTER config:

global_defs {
    # notification_email {
    #     your_email@example.com
    # }
    # notification_email_from keepalived@your_server.com
    # smtp_server localhost
    # smtp_connect_timeout 30
    router_id LVS_DEVEL
}

vrrp_script chk_program {
    script "/etc/keepalived/check_program.sh"
    interval 2
    weight -2
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface ens4
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass Sts_platform
    }
    track_script {
        chk_program
    }
    notify_master "/etc/keepalived/start_program.sh"
    notify_backup "/etc/keepalived/stop_program.sh"
}

BACKUP config:

global_defs {
    # notification_email {
    #     your_email@example.com
    # }
    # notification_email_from keepalived@your_server.com
    # smtp_server localhost
    # smtp_connect_timeout 30
    router_id LVS_DEVEL
}

vrrp_script chk_program {
    script "/etc/keepalived/check_program.sh"
    interval 2
    weight -2
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens4
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass Sts_platform
    }
    track_script {
        chk_program
    }
    notify_master "/etc/keepalived/start_program.sh"
    notify_backup "/etc/keepalived/stop_program.sh"
}

scripts:

check_program.sh

#!/bin/bash
status=$(systemctl status st2timersengine.service)
if [ $? -eq 0 ]; then
    echo "st2timersengine.service is running normally."
    exit 0
else
    echo "Error: st2timersengine.service is not running normally."
    exit 1
fi

start_program.sh

#!/bin/bash
systemctl restart st2timersengine.service

stop_program.sh

#!/bin/bash
systemctl stop st2timersengine.service
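
The start and stop scripts could probably be folded into one script wired up with keepalived's generic `notify` option, which calls the script with the arguments TYPE, NAME, STATE and priority. This is an untested sketch, not what I run: `timers_action` is a made-up helper name, and stopping the service on FAULT is my own choice, since the configs above only handle MASTER and BACKUP.

```shell
#!/bin/bash
# keepalived runs a "notify" script as: <script> TYPE NAME STATE [PRIORITY]
# where STATE is MASTER, BACKUP or FAULT.
# timers_action is a hypothetical helper mapping a VRRP state to a systemctl verb.
timers_action() {
    case "$1" in
        MASTER)       echo restart ;;  # this node now owns the timers
        BACKUP|FAULT) echo stop ;;     # another node owns them, or this one is unhealthy
        *)            return 1 ;;      # unknown state: do nothing
    esac
}

# Act only when a state argument was supplied, so sourcing this file is harmless.
if [ -n "${3:-}" ]; then
    verb="$(timers_action "$3")" && systemctl "$verb" st2timersengine.service
fi
```

With this in place, a single `notify "/etc/keepalived/<script>"` line in `vrrp_instance` would replace the separate `notify_master` and `notify_backup` entries; the script path is whatever you choose to install it as.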