Icinga / icingaweb2-module-vspheredb

The easiest way to monitor a VMware vSphere environment.

Home Page: https://icinga.com/docs/vsphere/latest

Overriding monitoring rules

Virsacer opened this issue

Expected Behavior

When having "Global Monitoring Rules" and "Folder Monitoring Rules" with overlapping settings, the latter should "win".
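The expected precedence can be sketched as a simple merge from the least specific level to the most specific one. This is only an illustration of the behaviour I would expect, not the module's actual code; `resolve_settings` is a hypothetical helper, and the setting names are the ones shown by `--inspect` further down.

```python
from typing import Dict, List

Settings = Dict[str, str]


def resolve_settings(levels: List[Settings]) -> Settings:
    """Merge rule settings ordered from least specific (global) to most
    specific (the folder closest to the object). Later levels override
    earlier ones, so the most specific value wins.

    Hypothetical helper for illustration only, not part of vspheredb.
    """
    effective: Settings = {}
    for level in levels:
        effective.update(level)
    return effective


# Values taken from this report: "Do nothing" globally,
# "Trigger a Critical state" on the folder.
global_rules = {"trigger_on_poweredOff": "ignore"}
folder_rules = {
    "trigger_on_poweredOff": "critical",
    "trigger_on_suspended": "critical",
}

effective = resolve_settings([global_rules, folder_rules])
print(effective["trigger_on_poweredOff"])
```

With this ordering a powered-off VM in that folder would be reported as critical, which is what I expected to see.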

Current Behavior

I have set "When powered off" to "Do nothing" for VMs globally.
But for one folder I have set it to "Trigger a Critical state".
Unfortunately the global setting is used for VMs in that folder:

  [OK] Power State
   \_ [OK] Virtual Machine has been powered off

Your Environment

  • VMware vCenter®/ESXi™-Version: 7.0.3
  • Version/GIT-Hash of this module: 1.7.1
  • Icinga Web 2 version: 2.11.4
  • Operating System and version: Oracle Linux 7
  • Webserver, PHP versions: PHP 7.3.33

Please take a look at --inspect. Does it reflect what you're seeing?

Hi, I did not know that parameter...

  [OK] Power State (--rule ObjectStatePolicy/PowerState)
   trigger_on_poweredOff = "ignore"
   trigger_on_suspended = "critical" (inherited from AlwaysOnFolder)
   trigger_on_unknown = "critical" (inherited from AlwaysOnFolder)
   \_ [OK] Virtual Machine has been powered off

So "trigger_on_poweredOff" comes from Global, but it should be overridden by the value set on "AlwaysOnFolder".

mysql --binary-as-hex vspheredb -e 'SELECT * FROM monitoring_rule_set\G'
*************************** 1. row ***************************
  object_uuid: 0x
object_folder: vm
     settings: {"ObjectStatePolicy/PowerState/critical_for_uptime_greater_than_days":999,"ObjectStatePolicy/PowerState/trigger_on_poweredOff":"ignore","ObjectStatePolicy/PowerState/warning_for_uptime_greater_than_days":999}
*************************** 2. row ***************************
  object_uuid: 0x499A6581CE425D67B70D22D33CE5DEC1
object_folder: vm
     settings: {"ObjectStatePolicy/PowerState/trigger_on_poweredOff":"critical","ObjectStatePolicy/PowerState/trigger_on_suspended":"critical","ObjectStatePolicy/PowerState/trigger_on_unknown":"critical"}

mysql --binary-as-hex vspheredb -e "SELECT * FROM object WHERE object_name = 'AlwaysOnFolder'\G"
+------------------------------------+------------------------------------+---------------+----------------+-------------+----------------+-------+------------------------------------+------+
| uuid                               | vcenter_uuid                       | moref         | object_name    | object_type | overall_status | level | parent_uuid                        | tags |
+------------------------------------+------------------------------------+---------------+----------------+-------------+----------------+-------+------------------------------------+------+
| 0x499A6581CE425D67B70D22D33CE5DEC1 | 0x0BD3C813BF9240FF8EE78E6E26FB44D3 | group-v140507 | AlwaysOnFolder | Folder      | gray           |     5 | 0xE3488B78CF4759CA96C2EBC4EEE5C6BD | []   |
+------------------------------------+------------------------------------+---------------+----------------+-------------+----------------+-------+------------------------------------+------+

Looks good to me... strange

There used to be related bugs, but as you're running v1.7.1, they should all have been fixed. Could you please try setting it to "Trigger a warning" on some other folder between AlwaysOnFolder and your root? Does that change anything?

  [WARNING] Power State (--rule ObjectStatePolicy/PowerState)
   critical_for_uptime_greater_than_days = 999
   trigger_on_poweredOff = "warning" (inherited from Kunden)
   trigger_on_suspended = "critical" (inherited from AlwaysOnFolder)
   trigger_on_unknown = "critical" (inherited from AlwaysOnFolder)
   warning_for_uptime_greater_than_days = 999
   warning_for_uptime_less_than = 900
   \_ [WARNING] Virtual Machine has been powered off

Looks like it only happens when a setting is on both root and leaf.
When it is on root and in the middle, or only at the leaf, it works...

If you remove it on the leaf, and set it one level above - does it then work?

I did some more tests and always set the same value for all three parameters:

When set only on the leaf, it works fine.
As soon as it is set on ANY other level(s), the leaf is ignored.
When it is set on multiple non-leaf levels, the value from the lowest of them (the one closest to the leaf) wins, but never the value from the leaf itself.
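
The three observations above can be written down as a small model and compared with what I would expect. This only models the reported symptoms, it is not the module's code and says nothing about the actual cause; `expected_value` and `reported_value` are hypothetical names, and each path lists the configured value per level from root to leaf (`None` means not set).

```python
from typing import List, Optional

Path = List[Optional[str]]  # configured values, ordered root -> leaf


def expected_value(path: Path) -> Optional[str]:
    """Expected: the value set closest to the leaf wins."""
    for value in reversed(path):
        if value is not None:
            return value
    return None


def reported_value(path: Path) -> Optional[str]:
    """Reported symptom: the leaf is only used when no other level sets
    the value; otherwise the closest non-leaf level wins."""
    *ancestors, leaf = path
    inherited = expected_value(ancestors)
    return inherited if inherited is not None else leaf


scenarios = [
    [None, None, "critical"],          # set only on the leaf
    ["ignore", None, "critical"],      # set on root and leaf
    ["ignore", "warning", "critical"], # set on root, middle and leaf
]
for path in scenarios:
    print(path, "expected:", expected_value(path), "reported:", reported_value(path))
```

The second and third scenarios correspond to the two --inspect outputs earlier in this thread, where "ignore" and "warning" were used instead of the leaf's "critical".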

Hi there, adding to this, as I'm seeing the same issue with the same behaviour (also running v1.7.1).
When "Enabled" is set to "Please choose" for any setting, the object is monitored as though the setting was enabled. Also, once monitoring thresholds are set on a branch, they take precedence over the thresholds of a leaf, even when the monitoring is set to "Please choose" on a branch or leaf.

Leaf:
image

Next closest branch:
image

Host in the leaf group:

[WARNING] Host System, according configured rules
   [OK] Object State Policy (--rule ObjectStatePolicy/*)
      [OK] Overall VMware Object State (--rule ObjectStatePolicy/VMwareObjectState)
       trigger_on_gray = "warning"
       trigger_on_red = "warning"
       trigger_on_yellow = "warning"
       \_ [OK] Overall VMware status is 'green'
      [OK] Power State (--rule ObjectStatePolicy/PowerState)
       critical_for_uptime_greater_than_days = 600
       critical_for_uptime_less_than = 0
       trigger_on_poweredOff = "ignore"
       trigger_on_suspended = "warning"
       trigger_on_unknown = "unknown"
       warning_for_uptime_greater_than_days = 365
       warning_for_uptime_less_than = 0
       \_ [OK] Host System is powered on
       \_ [OK] System booted 298d 2h ago
   [WARNING] Compute Resource Usage (--rule ComputeResourceUsage/*)
      [OK] CPU Usage (--rule ComputeResourceUsage/CpuUsage)
       critical_if_less_than_percent_free = 10 (inherited from [leaf])
       warning_if_less_than_percent_free = 30 (inherited from [leaf])
       \_ [OK] 3.13 GHz out of 57.5 GHz used, 54.3 GHz (94.54%) free
      [WARNING] Memory Usage (--rule ComputeResourceUsage/MemoryUsage)
       critical_if_less_than_percent_free = 2 (inherited from [leaf])
       threshold_precedence = "best_wins"
       warning_if_less_than_percent_free = 20 (inherited from [closest branch])
       \_ [WARNING] 79.32 GiB out of 511.70 GiB (15.50%) free
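
To make the result above easier to follow, here is a minimal sketch of how I read the "less than percent free" thresholds. It assumes a strict less-than comparison and ignores `threshold_precedence`; `state_for_free_percent` is a hypothetical helper, not the module's implementation.

```python
def state_for_free_percent(free_percent: float,
                           warning_below: float,
                           critical_below: float) -> str:
    """Map a free-capacity percentage to a check state.

    Assumption: a threshold triggers when the free percentage is strictly
    below it, and critical is checked before warning.
    """
    if free_percent < critical_below:
        return "CRITICAL"
    if free_percent < warning_below:
        return "WARNING"
    return "OK"


# Memory Usage: critical threshold from the leaf, warning threshold from the branch
memory_state = state_for_free_percent(15.50, warning_below=20, critical_below=2)
# CPU Usage: both thresholds from the leaf
cpu_state = state_for_free_percent(94.54, warning_below=30, critical_below=10)
print(memory_state, cpu_state)
```

The WARNING on Memory Usage comes from the branch's warning threshold of 20, while the critical threshold of 2 comes from the leaf, so thresholds from two different levels are mixed in one check.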

Additionally, the --inspect output doesn't mention when a setting is inherited from the global setting "All vCenters". I find that somewhat unintuitive, since all other inheritances are shown.

We just stumbled over this issue while trying to set individual limits for one datastore. The behaviour is exactly as described a few comments above: if there is a setting anywhere on the path to the leaf, the settings directly at the leaf are ignored.
Any idea whether there will be a fix in the near future?