jitlogic / zorka

Sophisticated monitoring agent for Java

Home Page: http://zorka.io

Active agent items are flapping between supported/not supported state

Art3mK opened this issue

Hi,

I'm seeing strange behavior with active checks: the item status is flapping between the supported and not supported states. Relevant log records follow.
Zabbix server log:

12361:20160426:121956.557 trapper got '{"request":"active checks","host_metadata":"","port":10055,"host":"ipa-test-mule-02.example.com"}'
12361:20160426:121956.557 In send_list_of_active_checks_json()
12361:20160426:121956.557 In get_hostid_by_host() host:'ipa-test-mule-02.example.com'
12361:20160426:121956.557 query [txnlev:0] [select hostid,status,tls_accept,tls_issuer,tls_subject,tls_psk_identity from hosts where host='ipa-test-mule-02.example' and status in (0,1) and flags<>2 and proxy_hostid is null]
12361:20160426:121956.558 End of get_hostid_by_host():SUCCEED
12361:20160426:121956.558 query [txnlev:0] [select itemid from items where type=7 and flags<>2 and hostid=10110]
12361:20160426:121956.558 In substitute_key_macros() data:'jvm.fdutil[]'
12361:20160426:121956.558 End of substitute_key_macros():SUCCEED data:'jvm.fdutil[]'
12361:20160426:121956.558 send_list_of_active_checks_json() sending [{"response":"success","data":[{"key":"jvm.fdutil[]","delay":30,"lastlogsize":0,"mtime":0}]}]
12361:20160426:121956.558 End of send_list_of_active_checks_json():SUCCEED

Everything looks OK. Then, again in the Zabbix server log:

12359:20160426:122008.863 trapper got '{"request":"active checks","host":"ipa-test-mule-02.example.com"}'
12359:20160426:122008.863 In send_list_of_active_checks_json()
12359:20160426:122008.863 In get_hostid_by_host() host:'ipa-test-mule-02.example.com'
12359:20160426:122008.863 query [txnlev:0] [select hostid,status,tls_accept,tls_issuer,tls_subject,tls_psk_identity from hosts where host='ipa-test-mule-02.example.com' and status in (0,1) and flags<>2 and proxy_hostid is null]
12359:20160426:122008.863 End of get_hostid_by_host():SUCCEED
12359:20160426:122008.864 query [txnlev:0] [select itemid from items where type=7 and flags<>2 and hostid=10110]
12359:20160426:122008.864 In substitute_key_macros() data:'jvm.fdutil[]'
12359:20160426:122008.864 End of substitute_key_macros():SUCCEED data:'jvm.fdutil[]'
12359:20160426:122008.864 send_list_of_active_checks_json() sending [{"response":"success","data":[{"key":"jvm.fdutil[]","delay":30,"lastlogsize":0,"mtime":0}]}]
12359:20160426:122008.864 End of send_list_of_active_checks_json():SUCCEED
12359:20160426:122008.864 __zbx_zbx_setproctitle() title:'trapper #1 [processed data in 0.001284 sec, waiting for connection]'
12376:20160426:122008.879 __zbx_zbx_setproctitle() title:'self-monitoring [processing data]'
12376:20160426:122008.879 In collect_selfmon_stats()
12376:20160426:122008.879 End of collect_selfmon_stats()
12376:20160426:122008.879 __zbx_zbx_setproctitle() title:'self-monitoring [processed data in 0.000064 sec, idle 1 sec]'
12360:20160426:122009.867 __zbx_zbx_setproctitle() title:'trapper #2 [processing data]'
12360:20160426:122009.867 trapper got '{"request":"agent data","data":[{"host":"ipa-test-mule-02.example.com","key":"jvm.fdutil[]","value":"Unsupported item key.","state":1,"clock":1461662409,"ns":743142094}],"clock":1461662410,"ns":744940745}'
12360:20160426:122009.867 In recv_agenthistory()
12360:20160426:122009.867 In process_hist_data()
12360:20160426:122009.867 In process_mass_data()
12360:20160426:122009.867 End of process_mass_data()
12360:20160426:122009.867 End of process_hist_data():SUCCEED
12360:20160426:122009.867 In zbx_send_response()
12360:20160426:122009.867 zbx_send_response() '{"response":"success","info":"processed: 1; failed: 0; total: 1; seconds spent: 0.000107"}'
12360:20160426:122009.867 End of zbx_send_response():SUCCEED
12360:20160426:122009.867 End of recv_agenthistory()
12360:20160426:122009.867 __zbx_zbx_setproctitle() title:'trapper #2 [processed data in 0.000386 sec, waiting for connection]'
12370:20160426:122009.878 query [txnlev:1] [begin;]
12370:20160426:122009.878 In DCmass_update_items()
12370:20160426:122009.878 item "ipa-test-mule-02.example.com:jvm.fdutil[]" became not supported: Unsupported item key.
12370:20160426:122009.878 In DCadd_nextcheck()
12370:20160426:122009.878 End of DCadd_nextcheck()
12370:20160426:122009.878 query [txnlev:1] [update items set state=1,error='Unsupported item key.' where itemid=24048;]
12370:20160426:122009.879 End of DCmass_update_items()
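
The `trapper got` lines above carry the full JSON that each client sent, so they can be compared mechanically. The following is a minimal Python sketch, not part of Zorka or Zabbix; the function name `parse_trapper_line` and the regular expression are assumptions based only on the log format shown in this issue. It extracts each payload and prints which fields the request carried and which reported values have `state` set to 1.

```python
import json
import re

# Assumed layout, taken from the excerpts in this issue:
#   <pid>:<YYYYMMDD>:<HHMMSS.mmm> trapper got '<json>'
TRAPPER_RE = re.compile(r"^(\d+):(\d{8}):(\d{6}\.\d{3}) trapper got '(.*)'$")


def parse_trapper_line(line):
    """Return pid, date, time and the decoded JSON payload, or None if no match."""
    match = TRAPPER_RE.match(line.strip())
    if match is None:
        return None
    pid, date, time, payload = match.groups()
    return {"pid": int(pid), "date": date, "time": time,
            "payload": json.loads(payload)}


# Three lines copied from the server log above.
SAMPLE = """\
12361:20160426:121956.557 trapper got '{"request":"active checks","host_metadata":"","port":10055,"host":"ipa-test-mule-02.example.com"}'
12359:20160426:122008.863 trapper got '{"request":"active checks","host":"ipa-test-mule-02.example.com"}'
12360:20160426:122009.867 trapper got '{"request":"agent data","data":[{"host":"ipa-test-mule-02.example.com","key":"jvm.fdutil[]","value":"Unsupported item key.","state":1,"clock":1461662409,"ns":743142094}],"clock":1461662410,"ns":744940745}'
""".splitlines()

requests = [parse_trapper_line(line) for line in SAMPLE]
for entry in requests:
    payload = entry["payload"]
    # Items reported with state 1 are the ones the server marks not supported.
    unsupported = [d["key"] for d in payload.get("data", []) if d.get("state") == 1]
    print(entry["time"], payload["request"], sorted(payload), unsupported)
```

On these three lines, the first request carries `host_metadata` and `port` while the second carries only `host`. That is simply what the logged payloads show.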

It looks like Zorka requested the list of active checks again and the Zabbix server sent it, but a second later Zorka reported that the item is not supported? However, there is no record of that in the Zorka log:

2016-04-26 12:19:57 DEBUG ZabbixUtils Message: '{"request":"active checks","host_metadata":"","port":10055,"host":"ipa-test-mule-02.example.com"}'
2016-04-26 12:19:57 DEBUG ZabbixUtils Message length: 103
2016-04-26 12:19:57 DEBUG ZabbixActiveRequest Zorka send: ZBXDg{"request":"active checks","host_metadata":"","port":10055,"host":"ipa-test-mule-02.example.com"}
2016-04-26 12:19:57 DEBUG ZabbixActiveRequest Zorka get:{"response":"success","data":[{"key":"jvm.fdutil[]","delay":30,"lastlogsize":0,"mtime":0}]}
2016-04-26 12:19:57 DEBUG ZabbixActiveAgent ZabbixActive - schedule Tasks: {response=success, data=[{key=jvm.fdutil[], delay=30, lastlogsize=0, mtime=0}]}
2016-04-26 12:19:57 DEBUG ZabbixActiveAgent ZabbixActive - task: {key=jvm.fdutil[], delay=30, lastlogsize=0, mtime=0}
2016-04-26 12:19:57 DEBUG ZabbixActiveAgent ZabbixActive - new scheduled tasks: 1
2016-04-26 12:19:57 DEBUG ZabbixActiveAgent ZabbixActive - deleted old tasks: 1
2016-04-26 12:20:02 DEBUG ZabbixActiveTask Running task: jvm.fdutil[]
2016-04-26 12:20:02 DEBUG ZabbixActiveTask Translated task: jvm.fdutil()
2016-04-26 12:20:02 DEBUG ZorkaBshAgent Processing request BSH expression: jvm.fdutil()
2016-04-26 12:20:02 DEBUG ZabbixActiveTask Task response: jvm.fdutil[] -> 9.912109375
2016-04-26 12:20:02 DEBUG ZabbixActiveTask Cache size: 1
2016-04-26 12:20:32 DEBUG ZabbixActiveTask Running task: jvm.fdutil[]
2016-04-26 12:20:32 DEBUG ZabbixActiveTask Translated task: jvm.fdutil()
2016-04-26 12:20:32 DEBUG ZorkaBshAgent Processing request BSH expression: jvm.fdutil()
2016-04-26 12:20:32 DEBUG ZabbixActiveTask Task response: jvm.fdutil[] -> 9.912109375
2016-04-26 12:20:32 DEBUG ZabbixActiveTask Cache size: 2
2016-04-26 12:20:57 DEBUG ZabbixActiveSenderTask ZabbixActiveSender run...
2016-04-26 12:20:57 DEBUG ZabbixActiveSenderTask ZabbixActiveSender 2 items cached
2016-04-26 12:20:57 DEBUG ZabbixUtils Message: '{"clock":1461662457,"data":[{"clock":1461662402,"host":"ipa-test-mule-02.example.com","key":"jvm.fdutil[]","lastlogsize":0,"value":"9.912109375"},{"clock":1461662432,"host":"ipa-test-mule-02.example.com","key":"jvm.fdutil[]","lastlogsize":0,"value":"9.912109375"}],"request":"agent data"}'
2016-04-26 12:20:57 DEBUG ZabbixUtils Message length: 300
2016-04-26 12:20:57 DEBUG ZabbixActiveRequest Zorka send: ZBXD,{"clock":1461662457,"data":[{"clock":1461662402,"host":"ipa-test-mule-02.example.com","key":"jvm.fdutil[]","lastlogsize":0,"value":"9.912109375"},{"clock":1461662432,"host":"ipa-test-mule-02.example.com","key":"jvm.fdutil[]","lastlogsize":0,"value":"9.912109375"}],"request":"agent data"}
2016-04-26 12:20:57 DEBUG ZabbixActiveSenderTask ZabbixActiveSender message sent: {"clock":1461662457,"data":[{"clock":1461662402,"host":"ipa-test-mule-02.example.com","key":"jvm.fdutil[]","lastlogsize":0,"value":"9.912109375"},{"clock":1461662432,"host":"ipa-test-mule-02.example.com","key":"jvm.fdutil[]","lastlogsize":0,"value":"9.912109375"}],"request":"agent data"}
2016-04-26 12:20:57 DEBUG ZabbixActiveRequest Zorka get:{"response":"success","info":"processed: 2; failed: 0; total: 2; seconds spent: 0.000128"}
2016-04-26 12:20:57 DEBUG ZabbixActiveSenderTask ZabbixActiveSender 2 items removed from cache
2016-04-26 12:20:57 DEBUG ZabbixActiveSenderTask ZabbixActiveSender finished
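
The `ZBXDg` and `ZBXD,` prefixes in the Zorka log above look like the Zabbix protocol header printed as text. In the Zabbix 3.0-era protocol a packet starts with the bytes `ZBXD`, a flag byte 0x01, and an 8-byte little-endian payload length. A length of 103 has low byte 0x67 (`g`) and a length of 300 has low byte 0x2C (`,`), which matches the logged "Message length" values. The Python sketch below illustrates that framing; the helper names `frame` and `unframe` are made up for this example and are not Zorka's API.

```python
import struct

# "ZBXD" signature followed by the protocol flag byte 0x01.
HEADER = b"ZBXD\x01"


def frame(payload: bytes) -> bytes:
    """Prefix a payload with the header and an 8-byte little-endian length."""
    return HEADER + struct.pack("<Q", len(payload)) + payload


def unframe(packet: bytes) -> bytes:
    """Check the header and declared length, then return the payload."""
    if packet[:5] != HEADER:
        raise ValueError("missing ZBXD header")
    (length,) = struct.unpack("<Q", packet[5:13])
    body = packet[13:13 + length]
    if len(body) != length:
        raise ValueError("truncated packet")
    return body


# Dummy payloads with the two lengths reported in the Zorka log.
print(frame(b"x" * 103)[:6])   # b'ZBXD\x01g'
print(frame(b"x" * 300)[:7])   # b'ZBXD\x01,\x01'

# Round trip of a small request body.
request = b'{"request":"active checks"}'
print(unframe(frame(request)) == request)
```

The non-printable bytes (0x01 and the zero padding of the length field) presumably do not show up in the log, which would leave only `ZBXD` plus the one printable length byte visible.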

So, at 12:20:56 by the server's clock (12:20:57 in the Zorka log), Zorka sends data to Zabbix and the item becomes supported again:

12360:20160426:122056.506 __zbx_zbx_setproctitle() title:'trapper #2 [processing data]'
12360:20160426:122056.506 trapper got '{"clock":1461662457,"data":[{"clock":1461662402,"host":"ipa-test-mule-02.example.com","key":"jvm.fdutil[]","lastlogsize":0,"value":"9.912109375"},{"clock":1461662432,"host":"ipa-test-mule-02.example.com","key":"jvm.fdutil[]","lastlogsize":0,"value":"9.912109375"}],"request":"agent data"}'
...
12360:20160426:122056.506 End of process_hist_data():SUCCEED
12360:20160426:122056.506 In zbx_send_response()
12360:20160426:122056.506 zbx_send_response() '{"response":"success","info":"processed: 2; failed: 0; total: 2; seconds spent: 0.000128"}'
12360:20160426:122056.506 End of zbx_send_response():SUCCEED
...
12371:20160426:122056.899 query [txnlev:1] [begin;]
12371:20160426:122056.900 In DCmass_update_items()
12371:20160426:122056.900 item "ipa-test-mule-02.example.com:jvm.fdutil[]" became supported
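
To measure the flapping, the `became supported` and `became not supported` lines can be pulled out of the server log and paired up. This is a rough Python sketch that assumes the line format seen in the excerpts in this issue; `state_changes` is a hypothetical helper, not a Zabbix tool.

```python
import re
from datetime import datetime

# Assumed layout: <pid>:<YYYYMMDD>:<HHMMSS.mmm> item "<host>:<key>" became <state>
# "not supported" is listed first so it is not cut short by the "supported" branch.
STATE_RE = re.compile(
    r'^\d+:(\d{8}):(\d{6})\.\d{3} item "([^"]+)" became (not supported|supported)'
)


def state_changes(lines):
    """Return (timestamp, item, is_supported) tuples in log order."""
    changes = []
    for line in lines:
        match = STATE_RE.match(line.strip())
        if match is None:
            continue
        date, time, item, state = match.groups()
        when = datetime.strptime(date + time, "%Y%m%d%H%M%S")
        changes.append((when, item, state == "supported"))
    return changes


# The two state-change lines from the server log excerpts in this issue.
LOG = """\
12370:20160426:122009.878 item "ipa-test-mule-02.example.com:jvm.fdutil[]" became not supported: Unsupported item key.
12371:20160426:122056.900 item "ipa-test-mule-02.example.com:jvm.fdutil[]" became supported
""".splitlines()

changes = state_changes(LOG)
for (t0, item, _), (t1, _, now_supported) in zip(changes, changes[1:]):
    label = "supported" if now_supported else "not supported"
    print(item, "->", label, "after", (t1 - t0).total_seconds(), "s")
```

The sketch drops milliseconds and does not group by item, so a log covering several items would need the transitions split per item first.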

Is there a bug in Zorka, or am I doing something wrong?

Software versions:
Zabbix server v. 3.0.1, CentOS 7.2.1511
agent side:
zorka v. 1.0.15
jdk 1.8.0_77
CentOS 7.2.1511

zorka.properties:

[root@ipa-test-mule-02 zorka]# egrep -v '^#|^$' zorka.properties
scripts = jvm.bsh, zabbix.bsh, apps/muleesb.bsh
mule.stats = yes
zabbix.active = yes
zabbix.active.server.addr = 192.168.142.16:10051
zabbix.server.addr = 127.0.0.1,192.168.142.16
zabbix.listen.port = 10055
zorka.hostname = ipa-test-mule-02.example.com
http.trace.exclude = ~.*.png, ~.*.gif, ~.*.js, ~.*.css, ~.*.jpg, ~.*.jpeg, ~.*favicon.ico
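
For checking the effective settings in a script, roughly the same filtering as the `egrep` call can be done in Python. This is a simplified sketch with a hypothetical `load_properties` helper. It strips surrounding whitespace before testing each line, and it ignores parts of the Java `.properties` format such as `!` comments, `:` separators, and line continuations.

```python
def load_properties(text):
    """Parse simple key = value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # keep only lines that actually contain '='
            props[key.strip()] = value.strip()
    return props


# Settings copied from the zorka.properties listing above, plus one comment line.
TEXT = """\
# active check settings
zabbix.active = yes
zabbix.active.server.addr = 192.168.142.16:10051
zabbix.server.addr = 127.0.0.1,192.168.142.16
zabbix.listen.port = 10055
zorka.hostname = ipa-test-mule-02.example.com
"""

props = load_properties(TEXT)
print(props["zabbix.active"], props["zabbix.active.server.addr"])
```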

Thanks!

Some fixes were made by dd00ff and are included in 1.0.16. I'm closing this issue for now; sorry for reacting so late. If the problem still occurs, please reopen it.