Question on how to set up frequency and metric filters

Question

toni-moreno opened this issue 9 years ago · comments

I am very grateful for your great work.

I'm testing vmstats and it seems a promising solution to me, but there is a lack of documentation on how the tool gathers information.

Right now we are collecting data from a TEST vCenter with 5 ESX hosts and 5 VMs, using the default vmstats.properties (but without filters, to see all available metrics):

VCS_HOST=<myhost>
VCS_USER=<myuser>
VCS_PASS=<mypass>
VCS_TAG=vcs_tag
ESX_STATS=true
GRAPHITE_HOST=localhost
GRAPHITE_PORT=2003
GRAPHITE_TAG=vmstats
USE_FQDN=false
SLEEP_TIME=300
CACHED_LOOP_CYCLES=3600
MAX_VMSTAT_THREADS=8
MAX_ESXSTAT_THREADS=4
MAX_GRAPHITE_THREADS=7
SEND_ALL_PERIODS=true
SEND_ALL_ABSOLUTE=true
SEND_ALL_DELTA=true
STAT_EXCLUDES=
DISCONNECT_GRAPHITE_AFTER=0

With this config I cannot see what the gathering frequency will be (I would like 1 data point per minute, or 1 every 5 minutes). How can I configure it to get only 1 data point per minute for each metric?

Another question is about metric filtering. In my test I'm getting over 15000 metrics for only 5 ESX hosts and 5 VMs.

Is there any way to limit collection to the most basic metrics instead of getting everything?

I'm running with the -P flag (run only once) and it is getting metrics from 1 hour ago. Is this the expected behavior?

Nathan Haneysmith · Answer 1 · Sat Mar 07 2015 01:55:28 GMT+0800 (China Standard Time)

SLEEP_TIME is how often vmstats will connect to vCenter to gather metrics. SEND_ALL_PERIODS=true means that when vmstats connects, it will gather the highest resolution data that is available in vCenter and send all the data points since the last run. For us, that is 20s resolution, so every five minutes we gather and send 15 data points for each metric.
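The arithmetic behind "15 data points" can be checked with a small sketch. It assumes each run covers exactly SLEEP_TIME seconds and that vCenter exposes a fixed sample resolution (20 s in the setup described above). The function name is hypothetical and is not part of vmstats.

```python
def points_per_run(sleep_time_s: int, resolution_s: int) -> int:
    """Samples accumulated per metric between two consecutive runs."""
    if resolution_s <= 0:
        raise ValueError("resolution must be positive")
    return sleep_time_s // resolution_s


# SLEEP_TIME=300 with 20 s samples, as in the answer above
print(points_per_run(300, 20))
# SLEEP_TIME=60 with the same 20 s samples
print(points_per_run(60, 20))
```

Under the same assumptions, lowering SLEEP_TIME alone only shortens the window per run; the spacing between the samples that are sent stays at the vCenter resolution.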

Filtering is done with STAT_EXCLUDES, e.g.:
STAT_EXCLUDES=^datastore.*$, ^hbr.*$

Toni Moreno · Answer 2 · Mon Mar 09 2015 16:20:34 GMT+0800 (China Standard Time)

Hi @tmonk42, thank you for your fast response.

If I understood correctly, to get only 1 data point per minute I should configure:

SLEEP_TIME=60
SEND_ALL_PERIODS=false

And STAT_EXCLUDES supports regular expressions, but are they matched against the complete metric name?

Many thanks, I will test it right now.

Toni Moreno · Answer 3 · Mon Mar 09 2015 21:41:40 GMT+0800 (China Standard Time)

After reconfiguring with:

SLEEP_TIME=60
SEND_ALL_PERIODS=false

and running with the -D (debug) flag, the debug-gwriter-pool-5-thread-1.log file still shows metrics every 20 seconds...

What am I doing wrong?

vmstats.vcs_tag.cluster01.vm.hostname01.datastore.totalWriteLatency.53f727d1-6a1b3ef7-0fb1-8c89a588cb62.average 0 1425908020
vmstats.vcs_tag.cluster01.vm.hostname01.datastore.totalWriteLatency.53f727d1-6a1b3ef7-0fb1-8c89a588cb62.average 0 1425908040
vmstats.vcs_tag.cluster01.vm.hostname01.datastore.totalWriteLatency.53f727d1-6a1b3ef7-0fb1-8c89a588cb62.average 0 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.virtualDisk.write.scsi0:0.average 0 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.mem.zipSaved.latest 0 1425908020
vmstats.vcs_tag.cluster01.vm.hostname01.mem.zipSaved.latest 0 1425908040
vmstats.vcs_tag.cluster01.vm.hostname01.mem.zipSaved.latest 0 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.mem.decompressionRate.average 0 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.mem.swapped.average 0 1425908020
vmstats.vcs_tag.cluster01.vm.hostname01.mem.swapped.average 0 1425908040
vmstats.vcs_tag.cluster01.vm.hostname01.mem.swapped.average 0 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.usage.average 12 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.net.bytesRx.4000.average 0 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.ready.0.summation 2 1425908020
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.ready.0.summation 3 1425908040
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.ready.0.summation 2 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.disk.numberReadAveraged.naa_60a980003246686b5624437237486a61.average 0 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.system.summation 1 1425908020
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.system.summation 0 1425908040
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.system.summation 1 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.costop.summation 0 1425908020
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.costop.summation 0 1425908040
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.costop.summation 0 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.ready.summation 2 1425908020
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.ready.summation 3 1425908040
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.ready.summation 2 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.virtualDisk.smallSeeks.scsi0:0.latest 0 1425908020
vmstats.vcs_tag.cluster01.vm.hostname01.virtualDisk.smallSeeks.scsi0:0.latest 0 1425908040
vmstats.vcs_tag.cluster01.vm.hostname01.virtualDisk.smallSeeks.scsi0:0.latest 0 1425908060
vmstats.vcs_tag.cluster01.vm.hostname01.cpu.maxlimited.summation 0 1425908020
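To check the spacing between samples in a debug log like the one above, a short script can group the lines by metric path and compute the gaps between timestamps. This is only a sketch: it assumes each line has the three whitespace-separated fields shown above (path, value, Unix timestamp), and the helper name is hypothetical.

```python
from collections import defaultdict


def sample_intervals(lines):
    """Map each metric path to the gaps (in seconds) between its sorted timestamps."""
    stamps = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not "path value timestamp"
        path, _value, ts = parts
        stamps[path].append(int(ts))
    gaps = {}
    for path, ts_list in stamps.items():
        ts_list.sort()
        gaps[path] = [b - a for a, b in zip(ts_list, ts_list[1:])]
    return gaps


# A few lines copied from the log excerpt above
log = [
    "vmstats.vcs_tag.cluster01.vm.hostname01.mem.swapped.average 0 1425908020",
    "vmstats.vcs_tag.cluster01.vm.hostname01.mem.swapped.average 0 1425908040",
    "vmstats.vcs_tag.cluster01.vm.hostname01.mem.swapped.average 0 1425908060",
    "vmstats.vcs_tag.cluster01.vm.hostname01.cpu.usage.average 12 1425908060",
]
for path, gap_list in sample_intervals(log).items():
    print(path, gap_list)
```

For these sample lines, mem.swapped.average has two gaps of 20 seconds each, while cpu.usage.average appears only once and therefore has no gaps.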

Toni Moreno · Answer 4 · Tue Mar 10 2015 17:06:15 GMT+0800 (China Standard Time)

Hi @tmonk42, after reviewing the code:

https://github.com/Nordstrom/vmstats/blob/master/src/main/java/org/timconrad/vmstats/statsGrabber.java#L179-L202

I've seen that the correct way to set up 1 data point per minute is with:

SLEEP_TIME=60
SEND_ALL_PERIODS=false
SEND_ALL_ABSOLUTE=false
SEND_ALL_DELTA=false

Toni Moreno · Answer 5 · Tue Mar 10 2015 19:55:44 GMT+0800 (China Standard Time)

Hi @tmonk42, can you help me understand the meaning of the CACHED_LOOP_CYCLES parameter?
Why exactly is vmstats caching metrics?
To prevent metric loss when the remote Graphite is down?
To prevent metric loss on slow connections?

Toni Moreno · Answer 6 · Tue Mar 10 2015 23:24:00 GMT+0800 (China Standard Time)

Hi @tmonk42, after reviewing the code I can see the following.

STAT_EXCLUDES excludes all metrics from a group, and each pattern must match the complete metric group name.

https://github.com/Nordstrom/vmstats/blob/master/src/main/java/org/timconrad/vmstats/Main.java#L240-L242

That is, STAT_EXCLUDES cannot filter individual metric names.
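The difference between group-level and name-level filtering can be illustrated with a short Python sketch. It is not the vmstats code: it assumes the exclude value is a comma-separated list of regular expressions and that each one is tested against the whole group name (for example cpu, mem, datastore). Both function names are hypothetical.

```python
import re


def parse_excludes(raw: str):
    """Split a comma-separated exclude value into compiled patterns."""
    return [re.compile(p.strip()) for p in raw.split(",") if p.strip()]


def is_excluded(group: str, patterns) -> bool:
    """True when any pattern matches the entire group name."""
    return any(p.fullmatch(group) for p in patterns)


patterns = parse_excludes("datastore, hbr")
for group in ["cpu", "datastore", "hbr", "mem", "virtualDisk"]:
    print(group, "excluded" if is_excluded(group, patterns) else "kept")

# A pattern aimed at one metric name never matches the bare group name,
# so it has no effect under group-level matching.
name_level = parse_excludes(r"datastore\.totalWriteLatency")
print(is_excluded("datastore", name_level))
```

Under these assumptions, excluding a single metric such as datastore.totalWriteLatency would need the filter to be applied to the full metric name rather than to the group.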

lbasavap · Answer 7 · Fri Mar 13 2015 07:42:38 GMT+0800 (China Standard Time)

Hello Toni,
Please take a look at StatsFeeder with GraphiteReceiver plug-in.

Thanks

Lava

Toni Moreno · Answer 8 · Fri Mar 13 2015 13:47:55 GMT+0800 (China Standard Time)

Hi @lbasavap, I've already implemented metric filtering in the following branch (https://github.com/toni-moreno/vmstats/tree/improved_metric_selection), but I'm still waiting for this PR to be merged first (#19).

Why is gathering data with https://labs.vmware.com/flings/statsfeeder better than with the vijava SDK?

How many VMs and ESX hosts are you collecting data from with your https://github.com/lbasavap/GraphiteReceiver?