m3db / m3

M3 monorepo - Distributed TSDB, Aggregator and Query Engine, Prometheus Sidecar, Graphite Compatible, Metrics Platform

Home Page: https://m3db.io/

metrics will be lost if writing in nanosecond

jerryzhen01 opened this issue

Filing M3 Issues

General Issues

General issues are any non-performance related issues (data integrity, ease of use, error messages, configuration, documentation, etc).

Please provide the following information along with a description of the issue that you're experiencing:

  1. What service is experiencing the issue? (M3Coordinator, M3DB, M3Aggregator, etc)
    Coordinator and M3DB

  2. What is the configuration of the service? Please include any YAML files, as well as namespace / placement configuration (with any sensitive information anonymized if necessary).

namespace:

{
  "registry": {
    "namespaces": {
      "default": {
        "aggregationOptions": {
          "aggregations": [
            {
              "aggregated": false,
              "attributes": null
            }
          ]
        },
        "bootstrapEnabled": true,
        "cacheBlocksOnRetrieve": false,
        "cleanupEnabled": true,
        "coldWritesEnabled": false,
        "extendedOptions": null,
        "flushEnabled": true,
        "indexOptions": {
          "blockSizeDuration": "30m0s",
          "enabled": true
        },
        "repairEnabled": false,
        "retentionOptions": {
          "blockDataExpiry": true,
          "blockDataExpiryAfterNotAccessPeriodDuration": "5m0s",
          "blockSizeDuration": "30m0s",
          "bufferFutureDuration": "2m0s",
          "bufferPastDuration": "10m0s",
          "futureRetentionPeriodDuration": "0s",
          "retentionPeriodDuration": "12h0m0s"
        },
        "runtimeOptions": null,
        "schemaOptions": null,
        "snapshotEnabled": true,
        "stagingState": {
          "status": "READY"
        },
        "writesToCommitLog": true
      }
    }
  }
}
  3. How are you using the service? For example, are you performing read/writes to the service via Prometheus, or are you using a custom script?
    I am using HTTP POST requests to write data to and read data from the Coordinator.

  4. Is there a reliable way to reproduce the behavior? If so, please provide detailed instructions.
    Yes.

Issue description

I am using the Coordinator HTTP API to write and read metrics.
My test metrics have nanosecond-precision timestamps, but the smallest time unit for querying seems to be the millisecond.
As a consequence, if I insert multiple datapoints within the same millisecond, only the last one shows up in the query (the rest are gone).
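To illustrate the suspected collision, here is a minimal Python sketch that truncates the two timestamps from this report to whole milliseconds. The helper `to_millis` is hypothetical and written only for this illustration; it is not part of M3, and the assumption that timestamps are reduced to millisecond resolution is my reading of the observed behavior, not something confirmed from the M3 source.

```python
from decimal import Decimal


def to_millis(ts: str) -> int:
    """Truncate a decimal-seconds timestamp string to whole milliseconds.

    Hypothetical helper for illustration only; Decimal avoids the
    rounding error a float would introduce at nanosecond precision.
    """
    return int(Decimal(ts) * 1000)


# The two timestamps written in this report differ by 10 nanoseconds.
first = to_millis("1634799781.490007440")
second = to_millis("1634799781.490007450")

# Both land on the same millisecond, so they can no longer be told apart.
print(first, second, first == second)
```

If the timestamps really are reduced to milliseconds somewhere on the write or read path, the two datapoints end up with identical timestamps for the same series.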

Reproducing the issue

The script below inserts two datapoints into M3DB and then runs a range query.


curl -X POST http://localhost:7201/api/v1/json/write -d '{
    "tags":
      {
        "__name__": "test",
        "type": "3"
      },
      "timestamp": "1634799781.490007440",
      "value": 140
    }'


curl -X POST http://localhost:7201/api/v1/json/write -d '{
    "tags":
      {
        "__name__": "test",
        "type": "3"
      },
      "timestamp": "1634799781.490007450",
      "value": 150
    }'


curl -X POST -G http://localhost:7201/api/v1/query_range -d 'query=sum_over_time(test{type='\''3'\''}[1s])' -d start=1634799781.00 -d end=1634799781.99 -d step=1ms | jq .

In the result, only the datapoint with "value": 150 is returned by the query; the one with "value": 140 is gone.

# curl -X POST -G http://localhost:7201/api/v1/query_range -d 'query=sum_over_time(anomalies{type='\''3'\''}[1s])' -d start=1634799781.00 -d end=1634799781.99 -d step=1ms | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 11564    0 11564    0     0  1882k      0 --:--:-- --:--:-- --:--:-- 1882k
{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {
          "type": "3"
        },
        "values": [
          [
            1634799781.49,
            "150"
          ],
          [
            1634799781.491,
            "150"
          ],
...

What's the desired functionality here if your step size is 1ms?

@wesleyk Thanks for looking into this issue.

I used 1ms for the step size only for testing (I had wanted to find out the smallest time unit that can be used for the step).

In this case, I guess the step size doesn't matter? Below is another test case where I changed the step size to 1s.
I inserted two datapoints (with nanosecond timestamps) with values 140 and 150, and then ran sum_over_time. I expected a value of 290 in the query result, but got 150. So it looks like M3 keeps only the last datapoint in this case?

curl -X POST http://localhost:7201/api/v1/json/write -d '{
    "tags":
      {
        "__name__": "test",
        "type": "3"
      },
      "timestamp": "1637187142.017371443",
      "value": 140
    }'


curl -X POST http://localhost:7201/api/v1/json/write -d '{
    "tags":
      {
        "__name__": "test",
        "type": "3"
      },
      "timestamp": "1637187142.017371444",
      "value": 150
    }'



# curl -X POST -G http://localhost:7201/api/v1/query_range -d 'query=sum_over_time(test{type='\''3'\''}[60s])' -d start=1637187142 -d end=1637187143 -d step=1s | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   116  100   116    0     0  19333      0 --:--:-- --:--:-- --:--:-- 19333
{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {
          "type": "3"
        },
        "values": [
          [
            1637187143,
            "150"
          ]
        ]
      }
    ]
  }
}
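
The result of 150 instead of 290 is what one would see if datapoints were keyed by millisecond timestamp and a later write replaced an earlier one with the same key. The Python sketch below models that behavior with a plain dictionary. It is an assumption-based model of the observed output, not a description of how M3 actually stores data, and `to_millis` and `sum_with_ms_dedup` are hypothetical names introduced here.

```python
from decimal import Decimal


def to_millis(ts: str) -> int:
    """Truncate a decimal-seconds timestamp string to whole milliseconds."""
    return int(Decimal(ts) * 1000)


def sum_with_ms_dedup(points):
    """Sum values after keeping one value per millisecond (last write wins).

    Assumed model only: a dict keyed by the truncated timestamp, where a
    later write with the same key overwrites the earlier one.
    """
    stored = {}
    for ts, value in points:
        stored[to_millis(ts)] = value
    return sum(stored.values())


# The two writes from the second test case, 1 nanosecond apart.
points = [
    ("1637187142.017371443", 140),
    ("1637187142.017371444", 150),
]

print(sum_with_ms_dedup(points))           # 150, matching the query result
print(sum(value for _, value in points))   # 290, the expected result
```

Under this model the first write is overwritten before the query runs, so the window passed to sum_over_time has no effect on the outcome.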