m3db / m3

M3 monorepo - Distributed TSDB, Aggregator and Query Engine, Prometheus Sidecar, Graphite Compatible, Metrics Platform

Home Page: https://m3db.io/

Disconnected Traces between M3Query and M3DBNode

albertteoh opened this issue

We came across a situation where a single request to an M3 endpoint resulted in two traces being created when we expected a single trace. An example of the traces resulting from a request to the query_range endpoint is shown below; I've circled what I believe should be the continuation points of each trace:

Screen Shot 2021-01-28 at 8 19 53 pm
Screen Shot 2021-01-28 at 8 22 49 pm

I suspect the trace is broken because the trace context is lost; it is not passed down, and a new Context is created within an async operation on this line of code: https://github.com/m3db/m3/blob/master/src/dbnode/client/host_queue.go#L879
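
To illustrate the suspected failure mode, here's a minimal Go sketch (hypothetical function names, not the actual host_queue.go code): starting a span from a fresh context.Background() inside a goroutine severs the link to the caller's span, so Jaeger reports a second, disconnected trace, whereas passing the caller's ctx keeps everything in one trace.

package hostqueue

import (
    "context"
    "fmt"

    "github.com/opentracing/opentracing-go"
)

// drainAsyncBroken is a made-up stand-in for the async drain: the span is
// started from context.Background(), which carries no parent span, so the
// spans created below it form a brand-new trace.
func drainAsyncBroken(ops []string) {
    go func() {
        span, ctx := opentracing.StartSpanFromContext(context.Background(), "drain")
        defer span.Finish()
        process(ctx, ops)
    }()
}

// drainAsyncFixed passes the caller's ctx into the goroutine, so the "drain"
// span becomes a child of whatever span ctx already carries and both halves
// show up as a single trace.
func drainAsyncFixed(ctx context.Context, ops []string) {
    go func() {
        span, ctx := opentracing.StartSpanFromContext(ctx, "drain")
        defer span.Finish()
        process(ctx, ops)
    }()
}

func process(ctx context.Context, ops []string) {
    // This child span attaches to whatever span ctx carries; in the broken
    // variant above that is a fresh trace rather than the request's trace.
    span, _ := opentracing.StartSpanFromContext(ctx, "process")
    defer span.Finish()
    for _, op := range ops {
        fmt.Println("processing", op)
    }
}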

1. What service is experiencing the issue? (M3Coordinator, M3DB, M3Aggregator, etc)

This example is specific to M3Query -> M3DB, but could apply to other services.

2. What is the configuration of the service? Please include any YAML files, as well as namespace / placement configuration (with any sensitive information anonymized if necessary).

m3query.yml

listenAddress: 0.0.0.0:7202

tracing:
  backend: jaeger
  jaeger:
    sampler:
      type: remote

clusters:
  - namespaces:
      - namespace: default
        type: unaggregated
        retention: 48h
    client:
      config:
        service:
          env: default_env
          zone: embedded
          service: m3db
          cacheDir: /var/lib/m3kv
          etcdClusters:
            - zone: embedded
              endpoints:
                - 127.0.0.1:2379
metrics:
  scope:
    prefix: "query"
  prometheus:
    handlerPath: /metrics
    listenAddress: 0.0.0.0:7204 # until https://github.com/m3db/m3/issues/682 is resolved
  sanitization: prometheus
  samplingRate: 1.0
  extended: none

m3dbnode.yml

coordinator:
  tracing:
    backend: jaeger
    jaeger:
      sampler:
        type: const
        param: 1
db:
  tracing:
    backend: jaeger
    jaeger:
      sampler:
        type: remote

3. How are you using the service? For example, are you performing reads/writes to the service via Prometheus, or are you using a custom script?

Custom script to read and write.

write_sample_data.sh

curl -X POST http://localhost:7201/api/v1/json/write -d '{
  "tags":
    {
      "__name__": "third_avenue",
      "city": "boston",
      "checkout": "1"
    },
    "timestamp": '\"$(date "+%s")\"',
    "value": 3347.26
}'

# Insert tagged data
curl http://localhost:9003/writetagged -s -X POST -d '{
  "namespace": "default",
  "id": "foo",
  "tags": [
    {
      "name": "__name__",
      "value": "user_login"
    },
    {
      "name": "city",
      "value": "new_york"
    },
    {
      "name": "endpoint",
      "value": "/request"
    }
  ],
  "datapoint": {
    "timestamp":'"$(date +"%s")"',
    "value": 42.123456789
  }
}'

query_sample_data.sh

curl -X "POST" -G "http://localhost:7202/api/v1/query_range" \
  -d "query=third_avenue" \
  -d "start=$( date -v -45S +%s )" \
  -d "end=$( date +%s )" \
  -d "step=5s" | jq .

4. Is there a reliable way to reproduce the behavior? If so, please provide detailed instructions.
    1. Start Jaeger all-in-one. I personally ran each component from source as I wanted to filter out Node::health traces.

    2. Run m3dbnode:

      make m3dbnode
      
      sudo ./bin/m3dbnode -f m3dbnode.yml
      
    3. Run m3query:

      make m3query
      
      sudo ./bin/m3query -f m3query.yml
      
    4. Write some data:

      ./write_sample_data.sh
      
    5. Query the data:

      ./query_sample_data.sh
      
    6. Search for m3query traces in Jaeger: http://localhost:16686. Results in two traces for the single query:
      Screen Shot 2021-01-28 at 7 45 33 pm

    7. Drilling into each trace, we see that one trace is a continuation of the other as in the screenshots at the top of this Issue.

@arnikola -- any thoughts?

@gibbscullen this is a real pain point with an easy fix.
How can we make progress on it?

@nir-logzio -- we plan to look into this; however, feel free to make a contribution or suggestion in the meantime. We will be happy to review.

I actually had a quick go at it; the fix is conceptually simple: copy the original context over to each successive function call.

However, the line of code I highlighted in the description is fairly high up in the call stack, and there are a number of calls below it that also don't pass context. This led to a fan-out of changes, with more and more functions needing context passed in (and so on), and it turned into a bit of a mess, so I decided to discontinue my efforts at that point.
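
To give a feel for the plumbing involved, here's a rough before/after sketch with made-up names (this is not the real m3db call stack): each function between the request handler and the point where spans are created has to grow a ctx parameter, which is exactly the fan-out described above.

package hostqueue

import "context"

type op struct{ name string }

type queue struct{}

// Before: the context stops at drain, so anything further down the call
// stack has no parent span to attach to.
func (q *queue) drain(ops []op) {
    for _, o := range ops {
        q.execute(o)
    }
}

func (q *queue) execute(o op) {
    // ... no ctx available here
}

// After: ctx is accepted and forwarded. Every callee then needs the same
// first parameter, and so do its callees, hence the cascade of signature
// changes.
func (q *queue) drainWithContext(ctx context.Context, ops []op) {
    for _, o := range ops {
        q.executeWithContext(ctx, o)
    }
}

func (q *queue) executeWithContext(ctx context.Context, o op) {
    // ... ctx (and the span it carries) can now be handed further down
}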

It's quite likely I was going about it the wrong way. We'd be happy to contribute, but some guidance would be appreciated; especially on whether there is a better approach than my attempt described above.

@albertteoh - thanks for the update! Would be great if you were able to contribute - we'd be happy to review / provide guidance.

Thanks @gibbscullen, would anyone be able to provide guidance based on my approach above? i.e. was it the right way to go about fixing the problem or is there a better approach?

@albertteoh this PR propagates the context correctly, although I'm not sure if it will pass along the trace ID. Technically opentracing should do the right thing, but I have not tested it with the PR: #3125

Thanks @robskillington, I believe opentracing should do the right thing. Looking forward to the PR being merged and released for us to try out!
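
For reference, a minimal sketch (hypothetical operation names, assuming the opentracing-go API) of why propagating the context should be sufficient: StartSpanFromContext derives the new span from the span already stored in the context, so the child keeps the same trace ID and the Jaeger UI groups the spans into one trace.

package tracing

import (
    "context"

    "github.com/opentracing/opentracing-go"
)

// childOf starts a span as a child of whatever span ctx carries; same trace
// ID, new span ID.
func childOf(ctx context.Context) {
    span, ctx := opentracing.StartSpanFromContext(ctx, "dbnode.fetch")
    defer span.Finish()

    // Any callee that receives this ctx keeps extending the same trace.
    inspect(ctx)
}

func inspect(ctx context.Context) {
    if parent := opentracing.SpanFromContext(ctx); parent != nil {
        // parent.Context() is the SpanContext carrying the trace ID.
        _ = parent.Context()
    }
}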

Closing since #3125 has been merged.