inception-health / otel-export-trace-action

Propagate workflow and job attributes into child spans

juliahw opened this issue

Hello and thank you for maintaining this project! I'm using Honeycomb to store and visualize traces. Honeycomb's query model works by returning events that match the given search attributes, and all of those attributes must match on a single event. That is, there is no concept of searching across all the events in a trace.

Currently, workflow and job attributes are missing from lower-level spans. Thus I'm unable to answer questions such as:

  • In my ci workflow, how often does my job named test fail on main? (Not answerable because github.head_ref is only available on workflow-level spans)
  • In my ci workflow, how often does my job named build fail due to a step named install-dependencies? (Not answerable because github.job.name is only available on job-level spans)
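To make the limitation concrete, here is a minimal sketch of a per-event query model like the one described above. It is not Honeycomb's implementation; `SpanEvent` and `queryEvents` are hypothetical names used only for illustration, and the trace contains just two hand-written events.

```typescript
// Hypothetical model: each span is stored as one flat event.
type Attributes = Record<string, string>;

interface SpanEvent {
  name: string;
  attributes: Attributes;
}

// Return only the events on which EVERY filter attribute matches.
// There is no join across events that belong to the same trace.
function queryEvents(events: SpanEvent[], filter: Attributes): SpanEvent[] {
  return events.filter((event) =>
    Object.entries(filter).every(
      ([key, value]) => event.attributes[key] === value
    )
  );
}

// The workflow-level span carries github.head_ref,
// the job-level span carries github.job.name.
const trace: SpanEvent[] = [
  { name: "ci", attributes: { "github.head_ref": "main" } },
  { name: "test", attributes: { "github.job.name": "test" } },
];

const filter: Attributes = {
  "github.head_ref": "main",
  "github.job.name": "test",
};

// No single event has both attributes, so nothing matches.
const matches = queryEvents(trace, filter);

// If the job-level event also carried github.head_ref, it would match.
const widenedTrace: SpanEvent[] = [
  trace[0],
  {
    name: "test",
    attributes: { "github.head_ref": "main", "github.job.name": "test" },
  },
];
const widenedMatches = queryEvents(widenedTrace, filter);

console.log(matches.length, widenedMatches.length); // 0 1
```

The second query only succeeds because the job-level event was widened with the workflow-level attribute, which is the change this issue asks for.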

Are you open to a PR that would add these attributes? I am imagining some relatively simple modifications to traceWorkflowRunJob and traceWorkflowRunStep!

I definitely empathize with this pain. With OpenTelemetry, the only attributes that propagate to children are resource attributes, so I'm hesitant to manually keep track of parent attributes as a convenience for querying. I feel like this goes against the otel spec. If there are resource attributes that are missing, I welcome a PR to add those. I also welcome being shown that my understanding of otel is flawed! And I welcome brainstorming on what might be the best way to solve the problem, as I also use Honeycomb.

Here is what I propose as a compromise for the core problem you face: we pass down important parent identifiers and include those as attributes on the children, but we don't include all of the parent attributes.

  • workflow id/name
  • job id/name
  • step id/name

Anything else, I think, starts to fight against the otel spec. For example, we can't pass these attributes across services; we can only pass the parent trace. So the spec puts the burden of advanced querying on the vendor, and unfortunately Honeycomb doesn't support joining spans.
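A rough sketch of what that compromise could look like: only an explicit list of identifier attributes is copied from a parent span's attributes to a child's. This is not the action's actual code. `inheritIdentifiers`, `INHERITED_KEYS`, and every attribute key below except `github.job.name` are assumptions made for illustration; the real attribute names in the action may differ.

```typescript
type Attributes = Record<string, string>;

// Hypothetical allowlist of identifier keys to pass down.
// The action's real attribute names may differ.
const INHERITED_KEYS: readonly string[] = [
  "github.workflow_id",
  "github.workflow",
  "github.job.id",
  "github.job.name",
];

// Copy only the allowlisted keys from the parent, then apply the
// child's own attributes so the child always wins on a conflict.
function inheritIdentifiers(
  parent: Attributes,
  child: Attributes,
  keys: readonly string[] = INHERITED_KEYS
): Attributes {
  const inherited: Attributes = {};
  for (const key of keys) {
    const value = parent[key];
    if (value !== undefined) {
      inherited[key] = value;
    }
  }
  return { ...inherited, ...child };
}

// Workflow -> job -> step, each level inheriting from the one above.
const workflowAttributes: Attributes = {
  "github.workflow": "ci",
  "github.head_ref": "main", // not in the allowlist, so not passed down
};

const jobAttributes = inheritIdentifiers(workflowAttributes, {
  "github.job.name": "build",
});

const stepAttributes = inheritIdentifiers(jobAttributes, {
  "github.job.step.name": "install-dependencies", // hypothetical key
});

console.log(stepAttributes);
```

With this shape, the step-level event carries the workflow and job names, so a query for a `build` job failing in an `install-dependencies` step can match on a single event. Anything outside the allowlist, such as `github.head_ref` here, stays on the parent until it is added to the list.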

Apologies for the delayed response! I want to make sure I'm understanding correctly. Is it an anti-pattern for a span to contain information passed from its parents? If so, I'm fuzzy on what the advantage is of limiting the amount of information we place in a span. Honeycomb's philosophy (and perhaps this explains their query model) seems to be that the wider the telemetry data, the better.

Separately, one of my colleagues mentioned the Baggage API to me. Does that seem like the golden path for what I'm trying to accomplish?

I haven't seen the Baggage API. Looks interesting! Briefly skimming it, I think the spirit of my comment still applies: what is important context that we want to associate with children, and what is specific to the parent? That is my primary point of feedback. I wouldn't want to accidentally pass down key parent-identifying attributes that would help me tell the parent apart from the child. So let's focus on the specific attributes you need to pass down, and then we can add to the list iteratively as we find a need.