opensearch-project / data-prepper

Data Prepper is a component of the OpenSearch project that accepts, filters, transforms, enriches, and routes data at scale.

Home Page: https://opensearch.org/docs/latest/clients/data-prepper/index/

Allow trace group for partial traces

KarstenSchnitter opened this issue

Is your feature request related to a problem? Please describe.
DataPrepper aggregates all spans with the same trace id. If it encounters a span with a null parent span id, it assigns that span's name as the trace group to all spans of the trace. This allows classification in the OpenSearch Dashboards observability plugin. However, this approach fails if the global parent span does not arrive in time or does not arrive at all. Consider the following situation:

[Figure: otel-partial-trace (draw.io diagram)]

In this picture, DataPrepper receives all coloured spans. It does not receive the gray spans. This might be because they are created in another system outside the reach of the observability infrastructure the DataPrepper instance belongs to. For example, the gray spans may come from another vendor or a client system, while the coloured spans are generated within a SaaS solution.

Currently, DataPrepper will not create a trace group entry for the spans, since the global trace parent is never received.

Describe the solution you'd like
It would be great if, in that case, DataPrepper followed the chain of parent span ids until it can no longer resolve the parent. If this leads to a unique span, that span should be used as the trace parent instead of the original global trace parent.

The picture shows a conflict situation, where no unique parent can be determined. In that case, no trace group should be issued, keeping the current behaviour.
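
To make the requested behaviour concrete, here is a minimal, language-neutral sketch in Python (Data Prepper itself is written in Java). The `Span` class and the `resolve_trace_group` function are hypothetical names used only for illustration and are not part of Data Prepper. The sketch assumes all received spans of one trace are available at once and that parent links contain no cycles, so "following the parent span ids until the parent can no longer be resolved" is equivalent to finding the spans whose parent was never received.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Span:
    # Hypothetical minimal span model; real spans carry many more fields.
    span_id: str
    parent_span_id: Optional[str]  # None (or empty) for the global root span
    name: str


def resolve_trace_group(spans: List[Span]) -> Optional[str]:
    """Return the trace group name for the spans of one trace, or None."""
    # Current behaviour: a span without a parent span id is the global root,
    # and its name becomes the trace group.
    for span in spans:
        if not span.parent_span_id:
            return span.name

    # Proposed fallback: collect the spans whose parent was never received.
    # Every parent chain among the received spans ends at one of these.
    known_ids = {span.span_id for span in spans}
    local_roots = [s for s in spans if s.parent_span_id not in known_ids]

    # Only a unique local root may stand in for the missing global root.
    if len(local_roots) == 1:
        return local_roots[0].name

    # Conflict (several unresolved parents): keep the current behaviour
    # and issue no trace group.
    return None


# Partial trace: parent "a" was never received, "b" is the unique local root.
partial = [Span("b", "a", "checkout"), Span("c", "b", "db-query")]
# Conflict: two spans with different unresolved parents.
conflict = [Span("b", "a", "checkout"), Span("d", "x", "payment")]

print(resolve_trace_group(partial))   # checkout
print(resolve_trace_group(conflict))  # None
```

A real implementation would also have to decide when a trace counts as complete enough to apply the fallback, since the global parent span may simply arrive late.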

Additional context
For the implementation, this feature could be an option in the OTelTraceRawProcessor, where the detection of the parent span would need to be changed.

Alternatively, in the OTelTraceGroupProcessor the search query could be changed.

It would also be possible to create a new processor, or a new action in the aggregate processor, that fills in empty trace groups where possible.