Opentelemetry Kafka consumer doesnt inject context in headers
kokikathir opened this issue · comments
Describe the bug
In Opentracing, In consumer interceptor, consumer span context is injected into the header.
https://github.com/opentracing-contrib/java-kafka-client/blob/master/opentracing-kafka-client/src/main/java/io/opentracing/contrib/kafka/TracingKafkaUtils.java#L138
But with Opentelemetry, it is not injected.
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/kafka/kafka-clients/kafka-clients-2.6/library/src/main/java/io/opentelemetry/instrumentation/kafkaclients/v2_6/KafkaTelemetry.java#L292
Steps to reproduce
Please check the above code
Expected behavior
Span context to be injected in headers
Actual behavior
Span context not being injected in headers
Javaagent or library instrumentation version
main
Environment
JDK:
OS:
Additional context
No response
You could extract the context from the original context propagation header.
Context context = openTelemetry.getPropagators().getTextMapPropagator().extract(
Context.root(), record, new TextMapGetter<ConsumerRecord<?, ?>>() {
@Override
public Iterable<String> keys(ConsumerRecord<?, ?> consumerRecord) {
return StreamSupport.stream(consumerRecord.headers().spliterator(), false)
.map(Header::key)
.collect(Collectors.toList());
}
@Nullable
@Override
public String get(@Nullable ConsumerRecord<?, ?> consumerRecord, String key) {
Header header = consumerRecord.headers().lastHeader(key);
if (header == null) {
return null;
}
byte[] value = header.value();
if (value == null) {
return null;
}
return new String(value, StandardCharsets.UTF_8); }
});
Span span = Span.fromContext(context);
if (span.getSpanContext().isValid()) {
System.err.println(span.getSpanContext().getTraceId() + " " + span.getSpanContext().getSpanId());
}
when using java agent you can replace openTelemetry
with GlobalOpenTelemetry.get()
. Would that work for you?
I also see issues when trying to migrate from opentracing to opentelemetry for projects that use reactor-kafka.
- seems related to #9771
This issue seems to arise because in reactor-kafka the KafkaReceiver.receive (or ReactiveKafkaConsumerTemplate) returns a Flux
(meaning there is a single reactor context for the entire flux - not per record). Additionally with project reactor, code execution may use different schedulers and threads (so the active span from the thread context is not always accurate). This means if I have code like
kafkaReceiver.receive()
.flatMap(consumerRecord -> {
// some code that instruments or generates a nested span
})
Then for the nested span, the span information in the reactor context or the active span (in the thread context) may not be accurately referring to the consumer span
- the only way to access (as far as I can tell) the consumer span is from the headers
This works for opentracing, because the interceptor writes the consumer span into the headers.
But for opentelemetry, the consumer span is not written to the headers.
- Currently for open telemetry, if we try to read the parent span from the headers, we get the original producer span that published the message to the kafka topic (not the consumer span)
As an example, with open tracing if one application publishes to a topic, and another application consumes and then has some nested operation (where the nested operation could be a custom application span, a downstream rest service call, a publish to another kafka topic, ...) then I would expect a trace similar to
To_topic (producer span for trace 1)
└─ From_topic (consumer span for trace 1)
└─ nested (nested application span for trace 1)
- This is currently possible by using extractSpanContext to extract the consumer span to use as the parent of the nested span
For a similar example for open tracing with 2 traces, I would expect traces similar to
topic send (producer span for trace 1)
└─ topic process (consumer span for trace 1)
└─ nested (nested application span for trace 1)
topic send (producer span for trace 2)
└─ topic process (consumer span for trace 2)
└─ nested (nested application span for trace 2)
But if I use the active span as the parent of the nested spans, I can end up with spans referring to parent spans for another trace
topic send (producer span for trace 1)
└─ topic process (consumer span for trace 1)
└─ nested (nested application span for trace 1)
└─ nested (nested application span for trace 2)
topic send (producer span for trace 2)
└─ topic process (consumer span for trace 2)
And if I use the headers to determine the parent of the nested spans, the parent then refers to the producer span (since the consumer never writes to the header)
topic send (producer span for trace 1)
├─ topic process (consumer span for trace 1)
└─ nested (nested application span for trace 1)
topic send (producer span for trace 2)
├─ topic process (consumer span for trace 2)
└─ nested (nested application span for trace 2)
Is there another way to achieve the desired outcome of the following, or is an enhancement needed to opentelemetry?
topic send (producer span for trace 1)
└─ topic process (consumer span for trace 1)
└─ nested (nested application span for trace 1)
topic send (producer span for trace 2)
└─ topic process (consumer span for trace 2)
└─ nested (nested application span for trace 2)
This has been automatically marked as stale because it has been marked as needing author feedback and has not had any activity for 7 days. It will be closed automatically if there is no response from the author within 7 additional days from this comment.
@bnigh thanks now I understand that the ask is to add the consumer span (not the producer) info into the header. If I understand correctly opentracing seems to overwrite the tracing headers that originally had producer span so that they contain the consumer span. I think we probably shouldn't do that, but we could write the tracing info for the consumer span in w3c format to some other header like otel-consumer
. Need to consider whether this should also be done in other messaging instrumentations.
with project reactor, code execution may use different schedulers and threads (so the active span from the thread context is not always accurate)
the java agent does try to automatically propagate context for project reactor code, can you narrow down more specifically where the context propagation is failing?
@laurit
That is correct. Open tracing is appending the tracing header, but open telemetry does not.
the java agent does try to automatically propagate context for project reactor code, can you narrow down more specifically where the context propagation is failing?
Please see this code from my previous comment where we use KafkaReceiver from reactor kafka
This issue seems to arise because in reactor-kafka the KafkaReceiver.receive (or ReactiveKafkaConsumerTemplate) returns a
Flux
(meaning there is a single reactor context for the entire flux - not per record). Additionally with project reactor, code execution may use different schedulers and threads (so the active span from the thread context is not always accurate). This means if I have code like
kafkaReceiver.receive() .flatMap(consumerRecord -> { // (a) some code that instruments or generates a nested span })
- Within this example, the correct consumer span is not accessible at
(a)
from the active context nor the reactive context - I am not familiar with how the open telemetry library propagates the context so cannot narrow down more than that without digging into open telemetry library code
I believe a full reproducible example was shared in #9771