elastic / elastic-otel-dotnet

Elastic OpenTelemetry .NET Distribution

How to use elastic features while still using "vanilla" otel?

JanKnipp opened this issue

Hi,

we are long-time Elastic users but decided a while ago that auto-instrumentation and the Elastic APM NuGet packages were not the best option for us, so we switched fully to OTel and accepted the shortcomings (e.g. metrics in Kibana APM).
We have some special configuration going on in OTel, so we would still like to use the "vanilla" configuration approach while also getting the better integration into Elastic that this package may provide.
Is there a way (apart from manually checking the code) to integrate your processors, etc. into the OTel configuration, or a chance that in the future there might be an extension method that provides this?

something like

services.AddOpenTelemetry()
    .WithTracing(builder =>
    {
        ...
        builder.AddElasticTracingExtensions();
        ...
    })
    .WithMetrics(builder =>
    {
        ...
        builder.AddElasticMetricsExtensions();
        ...
    })
    .WithLogging(....);

This would allow us to keep doing the setup on our own (configuring instrumentation, exporters, etc.) while still getting some of the goodies that you are providing beyond the "easy" installation (which is of course great for some users, and I really like the way this is going regarding the use of OTel vs. Elastic APM).

Regards,
Jan

Hi, @JanKnipp.

Thanks for raising this great question, and we're also pleased to hear that you like the way our efforts are being directed!

We've been thinking about this, and we have not yet settled on a final decision.

One option is that we ship things such as processors in their own package, with Elastic.OpenTelemetry pulling those in. This would allow processors to be referenced without the global override of AddOpenTelemetry being included. The risk is that we don't yet know the full scope of everything we may need to ship in this fashion. It could lead to naming challenges for packages (hurting discoverability) or simply a package explosion. The user story of referencing a single package (as we have right now) is appealing.

A second option we've been considering is having an extra option on ElasticOpenTelemetryOptions, something like SkipElasticDefaults or UseVanillaSdk. When enabled, we'd skip any customisations so that calling our version of AddOpenTelemetry behaves just as it does right now with the OTel SDK. Specific processors, etc., could then still be added manually, offering complete control over how the SDK is configured.

I have my own thoughts on what I prefer, but we'd love to get your feedback. We'd also like to understand your "special configurations" and how the current design prevents them. Are you able to share anything concrete? This would be valuable to guide our design decisions going forward. We'd hoped consumers would be able to configure the OTel components without our additions behind the scenes causing any impedance.

In the meantime, working around the lack of better options is possible by directly calling the original "vanilla" OpenTelemetry AddOpenTelemetry method. This would register the OTel SDK without any of our additions. Those can then be selectively reintroduced. For example, the tracer processors can be added by calling our public extension method.

Microsoft.Extensions.DependencyInjection.OpenTelemetryServicesExtensions.AddOpenTelemetry(builder.Services)
	.WithTracing(t => t
		.AddSource(Api.ActivitySourceName)
		.AddElasticProcessors());

Hi @stevejgordon,

we're actually not that special, but there are some things in our configuration that are a bit different. First of all, we currently do not use the logging parts of OTel. Things might have changed by now, but last time I checked the logging integration was not as good as using Serilog with the OpenTelemetry sink (and we're also logging to file for compliance reasons).
In OTel we have configured a few resource detectors, and since some of them overwrite e.g. service.name, we have defined the order in which the ResourceBuilder is assembled to follow our guidelines (first use a generic assembly name, then apply all the detectors, then the custom attributes, and, as the last step, if an environment variable is set, take that one).
There are certain sources which we do not want to see, so they are not registered (i.e. we do not use a simple "*" when adding sources). For example, certain traces are enabled within MassTransit, and they already do a nice job of instrumenting something like Service Bus. So for this scenario we do not want the traces provided by Azure Service Bus itself, since they would create too much noise, but for other Azure resources we still need them.
Trace-wise, we've added a Quartz instrumentation and the "standard" ones. For metrics we have some additional ones like EventCounters, Runtime, Process, ... and of course the ones that are now already present in multiple libraries without any additional OTel instrumentation packages.
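
For readers following along, here is a minimal sketch of the kind of "vanilla" setup described above. It is not the poster's actual configuration. It assumes the OpenTelemetry.Extensions.Hosting package, assumes that resource entries added later take precedence on key collisions, and uses placeholder names: "MyCompany.Orders" and the attribute value are hypothetical, and "MassTransit" is assumed to be the name of MassTransit's activity source.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.DependencyInjection;
using OpenTelemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

var services = new ServiceCollection();

services.AddOpenTelemetry()
    .ConfigureResource(resource => resource
        // 1. Generic fallback service name (here: the app domain's friendly name).
        .AddService(serviceName: AppDomain.CurrentDomain.FriendlyName)
        // 2. Any resource detectors would be added here via AddDetector(...).
        // 3. Custom attributes (hypothetical key/value for illustration).
        .AddAttributes(new Dictionary<string, object>
        {
            ["deployment.environment"] = "staging",
        })
        // 4. OTEL_SERVICE_NAME / OTEL_RESOURCE_ATTRIBUTES added last,
        //    on the assumption that later entries win on collisions.
        .AddEnvironmentVariableDetector())
    .WithTracing(tracing => tracing
        // Sources are listed explicitly instead of using "*", so unwanted
        // sources (e.g. a noisy SDK source) are simply never registered.
        .AddSource("MassTransit")          // assumed source name
        .AddSource("MyCompany.Orders"))    // hypothetical application source
    .WithMetrics(metrics => metrics
        .AddMeter("MyCompany.Orders"));    // hypothetical application meter
```

Instrumentation packages (Quartz, Runtime, Process, EventCounters) and exporters would be added to the same builders through their own extension methods.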

So from my point of view it would be great to keep all this exactly as it is and to stay as close to native OTel as possible while being as compatible with Elastic as possible. We'd rather skip all the Elastic goodies than switch to something like an Elastic wrapper around OTel. I can see why the wrapper is a very good option for a lot of users, or for people who switch over from the Elastic APM NuGet packages.

I have no problem with adding e.g. the Elastic processors via an AddElasticProcessors extension. It would be great if this were documented somewhere (once the complexity gets higher than this).

Rereading the documentation, I'm no longer sure whether I misunderstood the ability to customize the Elastic-provided OTel packages, and whether all of this would already be possible.

@JanKnipp I've refined our configuration story and included some documentation around its use. In short, there's a new option to selectively opt into our defaults (or disable them for all signals). Does that provide enough control for your scenario?

This will make it into the next alpha (hopefully quite soon).

Regarding:

Things might have changed by now, but last time I checked the logging integration was not as good as using Serilog with the OpenTelemetry sink (and we're also logging to file for compliance reasons).

Do you recall what you felt wasn't as good in this setup? I'd be curious to see if that's something that can be addressed upstream if it's still an issue.