Netflix / metaflow

:rocket: Build and manage real-life ML, AI, and data science projects with ease!

Home Page: https://metaflow.org


OTEL tracing auth configuration problems

wrighting opened this issue · comments

If METAFLOW_OTEL_ENDPOINT is set and neither METAFLOW_SERVICE_AUTH_KEY nor METAFLOW_SERVICE_HEADERS is set, you get the warning "WARNING: no auth settings for Opentelemetry", followed by a stream of errors as execution continues:

self.span_exporter.shutdown()
AttributeError: 'NoneType' object has no attribute 'shutdown'

We use METAFLOW_SERVICE_AUTH_KEY to connect to the Metaflow service, so it is set whenever we use the service, but it doesn't apply to the OTEL endpoint.
(At the moment we don't have auth on OTEL, but if we add it, it will use a different key.)

Yeah, I think the configuration would benefit from a separate header variable (equivalent to OTEL_EXPORTER_OTLP_HEADERS).
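To make the suggestion concrete, here is a rough sketch of how a dedicated header variable could be parsed, assuming it follows the same comma-separated key=value format as OTEL_EXPORTER_OTLP_HEADERS. The variable name METAFLOW_OTEL_HEADERS and both helper functions are hypothetical; nothing like this exists in Metaflow as far as this thread shows.

```python
import os


def parse_otlp_headers(raw):
    """Parse a 'k1=v1,k2=v2' string into a dict, skipping malformed entries."""
    headers = {}
    for pair in raw.split(","):
        # partition() splits on the first '=' only, so values may contain '='.
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            headers[key.strip()] = value.strip()
    return headers


def otel_headers_from_env(environ=os.environ):
    # METAFLOW_OTEL_HEADERS is a hypothetical variable name used for illustration.
    return parse_otlp_headers(environ.get("METAFLOW_OTEL_HEADERS", ""))
```

With something like this, an unset or empty variable yields an empty dict, which could be treated as "no auth" for the OTEL endpoint independently of the service auth key.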

I think we've settled on monkey-patching metaflow.tracing.tracing_modules.init_tracing() so that it bypasses the TracerProvider setup and just uses our own shared library code to configure that.

Full trace on the error spam:

Exception while exporting Span batch.
Traceback (most recent call last):
  File "/app/lib/python3.10/site-packages/opentelemetry/sdk/trace/export/__init__.py", line 367, in _export_batch
    self.span_exporter.export(self.spans_list[:idx])  # type: ignore
AttributeError: 'NoneType' object has no attribute 'export'
Exception ignored in atexit callback: <bound method TracerProvider.shutdown of <opentelemetry.sdk.trace.TracerProvider object at 0x7f37866ba5f0>>
Traceback (most recent call last):
  File "/app/lib/python3.10/site-packages/opentelemetry/sdk/trace/__init__.py", line 1248, in shutdown
    self._active_span_processor.shutdown()
  File "/app/lib/python3.10/site-packages/opentelemetry/sdk/trace/__init__.py", line 173, in shutdown
    sp.shutdown()
  File "/app/lib/python3.10/site-packages/opentelemetry/sdk/trace/export/__init__.py", line 412, in shutdown
    self.span_exporter.shutdown()
AttributeError: 'NoneType' object has no attribute 'shutdown'

I think the problem is that set_otel_exporter() returns None here

    else:
        print("WARNING: no auth settings for Opentelemetry", file=sys.stderr)
        return

which leads to None being passed here
span_processor = BatchSpanProcessor(span_exporter)
tracer_provider.add_span_processor(span_processor)
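One possible guard, sketched below, is to skip the span processor entirely when no exporter was created, so that None never reaches BatchSpanProcessor. The helper name add_exporter_if_configured and its processor_factory parameter are hypothetical and only illustrate the check; this is not the actual Metaflow fix.

```python
import sys


def add_exporter_if_configured(tracer_provider, span_exporter, processor_factory):
    """Attach a span processor only when an exporter exists.

    Returns True if a processor was added, False if export was skipped.
    """
    if span_exporter is None:
        # Hypothetical message; without an exporter there is nothing to export to.
        print("WARNING: no OTEL exporter configured, span export disabled",
              file=sys.stderr)
        return False
    tracer_provider.add_span_processor(processor_factory(span_exporter))
    return True
```

In the code quoted above, this would be called with BatchSpanProcessor as processor_factory and the result of set_otel_exporter() as span_exporter. Tracing calls would then still work locally, but nothing would be exported and the atexit shutdown would have no None exporter to trip over.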