open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java

Home Page:https://opentelemetry.io

opentelemetry-java-agent doesn't collect jvm metrics

fabjesus opened this issue

Describe the bug

I'm having trouble collecting JVM metrics from Trino with the OpenTelemetry Java Instrumentation agent and the OpenTelemetry Collector Contrib. The agent is attached to Trino, and the collector is configured to receive Trino's JVM metrics and export them to Prometheus, but I'm not able to see any metrics.

I would like help understanding why my configuration rules are not being applied to Trino's JVM metrics, and how to resolve this so that the metrics are collected and exported as expected.

Thanks in advance.

Steps to reproduce

-javaagent:/path/to/opentelemetry-javaagent.jar -Dotel.jmx.config=/etc/trino/rules.yaml -Dotel.jmx.target.system=jvm -Dotel.javaagent.enabled=true

defined environment variables:

OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
OTEL_EXPORTER_OTLP_METRICS_PROTOCOL=grpc
OTEL_LOGS_EXPORTER=none
OTEL_TRACES_EXPORTER=none
OTEL_METRICS_EXPORTER=otlp
OTEL_JAVAAGENT_DEBUG=true

rules.yaml

---
rules:
  - bean: trino.memory:name=general,type=memorypool
    mapping:
      freebytes:
        metric: otelcol_trino_memorypool_freebytes
        type: gauge
        desc: Pool memory freebytes
        unit: By

otel-collector:

receivers:
  otlp: #ingests OTLP formatted data from an app/system or another OTel collector
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

  hostmetrics: # collects host metrics from the specified categories
    collection_interval: 1m
    scrapers:
      cpu:
      load:
      memory:
      disk:
      filesystem:
      network:
      paging:

  # Collect own metrics
  prometheus: #ingests metrics in Prometheus format -- pre-configured to scrape the collector’s Prometheus endpoint
    config:
      scrape_configs:
        - job_name: "otel-collector"
          scrape_interval: 10s
          static_configs:
            - targets: ["0.0.0.0:8888"]
processors:
  batch: # transmits telemetry data in batches, instead of streaming each data point or event.

exporters:
  prometheus:
    endpoint: "0.0.0.0:9275"
    namespace: otelcol
    send_timestamps: true
    enable_open_metrics: true

  logging:
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [otlp, prometheus, hostmetrics]
      processors: [batch]
      exporters: [logging, prometheus]

Call the Prometheus endpoint of my OTel Collector.

Expected behavior

Collect Trino's JVM metrics

Actual behavior

Trino's JVM metrics are missing. The configuration rule that maps the freebytes attribute of bean trino.memory:name=general,type=memorypool to otelcol_trino_memorypool_freebytes doesn't work: I can't see the metric otelcol_trino_memorypool_freebytes when calling the Prometheus endpoint of my OTel Collector.
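When checking the Prometheus endpoint, the exposed name may differ from the name in rules.yaml: the exporter's `namespace: otelcol` setting may add a prefix, and the unit may add a suffix. Searching for a fragment of the name is more forgiving than searching for the exact name. A minimal Java sketch of such a search over exposition text that has already been fetched; the class name `PrometheusMetricFinder` and the sample lines in `main` are hypothetical, not output captured from this setup.

```java
import java.util.List;
import java.util.stream.Collectors;

final class PrometheusMetricFinder {

    // Returns the distinct metric names in Prometheus exposition text
    // that contain the given fragment. Comment lines (# HELP, # TYPE,
    // # EOF) and blank lines are skipped.
    static List<String> metricNamesContaining(String expositionText, String fragment) {
        return expositionText.lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty() && !line.startsWith("#"))
                // The metric name ends at the first '{' (labels) or whitespace (value).
                .map(line -> line.split("[{\\s]", 2)[0])
                .filter(name -> name.contains(fragment))
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample; paste the real endpoint response here instead.
        String sample = String.join("\n",
                "# TYPE example_memory_usage gauge",
                "example_memory_usage{state=\"used\"} 1024",
                "example_trino_memorypool_freebytes_bytes 2727346112");
        System.out.println(metricNamesContaining(sample, "trino_memorypool_freebytes"));
    }
}
```

If the search returns nothing for any fragment of the name, the metric most likely never reached the collector, which points at the agent side rather than the exporter.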

Javaagent or library instrumentation version

v2.3.0

Environment

JDK: openjdk 11.0.22
OS: Debian 10

Additional context

Using the JMX connector on Trino, I can query Java Management Extensions (JMX) data, and I can see the desired metric:

./trino-cli-436-executable.jar --server trino-otel.xpto.com:8080

trino> select * from jmx.current."trino.memory:name=general,type=memorypool";
 freebytes  |  maxbytes  | reservedbytes | reservedrevocablebytes |                 node                 |                object_name
------------+------------+---------------+------------------------+--------------------------------------+-------------------------------------------
 2727346112 | 2727346176 |            64 |                      0 | e48adfd0-14ca-5e7d-897e-9968b7ae72eb | trino.memory:type=MemoryPool,name=general
(1 row)
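Note that the column names in this output are lowercase, while object_name shows type=MemoryPool in mixed case, so the SQL column names may not preserve the case of the underlying JMX attribute names. A minimal Java sketch, assuming it runs inside the target JVM (as the agent does), that lists an MBean's attribute names exactly as registered. It uses the standard java.lang:type=Memory bean as a stand-in, because the trino.memory beans exist only inside a Trino process; `MBeanAttributeLister` is a hypothetical name.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.JMException;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class MBeanAttributeLister {

    // Lists the attribute names of one MBean on the platform MBeanServer,
    // with the exact letter case under which they were registered.
    static List<String> attributeNames(String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            List<String> names = new ArrayList<>();
            for (MBeanAttributeInfo info
                    : server.getMBeanInfo(new ObjectName(objectName)).getAttributes()) {
                names.add(info.getName());
            }
            return names;
        } catch (JMException e) {
            // Malformed name, unknown bean, or introspection failure.
            throw new IllegalArgumentException("Cannot inspect " + objectName, e);
        }
    }

    public static void main(String[] args) {
        // Stand-in bean that every JVM registers.
        System.out.println(attributeNames("java.lang:type=Memory"));
    }
}
```

The same listing is visible in jconsole's MBeans tab, which is often the quicker way to confirm the spelling to use in a rules file.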


Please provide instructions on how to run Trino in a way that would allow connecting jconsole.

@laurit thank you for your response.

To enable JConsole connection with Trino, you can follow the instructions provided in the official Trino documentation at the following link: Trino JMX Configuration.

Additionally, I've configured my Trino instance to allow JMX connection by adding the necessary JMX connector configuration to the Trino configuration file. Here is a snippet of my configuration:

jmx:
    jmx.rmiregistry.port=7199
    jmx.rmiserver.port=7199

Is this enough, or do you need to see the entire configuration file?

Try:

---
rules:
  - bean: trino.memory:name=general,type=MemoryPool
    mapping:
      FreeBytes:
        metric: otelcol_trino_memorypool_freebytes
        type: gauge
        desc: Pool memory freebytes
        unit: By
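These rules differ from the original ones only in letter case (type=MemoryPool and FreeBytes). JMX object names are case sensitive in their key values, while the order of the keys does not matter. A minimal Java sketch of that behavior using the standard `javax.management.ObjectName` class; it assumes the agent resolves the `bean` value as an ordinary ObjectName, and `ObjectNameCaseCheck` is a hypothetical name.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class ObjectNameCaseCheck {

    // Returns true when both strings denote the same MBean name.
    // ObjectName equality compares canonical forms, so key order is
    // ignored but letter case is not.
    static boolean sameBean(String first, String second) {
        try {
            return new ObjectName(first).equals(new ObjectName(second));
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Name as reported in the object_name column of the JMX query.
        String registered = "trino.memory:type=MemoryPool,name=general";

        // Same keys in a different order: prints true.
        System.out.println(sameBean("trino.memory:name=general,type=MemoryPool", registered));

        // Lowercase type value, as in the original rules.yaml: prints false.
        System.out.println(sameBean("trino.memory:name=general,type=memorypool", registered));
    }
}
```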

Hi @laurit ,

After implementing the requested change, the problem persists. Could there be a misconfiguration on the otel-collector side? I've checked the log file, but I can't find anything relevant.

Thanks

@fabjesus look at the output from the logging exporter. It should contain something like:

ScopeMetrics #3
ScopeMetrics SchemaURL:
InstrumentationScope io.opentelemetry.jmx
Metric #0
Descriptor:
     -> Name: otelcol_trino_memorypool_freebytes
     -> Description: Pool memory freebytes
     -> Unit: By
     -> DataType: Gauge
NumberDataPoints #0
StartTimestamp: 2024-04-21 10:06:08.971527 +0000 UTC
Timestamp: 2024-04-21 10:07:32.427186 +0000 UTC
Value: 6012954214

@laurit I can't see any output for the InstrumentationScope io.opentelemetry.jmx in my logging exporter.

@fabjesus please provide detailed instructions on how this could be reproduced, starting from downloading Trino. For example: download https://repo1.maven.org/maven2/io/trino/trino-server/445/trino-server-445.tar.gz, create configuration file etc/config.properties with content ..., launch with command ..., etc.

This has been automatically marked as stale because it has been marked as needing author feedback and has not had any activity for 7 days. It will be closed automatically if there is no response from the author within 7 additional days from this comment.

Hi @laurit ,

For now, my team has decided not to use OpenTelemetry. Thanks for your support. 🙏