openobserve / openobserve

🚀 10x easier, 🚀 140x lower storage cost, 🚀 high performance, 🚀 petabyte scale - Elasticsearch/Splunk/Datadog alternative for 🚀 (logs, metrics, traces, RUM, Error tracking, Session replay).

Home Page: https://openobserve.ai

Improve compactor memory usage

TessaIO opened this issue

Which OpenObserve functionalities are relevant/related to the feature request?

Compactor

Description

We're setting ZO_COMPACT_MAX_FILE_SIZE=256 and ZO_FILE_MOVE_THREAD_NUM=4 for the compactor, but we're seeing 15 GB of memory usage, which is very high. We also have 600 fields, which seems to be the reason for the high memory usage.
It would be nice if you could make the memory footprint as low as possible.

Proposed solution

N/A

Alternatives considered

N/A

cc @hengfeiyang LMK if I can help with something here

@TessaIO Are you running a single node? What is your full config? Can you share it?

@hengfeiyang sorry for the late response. No, we're running on multiple nodes.
Here's the full config:

O2_CALLBACK_URL: https://openobserve.example.com/web/cb
O2_DEX_BASE_URL: https://dex.example.com/dex
O2_DEX_CLIENT_ID: o2-client
O2_DEX_CLIENT_SECRET: xxxxxxx
O2_DEX_DEFAULT_ORG: default
O2_DEX_DEFAULT_ROLE: user
O2_DEX_ENABLED: "false"
O2_DEX_GROUP_ATTRIBUTE: ou
O2_DEX_NATIVE_LOGIN_ENABLED: "true"
O2_DEX_REDIRECT_URL: https://openobserve.example.com/config/redirect
O2_DEX_ROLE_ATTRIBUTE: role
O2_DEX_SCOPES: openid profile email groups offline_access
O2_MAP_GROUP_TO_ROLE: "false"
O2_OPENFGA_BASE_URL: http://openobserve-openfga.openobserve.svc.cluster.local:8080
O2_OPENFGA_ENABLED: "false"
OTEL_OTLP_HTTP_ENDPOINT: ""
RUST_BACKTRACE: "0"
RUST_LOG: info
ZO_ACTIX_KEEP_ALIVE: "30"
ZO_ACTIX_REQ_TIMEOUT: "30"
ZO_ACTIX_SHUTDOWN_TIMEOUT: "10"
ZO_APP_NAME: openobserve
ZO_BASE_URI: ""
ZO_BLOOM_FILTER_DEFAULT_FIELDS: ""
ZO_BLOOM_FILTER_ENABLED: "true"
ZO_BLOOM_FILTER_ON_ALL_FIELDS: "true"
ZO_CLUSTER_COORDINATOR: etcd
ZO_CLUSTER_NAME: o2
ZO_COLS_PER_RECORD_LIMIT: "200"
ZO_COLUMN_TIMESTAMP: _timestamp
ZO_COMPACT_BLOCKED_ORGS: ""
ZO_COMPACT_DATA_RETENTION_DAYS: "60"
ZO_COMPACT_DELETE_FILES_DELAY_HOURS: "2"
ZO_COMPACT_ENABLED: "true"
ZO_COMPACT_INTERVAL: "60"
ZO_COMPACT_LOOKBACK_HOURS: "0"
ZO_COMPACT_MAX_FILE_SIZE: "256"
ZO_COMPACT_STEP_SECS: "3600"
ZO_COMPACT_SYNC_TO_DB_INTERVAL: "1800"
ZO_COOKIE_MAX_AGE: "2592000"
ZO_COOKIE_SAME_SITE_LAX: "true"
ZO_COOKIE_SECURE_ONLY: "false"
ZO_DATA_CACHE_DIR: ""
ZO_DATA_DB_DIR: ""
ZO_DATA_DIR: ./data/
ZO_DATA_IDX_DIR: ""
ZO_DATA_STREAM_DIR: ""
ZO_DATA_WAL_DIR: ""
ZO_DISK_CACHE_ENABLED: "true"
ZO_DISK_CACHE_GC_INTERVAL: "0"
ZO_DISK_CACHE_GC_SIZE: "10"
ZO_DISK_CACHE_MAX_SIZE: "0"
ZO_DISK_CACHE_RELEASE_SIZE: "0"
ZO_DISK_CACHE_SKIP_SIZE: "0"
ZO_DISK_CACHE_STRATEGY: lru
ZO_DISTINCT_VALUES_HOURLY: "false"
ZO_DISTINCT_VALUES_INTERVAL: "10"
ZO_ENABLE_INVERTED_INDEX: "false"
ZO_ENRICHMENT_TABLE_LIMIT: "10"
ZO_ENTRY_PER_SCHEMA_VERSION_ENABLED: "true"
ZO_ETCD_ADDR: openobserve-etcd-headless.openobserve.svc.cluster.local:2379
ZO_ETCD_CERT_FILE: ""
ZO_ETCD_CLIENT_CERT_AUTH: "false"
ZO_ETCD_COMMAND_TIMEOUT: "5"
ZO_ETCD_CONNECT_TIMEOUT: "5"
ZO_ETCD_DOMAIN_NAME: ""
ZO_ETCD_KEY_FILE: ""
ZO_ETCD_LOAD_PAGE_SIZE: "100"
ZO_ETCD_LOCK_WAIT_TIMEOUT: "600"
ZO_ETCD_PASSWORD: ""
ZO_ETCD_PREFIX: /zinc/observe/
ZO_ETCD_TRUSTED_CA_FILE: ""
ZO_ETCD_USER: ""
ZO_FEATURE_DISTINCT_EXTRA_FIELDS: ""
ZO_FEATURE_FILELIST_DEDUP_ENABLED: "false"
ZO_FEATURE_FULLTEXT_EXTRA_FIELDS: ""
ZO_FEATURE_PER_THREAD_LOCK: "false"
ZO_FEATURE_QUERY_INFER_SCHEMA: "false"
ZO_FEATURE_QUERY_PARTITION_STRATEGY: file_num
ZO_FEATURE_QUERY_QUEUE_ENABLED: "true"
ZO_FEATURE_QUICK_MODE_FIELDS: ""
ZO_FILE_MOVE_THREAD_NUM: "2"
ZO_FILE_PUSH_INTERVAL: "10"
ZO_FILE_PUSH_LIMIT: "10000"
ZO_GRPC_ADDR: ""
ZO_GRPC_MAX_MESSAGE_SIZE: "16"
ZO_GRPC_ORG_HEADER_KEY: organization
ZO_GRPC_PORT: "5081"
ZO_GRPC_STREAM_HEADER_KEY: stream-name
ZO_GRPC_TIMEOUT: "600"
ZO_HTTP_ADDR: ""
ZO_HTTP_IPV6_ENABLED: "false"
ZO_HTTP_PORT: "5080"
ZO_HTTP_WORKER_MAX_BLOCKING: "0"
ZO_HTTP_WORKER_NUM: "0"
ZO_IGNORE_FILE_RETENTION_BY_STREAM: "false"
ZO_INGEST_ALLOWED_UPTO: "24"
ZO_INGEST_FLATTEN_LEVEL: "3"
ZO_INGESTER_SERVICE_URL: ""
ZO_INSTANCE_NAME: ""
ZO_INTERNAL_GRPC_TOKEN: ""
ZO_INVERTED_INDEX_SPLIT_CHARS: ' ;,'
ZO_JSON_LIMIT: "209715200"
ZO_LOCAL_MODE: "false"
ZO_LOCAL_MODE_STORAGE: disk
ZO_LOGS_FILE_RETENTION: hourly
ZO_MAX_FILE_RETENTION_TIME: "600"
ZO_MAX_FILE_SIZE_IN_MEMORY: "256"
ZO_MAX_FILE_SIZE_ON_DISK: "64"
ZO_MEM_PERSIST_INTERVAL: "5"
ZO_MEM_TABLE_MAX_SIZE: "0"
ZO_MEMORY_CACHE_CACHE_LATEST_FILES: "false"
ZO_MEMORY_CACHE_DATAFUSION_MAX_SIZE: "0"
ZO_MEMORY_CACHE_DATAFUSION_MEMORY_POOL: ""
ZO_MEMORY_CACHE_ENABLED: "true"
ZO_MEMORY_CACHE_GC_INTERVAL: "0"
ZO_MEMORY_CACHE_GC_SIZE: "10"
ZO_MEMORY_CACHE_MAX_SIZE: "0"
ZO_MEMORY_CACHE_RELEASE_SIZE: "0"
ZO_MEMORY_CACHE_SKIP_SIZE: "0"
ZO_MEMORY_CACHE_STRATEGY: lru
ZO_META_CONNECTION_POOL_MAX_SIZE: "0"
ZO_META_CONNECTION_POOL_MIN_SIZE: "0"
ZO_META_MYSQL_DSN: url_here
ZO_META_POSTGRES_DSN: ""
ZO_META_STORE: mysql
ZO_META_TRANSACTION_LOCK_TIMEOUT: "600"
ZO_META_TRANSACTION_RETRIES: "3"
ZO_METRICS_DEDUP_ENABLED: "true"
ZO_METRICS_FILE_RETENTION: daily
ZO_METRICS_LEADER_ELECTION_INTERVAL: "30"
ZO_METRICS_LEADER_PUSH_INTERVAL: "15"
ZO_NATS_ADDR: my_custom_host.com:4222
ZO_NATS_COMMAND_TIMEOUT: "10"
ZO_NATS_CONNECT_TIMEOUT: "5"
ZO_NATS_LOCK_WAIT_TIMEOUT: "600"
ZO_NATS_PASSWORD: ""
ZO_NATS_PREFIX: o2_
ZO_NATS_QUEUE_MAX_AGE: "60"
ZO_NATS_REPLICAS: "3"
ZO_NATS_USER: ""
ZO_NODE_ROLE: all
ZO_PARQUET_COMPRESSION: zstd
ZO_PARQUET_MAX_ROW_GROUP_SIZE: "0"
ZO_PAYLOAD_LIMIT: "209715200"
ZO_PRINT_KEY_CONFIG: "false"
ZO_PRINT_KEY_EVENT: "false"
ZO_PRINT_KEY_SQL: "false"
ZO_PROF_PYROSCOPE_ENABLED: "false"
ZO_PROF_PYROSCOPE_PROJECT_NAME: openobserve
ZO_PROF_PYROSCOPE_SERVER_URL: http://localhost:4040
ZO_PROMETHEUS_HA_CLUSTER: cluster
ZO_PROMETHEUS_HA_REPLICA: __replica__
ZO_QUERY_ON_STREAM_SELECTION: "true"
ZO_QUERY_OPTIMIZATION_NUM_FIELDS: "0"
ZO_QUERY_THREAD_NUM: "0"
ZO_QUERY_TIMEOUT: "600"
ZO_QUEUE_STORE: ""
ZO_QUICK_MODE_FILE_LIST_ENABLED: "false"
ZO_QUICK_MODE_FILE_LIST_INTERVAL: "300"
ZO_QUICK_MODE_NUM_FIELDS: "500"
ZO_QUICK_MODE_STRATEGY: ""
ZO_ROUTE_MAX_CONNECTIONS: "1024"
ZO_ROUTE_TIMEOUT: "600"
ZO_RUM_API_VERSION: v1
ZO_RUM_APPLICATION_ID: ""
ZO_RUM_CLIENT_TOKEN: ""
ZO_RUM_ENABLED: "false"
ZO_RUM_ENV: ""
ZO_RUM_INSECURE_HTTP: "false"
ZO_RUM_ORGANIZATION_IDENTIFIER: default
ZO_RUM_SERVICE: ""
ZO_RUM_SITE: ""
ZO_RUM_VERSION: 0.9.1
ZO_S3_ALLOW_INVALID_CERTIFICATES: "false"
ZO_S3_BUCKET_NAME: prod-openobserve
ZO_S3_BUCKET_PREFIX: ""
ZO_S3_CONNECT_TIMEOUT: "10"
ZO_S3_FEATURE_FORCE_HOSTED_STYLE: "false"
ZO_S3_FEATURE_FORCE_PATH_STYLE: "false"
ZO_S3_FEATURE_HTTP1_ONLY: "false"
ZO_S3_FEATURE_HTTP2_ONLY: "false"
ZO_S3_PROVIDER: ""
ZO_S3_REGION_NAME: eu-central-1
ZO_S3_REQUEST_TIMEOUT: "3600"
ZO_S3_SERVER_URL: ""
ZO_S3_SYNC_TO_CACHE_INTERVAL: "600"
ZO_SKIP_SCHEMA_VALIDATION: "false"
ZO_TCP_PORT: "5514"
ZO_TELEMETRY: "false"
ZO_TELEMETRY_HEARTBEAT: "1800"
ZO_TELEMETRY_URL: https://e1.zinclabs.dev
ZO_TRACES_FILE_RETENTION: hourly
ZO_TRACING_ENABLED: "false"
ZO_TRACING_HEADER_KEY: Authorization
ZO_TRACING_HEADER_VALUE: Basic xxxxxxxx
ZO_UDP_PORT: "5514"
ZO_UI_ENABLED: "true"
ZO_UI_SQL_BASE64_ENABLED: "false"
ZO_USAGE_BATCH_SIZE: "2000"
ZO_USAGE_ORG: _meta
ZO_USAGE_REPORTING_COMPRESSED_SIZE: "false"
ZO_USAGE_REPORTING_CREDS: ""
ZO_USAGE_REPORTING_ENABLED: "false"
ZO_USAGE_REPORTING_MODE: local
ZO_USAGE_REPORTING_URL: http://localhost:5080/api/_meta/usage/_json
ZO_WAL_LINE_MODE_ENABLED: "true"
ZO_WAL_MEMORY_MODE_ENABLED: "false"
ZO_WEB_URL: ""
ZO_WIDENING_SCHEMA_EVOLUTION: "true"

@hengfeiyang Can you pinpoint what the problem or solution might be? Maybe I can give this a shot.

If you are running in cluster mode and the compactor is a separate pod, there is no good solution for this case. You can try reducing one of these:

ZO_COMPACT_MAX_FILE_SIZE=128
or
ZO_FILE_MOVE_THREAD_NUM=2

We have the same problem on our cloud. I will try to do some optimization soon.

@TessaIO I found the issue. Please change this to false:

ZO_BLOOM_FILTER_ON_ALL_FIELDS=false

Then, the compactor will work fine.

@hengfeiyang Is this a new environment variable? I can't see it on this page:
https://openobserve.ai/docs/environment-variables/

@hengfeiyang fix is working, thanks!
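
For anyone checking their own deployment against this thread, here is a minimal sketch that scans a set of environment variables for the three settings discussed above. The function `compactor_memory_hints` is a hypothetical helper, not part of OpenObserve. The thresholds (128 and 2) are simply the values suggested in this thread, not documented limits. Only the bloom filter change was confirmed here as the fix.

```python
# Hypothetical helper, not part of OpenObserve. It only restates the
# suggestions made in this thread as simple checks.

def compactor_memory_hints(env):
    """Return a list of hints for a dict of environment variable strings."""
    hints = []

    # The change that resolved the issue in this thread.
    if env.get("ZO_BLOOM_FILTER_ON_ALL_FIELDS", "").lower() == "true":
        hints.append("set ZO_BLOOM_FILTER_ON_ALL_FIELDS=false")

    # Earlier, unconfirmed suggestions from the thread: 128 and 2.
    # Missing or empty values are treated as 0 and never flagged.
    if int(env.get("ZO_COMPACT_MAX_FILE_SIZE", "0") or 0) > 128:
        hints.append("try ZO_COMPACT_MAX_FILE_SIZE=128")
    if int(env.get("ZO_FILE_MOVE_THREAD_NUM", "0") or 0) > 2:
        hints.append("try ZO_FILE_MOVE_THREAD_NUM=2")

    return hints


if __name__ == "__main__":
    # Values taken from the config posted in this thread.
    posted = {
        "ZO_BLOOM_FILTER_ON_ALL_FIELDS": "true",
        "ZO_COMPACT_MAX_FILE_SIZE": "256",
        "ZO_FILE_MOVE_THREAD_NUM": "2",
    }
    for hint in compactor_memory_hints(posted):
        print(hint)
```

To check a running pod, pass `dict(os.environ)` after `import os` instead of the hard-coded dictionary.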