openshift / cluster-logging-operator

Operator to support logging subsystem of OpenShift

Log collector pods cannot start because of a bad IPv6 address:port

RohitMahadevan1994 opened this issue · comments

Description of problem:
We have an IPv6 OpenShift bare-metal installation and we're deploying the cluster-logging operator. The collector pods in the openshift-logging namespace go into a crash loop with a stack trace that says:

Setting each total_size_limit for 1 buffers to 143866249420 bytes
Setting queued_chunks_limit_size for each buffer to 17150
Setting chunk_limit_size for each buffer to 8388608
2022-07-19 19:16:31 +0000 [warn]: '@' is the system reserved prefix. It works in the nested configuration for now but it will be rejected: @timestamp
/usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/elasticsearch_compat.rb:8: warning: already initialized constant TRANSPORT_CLASS
/usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/elasticsearch_compat.rb:3: warning: previous definition of TRANSPORT_CLASS was here
/usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/elasticsearch_compat.rb:25: warning: already initialized constant SELECTOR_CLASS
/usr/local/share/gems/gems/fluent-plugin-elasticsearch-5.2.1/lib/fluent/plugin/elasticsearch_compat.rb:20: warning: previous definition of SELECTOR_CLASS was here
2022-07-19 19:16:33 +0000 [warn]: 'enable' parameter is deprecated: Use <transport> section
2022-07-19 19:16:33 +0000 [warn]: 'certificate_path' parameter is deprecated: Use cert_path in <transport> section
2022-07-19 19:16:33 +0000 [warn]: 'private_key_path' parameter is deprecated: Use private_key_path in <transport> section
2022-07-19 19:16:33 +0000 [warn]: For security reason, setting private_key_passphrase is recommended when cert_path is specified
2022-07-19 19:16:33 +0000 [error]: unexpected error error_class=URI::InvalidURIError error="bad URI(is not URI?): https://fd00:17:2:0:e0db:5522:c50c:2b:24231/"
2022-07-19 19:16:33 +0000 [error]: /usr/share/ruby/uri/rfc3986_parser.rb:67:in `split'
2022-07-19 19:16:33 +0000 [error]: /usr/share/ruby/uri/rfc3986_parser.rb:73:in `parse'
2022-07-19 19:16:33 +0000 [error]: /usr/share/ruby/uri/common.rb:237:in `parse'
2022-07-19 19:16:33 +0000 [error]: /usr/share/ruby/uri/common.rb:743:in `URI'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin_helper/http_server/server.rb:39:in `initialize'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin_helper/http_server.rb:96:in `new'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin_helper/http_server.rb:96:in `http_server_create_https_server'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/plugin_helper/http_server.rb:67:in `http_server_create_http_server'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluent-plugin-prometheus-2.0.2/lib/fluent/plugin/in_prometheus.rb:109:in `start'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/root_agent.rb:203:in `block in start'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/root_agent.rb:192:in `block (2 levels) in lifecycle'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/root_agent.rb:191:in `each'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/root_agent.rb:191:in `block in lifecycle'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/root_agent.rb:178:in `each'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/root_agent.rb:178:in `lifecycle'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/root_agent.rb:202:in `start'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/engine.rb:248:in `start'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/engine.rb:147:in `run'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/supervisor.rb:719:in `block in run_worker'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/supervisor.rb:970:in `main_process'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/supervisor.rb:710:in `run_worker'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/lib/fluent/command/fluentd.rb:377:in `<top (required)>'
2022-07-19 19:16:33 +0000 [error]: /usr/share/rubygems/rubygems/core_ext/kernel_require.rb:59:in `require'
2022-07-19 19:16:33 +0000 [error]: /usr/share/rubygems/rubygems/core_ext/kernel_require.rb:59:in `require'
2022-07-19 19:16:33 +0000 [error]: /usr/local/share/gems/gems/fluentd-1.14.5/bin/fluentd:15:in `<top (required)>'
2022-07-19 19:16:33 +0000 [error]: /usr/local/bin/fluentd:23:in `load'
2022-07-19 19:16:33 +0000 [error]: /usr/local/bin/fluentd:23:in `<main>'
2022-07-19 19:16:33 +0000 [error]: unexpected error error_class=URI::InvalidURIError error="bad URI(is not URI?): https://fd00:17:2:0:e0db:5522:c50c:2b:24231/"
2022-07-19 19:16:33 +0000 [error]: suppressed same stacktrace

It looks like the IPv6 address:port syntax is malformed: the square brackets that RFC 3986 requires around an IPv6 literal are missing before the port.
Here fd00:17:2:0:e0db:5522:c50c:2b is the pod IP and 24231 is the port.
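
The failure is easy to reproduce with Ruby's stdlib URI parser, the same one in the stack trace above (rfc3986_parser.rb). A minimal sketch, with the address and port taken straight from the log; only the variable names are mine:

  require "uri"

  pod_ip = "fd00:17:2:0:e0db:5522:c50c:2b"  # pod IP from the log above
  port   = 24231                            # port from the failing URL

  # Unbracketed IPv6 literal: the parser cannot tell where the address
  # ends and the port begins, so URI() raises, exactly as in the log.
  begin
    URI("https://#{pod_ip}:#{port}/")
  rescue URI::InvalidURIError => e
    puts e.message  # bad URI(is not URI?): https://fd00:17:2:0:e0db:5522:c50c:2b:24231/
  end

  # With the RFC 3986 brackets, the same address parses cleanly.
  uri = URI("https://[#{pod_ip}]:#{port}/")
  puts uri.hostname  # fd00:17:2:0:e0db:5522:c50c:2b (brackets stripped)
  puts uri.port      # 24231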

Version-Release number of selected component (if applicable):
OCP 4.10 with fluentd 1.14.5

How reproducible:
Always, on an IPv6 cluster: deploy IPv6 OCP and the logging operator

Steps to Reproduce:

  1. Install IPv6 OCP 4.10 baremetal
  2. Install cluster logging operator

Actual results:
Collector pods crash-loop with the stack trace above: error_class=URI::InvalidURIError error="bad URI(is not URI?): https://fd00:17:2:0:e0db:5522:c50c:2b:24231/"

Expected results:
The fluentd process starts up and the collector pods run healthy

Additional info:

  • This works just fine in an IPv4/dual-stack environment
  • fluent/fluentd#3603 (not sure if this is related)

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

Should be resolved by #1683
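
Until then, the general shape of the fix, wherever the URL is composed, is to bracket the host only when it is an IPv6 literal. A hypothetical sketch in Ruby (the helper name is mine; this is not the actual #1683 patch):

  require "ipaddr"
  require "uri"

  # Hypothetical helper: wrap the host in brackets only when it is an
  # IPv6 literal, leaving IPv4 addresses and hostnames untouched.
  def bracket_if_ipv6(host)
    IPAddr.new(host).ipv6? ? "[#{host}]" : host
  rescue IPAddr::InvalidAddressError
    host  # not an IP literal, e.g. a DNS name
  end

  URI("https://#{bracket_if_ipv6('fd00:17:2:0:e0db:5522:c50c:2b')}:24231/")  # now parses
  URI("https://#{bracket_if_ipv6('10.0.0.5')}:24231/")                       # unaffected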