processone / ejabberd

Robust, Ubiquitous and Massively Scalable Messaging Platform (XMPP, MQTT, SIP Server)

Home Page:https://www.process-one.net/en/ejabberd/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[BUG] MS SQL database backend reason: {port_exit,memory_allocation_failed} with Erlang >25.3

sando38 opened this issue · comments

Environment

  • ejabberd version: 24.02
  • Erlang version: 26.2
  • OS: Linux (Debian12) and Alpine Linux 3.19
  • Installed from: source | ghcr.io/processone/ejabberd:24.02

Configuration (only if needed): grep -Ev '^$|^\s*#' ejabberd.yml

loglevel: info

Errors from error.log/crash.log

2024-03-16T20:39:12.6496590Z 2024-03-16 20:32:36.414879+00:00 [info] <0.423.0>@gen_mod:start_modules/0:141 Loading modules for 8310292872-12.example.com
2024-03-16T20:39:12.6498391Z 2024-03-16 20:32:36.449292+00:00 [info] <0.374.0>@ejabberd_mnesia:create/2:270 Creating Mnesia ram table 'pubsub_last_item'
2024-03-16T20:39:12.6500208Z 2024-03-16 20:32:36.461769+00:00 [info] <0.374.0>@ejabberd_mnesia:create/2:270 Creating Mnesia ram table 'mod_register_ip'
2024-03-16T20:39:12.6502116Z 2024-03-16 20:32:36.488646+00:00 [info] <0.608.0>@mod_stun_disco:parse_listener/1:626 Going to offer STUN service: 10.244.3.7:3478 (udp)
2024-03-16T20:39:12.6504254Z 2024-03-16 20:32:36.551778+00:00 [info] <0.374.0>@ejabberd_mnesia:create/2:270 Creating Mnesia ram table 'muc_online_room'
2024-03-16T20:39:12.6506198Z 2024-03-16 20:32:36.643316+00:00 [error] <0.649.0> ODBC: received unexpected info: {tcp_closed,#Port<0.91>}
2024-03-16T20:39:12.6507055Z 
2024-03-16T20:39:12.6508629Z 2024-03-16 20:32:36.643508+00:00 [error] <0.632.0>@ejabberd_sql:check_error/2:1404 SQL query 'select name, opts from muc_room where host='conference.8310292872-12.example.com'' failed: connection_closed
2024-03-16T20:39:12.6511449Z 2024-03-16 20:32:36.643598+00:00 [error] <0.632.0>@ejabberd_sql:check_error/2:1397 SQL query 'Q80627668' at {mod_muc_sql,{293,23}} failed: connection_closed
2024-03-16T20:39:12.6513413Z 2024-03-16 20:32:36.643986+00:00 [error] <0.649.0>@gen_server:error_info/8:1391 ** Generic server <0.649.0> terminating 
2024-03-16T20:39:12.6514552Z ** Last message in was {#Port<0.69>,{exit_status,2}}
2024-03-16T20:39:12.6515218Z ** When Server state == {state,#Port<0.69>,
2024-03-16T20:39:12.6515798Z                                {<0.632.0>,
2024-03-16T20:39:12.6516387Z                                 #Ref<0.3142729776.3845652482.250765>},
2024-03-16T20:39:12.6517084Z                                <0.632.0>,undefined,on,false,false,off,
2024-03-16T20:39:12.6517768Z                                connected,undefined,0,
2024-03-16T20:39:12.6518400Z                                [#Port<0.67>,#Port<0.68>],
2024-03-16T20:39:12.6519032Z                                #Port<0.90>,#Port<0.91>}
2024-03-16T20:39:12.6519621Z ** Reason for termination ==
2024-03-16T20:39:12.6520117Z ** {port_exit,memory_allocation_failed}
2024-03-16T20:39:12.6520518Z 
2024-03-16T20:39:12.6521166Z 2024-03-16 20:32:36.645543+00:00 [error] <0.649.0>@proc_lib:crash_report/4:584 CRASH REPORT:
2024-03-16T20:39:12.6522026Z   crasher:
2024-03-16T20:39:12.6522380Z     initial call: odbc:init/1
2024-03-16T20:39:12.6522837Z     pid: <0.649.0>
2024-03-16T20:39:12.6523221Z     registered_name: []
2024-03-16T20:39:12.6523768Z     exception exit: {port_exit,memory_allocation_failed}
2024-03-16T20:39:12.6524678Z       in function  gen_server:handle_common_reply/8 (gen_server.erl, line 1226)
2024-03-16T20:39:12.6525483Z     ancestors: [odbc_sup,<0.344.0>]
2024-03-16T20:39:12.6525987Z     message_queue_len: 1
2024-03-16T20:39:12.6526524Z     messages: [{'EXIT',#Port<0.69>,normal}]
2024-03-16T20:39:12.6527072Z     links: [<0.632.0>,<0.345.0>]
2024-03-16T20:39:12.6527535Z     dictionary: []
2024-03-16T20:39:12.6527907Z     trap_exit: true
2024-03-16T20:39:12.6528287Z     status: running
2024-03-16T20:39:12.6528658Z     heap_size: 6772
2024-03-16T20:39:12.6529036Z     stack_size: 28
2024-03-16T20:39:12.6529418Z     reductions: 19518
2024-03-16T20:39:12.6529805Z   neighbours:
2024-03-16T20:39:12.6530031Z 
2024-03-16T20:39:12.6530830Z 2024-03-16 20:32:36.649964+00:00 [warning] <0.632.0>@ejabberd_sql:handle_reconnect/2:491 mssql connection failed:
2024-03-16T20:39:12.6531919Z ** Reason: {port_exit,memory_allocation_failed}
2024-03-16T20:39:12.6532510Z ** Retry after: 5 seconds
2024-03-16T20:39:12.6533452Z 2024-03-16 20:32:36.649979+00:00 [error] <0.345.0>@supervisor:do_restart/3:759 SUPERVISOR REPORT:
2024-03-16T20:39:12.6534377Z     supervisor: {local,odbc_sup}
2024-03-16T20:39:12.6534891Z     errorContext: child_terminated
2024-03-16T20:39:12.6535462Z     reason: {port_exit,memory_allocation_failed}
2024-03-16T20:39:12.6536057Z     offender: [{pid,<0.649.0>},
2024-03-16T20:39:12.6536516Z                {id,[]},
2024-03-16T20:39:12.6537002Z                {mfargs,{odbc,start_link_sup,undefined}},
2024-03-16T20:39:12.6537632Z                {restart_type,temporary},
2024-03-16T20:39:12.6538183Z                {significant,false},
2024-03-16T20:39:12.6538677Z                {shutdown,7000},
2024-03-16T20:39:12.6539153Z                {child_type,worker}]
2024-03-16T20:39:12.6539496Z 
2024-03-16T20:39:12.6540870Z 2024-03-16 20:32:36.720319+00:00 [warning] <0.675.0>@ejabberd_captcha:get_prog_name/0:444 The option captcha_cmd is not configured, but some module wants to use the CAPTCHA feature.
2024-03-16T20:39:12.6543495Z 2024-03-16 20:32:36.754287+00:00 [info] <0.190.0>@ejabberd_cluster_mnesia:wait_for_sync/1:123 Waiting for Mnesia synchronization to complete
2024-03-16T20:39:12.6546970Z 2024-03-16 20:32:36.772555+00:00 [warning] <0.511.0>@ejabberd_pkix:log_warnings/1:393 Invalid certificate in /opt/ejabberd/certs/custom-cert/cert.pem: at line 1: unused private key
2024-03-16T20:39:12.6549459Z 2024-03-16 20:32:36.793597+00:00 [warning] <0.511.0>@ejabberd_pkix:check_domain_certfiles/1:312 No certificate found matching 8310292872-12.example.com
2024-03-16T20:39:12.6551565Z 2022024-03-16 20:32:36.643316+00:00 [error] <0.649.0> ODBC: received unexpected info: {tcp_closed,#Port<0.91>}
2024-03-16T20:39:12.6552430Z 
2024-03-16T20:39:12.6553964Z 2024-03-16 20:32:36.643508+00:00 [error] <0.632.0>@ejabberd_sql:check_error/2:1404 SQL query 'select name, opts from muc_room where host='conference.8310292872-12.example.com'' failed: connection_closed
2024-03-16T20:39:12.6556589Z 2024-03-16 20:32:36.643598+00:00 [error] <0.632.0>@ejabberd_sql:check_error/2:1397 SQL query 'Q80627668' at {mod_muc_sql,{293,23}} failed: connection_closed
2024-03-16T20:39:12.6558551Z 2024-03-16 20:32:36.643986+00:00 [error] <0.649.0>@gen_server:error_info/8:1391 ** Generic server <0.649.0> terminating 
2024-03-16T20:39:12.6559688Z ** Last message in was {#Port<0.69>,{exit_status,2}}
2024-03-16T20:39:12.6560349Z ** When Server state == {state,#Port<0.69>,
2024-03-16T20:39:12.6560925Z                                {<0.632.0>,
2024-03-16T20:39:12.6561508Z                                 #Ref<0.3142729776.3845652482.250765>},
2024-03-16T20:39:12.6562205Z                                <0.632.0>,undefined,on,false,false,off,
2024-03-16T20:39:12.6562880Z                                connected,undefined,0,
2024-03-16T20:39:12.6563515Z                                [#Port<0.67>,#Port<0.68>],
2024-03-16T20:39:12.6564143Z                                #Port<0.90>,#Port<0.91>}
2024-03-16T20:39:12.6564731Z ** Reason for termination ==
2024-03-16T20:39:12.6565221Z ** {port_exit,memory_allocation_failed}
2024-03-16T20:39:12.6565620Z 
2024-03-16T20:39:12.6566287Z 2024-03-16 20:32:36.645543+00:00 [error] <0.649.0>@proc_lib:crash_report/4:584 CRASH REPORT:
2024-03-16T20:39:12.6567211Z   crasher:
2024-03-16T20:39:12.6567598Z     initial call: odbc:init/1
2024-03-16T20:39:12.6568080Z     pid: <0.649.0>
2024-03-16T20:39:12.6568491Z     registered_name: []
2024-03-16T20:39:12.6569067Z     exception exit: {port_exit,memory_allocation_failed}
2024-03-16T20:39:12.6570030Z       in function  gen_server:handle_common_reply/8 (gen_server.erl, line 1226)
2024-03-16T20:39:12.6570912Z     ancestors: [odbc_sup,<0.344.0>]
2024-03-16T20:39:12.6571486Z     message_queue_len: 1
2024-03-16T20:39:12.6572147Z     messages: [{'EXIT',#Port<0.69>,normal}]
2024-03-16T20:39:12.6572771Z     links: [<0.632.0>,<0.345.0>]
2024-03-16T20:39:12.6573276Z     dictionary: []
2024-03-16T20:39:12.6573697Z     trap_exit: true
2024-03-16T20:39:12.6574131Z     status: running
2024-03-16T20:39:12.6574544Z     heap_size: 6772
2024-03-16T20:39:12.6574968Z     stack_size: 28
2024-03-16T20:39:12.6575383Z     reductions: 19518
2024-03-16T20:39:12.6575809Z   neighbours:
2024-03-16T20:39:12.6576051Z 
2024-03-16T20:39:12.6576798Z 2024-03-16 20:32:36.649979+00:00 [error] <0.345.0>@supervisor:do_restart/3:759 SUPERVISOR REPORT:
2024-03-16T20:39:12.6577739Z     supervisor: {local,odbc_sup}
2024-03-16T20:39:12.6578269Z     errorContext: child_terminated
2024-03-16T20:39:12.6578835Z     reason: {port_exit,memory_allocation_failed}
2024-03-16T20:39:12.6579427Z     offender: [{pid,<0.649.0>},
2024-03-16T20:39:12.6579877Z                {id,[]},
2024-03-16T20:39:12.6580372Z                {mfargs,{odbc,start_link_sup,undefined}},
2024-03-16T20:39:12.6580992Z                {restart_type,temporary},
2024-03-16T20:39:12.6581555Z                {significant,false},
2024-03-16T20:39:12.6582045Z                {shutdown,7000},
2024-03-16T20:39:12.6582536Z                {child_type,worker}]
2024-03-16T20:39:12.6583207Z 
2024-03-16T20:39:12.6584421Z 4-03-16 20:32:36.793707+00:00 [warning] <0.511.0>@ejabberd_pkix:check_domain_certfiles/1:312 No certificate found matching pubsub.8310292872-12.example.com
2024-03-16T20:39:12.6587024Z 2024-03-16 20:32:36.793821+00:00 [warning] <0.511.0>@ejabberd_pkix:check_domain_certfiles/1:312 No certificate found matching upload.8310292872-12.example.com
2024-03-16T20:39:12.6589450Z 2024-03-16 20:32:36.793968+00:00 [warning] <0.511.0>@ejabberd_pkix:check_domain_certfiles/1:312 No certificate found matching conference.8310292872-12.example.com
2024-03-16T20:39:12.6592258Z 2024-03-16 20:32:36.794082+00:00 [warning] <0.511.0>@ejabberd_pkix:check_domain_certfiles/1:312 No certificate found matching proxy.8310292872-12.example.com
2024-03-16T20:39:12.6595095Z 2024-03-16 20:32:36.795105+00:00 [info] <0.190.0>@ejabberd_app:start/2:63 ejabberd 24.2.0 is started in the node 'ejabberd@ejabberd-0.ejabberd-headless.default.svc.cluster.local' in 1.31s
2024-03-16T20:39:12.6597822Z 2024-03-16 20:32:36.795411+00:00 [info] <0.672.0>@ejabberd_listener:init/4:163 Start accepting TCP connections at 10.244.3.7:7777 for mod_proxy65_stream
2024-03-16T20:39:12.6600095Z 2024-03-16 20:32:36.795486+00:00 [info] <0.505.0>@ejabberd_listener:init/4:163 Start accepting TCP connections at 127.0.0.1:5281 for ejabberd_http
2024-03-16T20:39:12.6602346Z 2024-03-16 20:32:36.795559+00:00 [info] <0.506.0>@ejabberd_listener:init/4:163 Start accepting TCP connections at [::]:5222 for ejabberd_c2s
2024-03-16T20:39:12.6604556Z 2024-03-16 20:32:36.795579+00:00 [info] <0.508.0>@ejabberd_listener:init/4:163 Start accepting TLS connections at [::]:5443 for ejabberd_http
2024-03-16T20:39:12.6607033Z 2024-03-16 20:32:36.795590+00:00 [info] <0.510.0>@ejabberd_listener:init/4:129 Start accepting UDP connections at [::]:3478 for ejabberd_stun
2024-03-16T20:39:12.6609256Z 2024-03-16 20:32:36.795567+00:00 [info] <0.507.0>@ejabberd_listener:init/4:163 Start accepting TCP connections at [::]:5280 for ejabberd_http
2024-03-16T20:39:12.6611427Z 2024-03-16 20:32:36.795583+00:00 [info] <0.509.0>@ejabberd_listener:init/4:163 Start accepting TCP connections at [::]:5269 for ejabberd_s2s_in
2024-03-16T20:39:12.6613508Z 2024-03-16 20:32:37.706391+00:00 [info] <0.505.0>@ejabberd_listener:accept/7:344 (<0.745.0>) Accepted connection 127.0.0.1:47566 -> 127.0.0.1:5281

Bug description

I started updating the ejabberd helm chart to newest 24.02. While doing that and updating the corresponding container image I found that there is a problem with MS SQL. Therefore, I tested the official ghcr.io/processone/ejabberd:24.02 as well to see if it depends on my custom image, but it does not (see error message above).

reason: {port_exit,memory_allocation_failed}

What solved the issue was reverting the container build from erlang 26.2 / elixir 16.1 to erlang 25.3 / elixir 14.4 . I haven't tested anything in between.

It most likely is unrelated to Alpine packages, because I also tried it with a custom Debian based variant with the same combination erlang 26.2 / elixir 16.1 .

@badlop as discussed