MongoDb 5 Docker failed to start after Upgrade
HungryHowies opened this issue · comments
All,
I decided to upgrade Graylog, but it requires MongoDB 5.0+. Right now I'm using MongoDB 4.4.18. I pulled the new MongoDB 5.0 image and adjusted my docker-compose file to use it.
Error received
WARNING: MongoDB 5.0+ requires a CPU with AVX support, and your current system does not appear to have that!
see https://jira.mongodb.org/browse/SERVER-54407
see also https://www.mongodb.com/community/forums/t/mongodb-5-0-cpu-intel-g4650-compatibility/116610/2
see also https://github.com/docker-library/mongo/issues/485#issuecomment-891991814
There is not much I can do about the CPU at the moment.
Docker-Compose
version: '3'
services:
  # MongoDB: https://hub.docker.com/_/mongo/
  mongodb:
    # Container time Zone
    #image: mongo:4.4.18
    image: mongo:5
    network_mode: bridge
    # DB in share for persistence
    volumes:
      - mongo_data:/data/db
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2-amd64
    # image: opensearchproject/opensearch:1.3.2
    network_mode: bridge
    #data folder in share for persistence
    volumes:
      - es_data:/usr/share/elasticsearch/data
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 1g
  graylog:
    #image: graylog/graylog-enterprise:4.3.3-jre11
    image: graylog/graylog-enterprise:4.3.9-jre11
    #image: graylog/graylog-enterprise:5.0.0
    network_mode: bridge
    dns:
      - 192.168.1.15
      - 192.168.1.16
    # journal and config directories in local NFS share for persistence
    volumes:
      - graylog_journal:/usr/share/graylog/data/journal
      # - graylog_bin:/usr/share/graylog/bin
Steps Executed
root@ansible:/usr/local/bin# docker-compose up -d
bin_elasticsearch_1 is up-to-date
Recreating bin_mongodb_1 ... done
bin_graylog_1 is up-to-date
root@ansible:/usr/local/bin#
No Mongo Container
root@ansible:/usr/local/bin# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
eb22a62cc20f graylog/graylog-enterprise:4.3.9-jre11 "tini -- /docker-ent…" 52 minutes ago Up 51 minutes (unhealthy) 0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp, 0.0.0.0:5044->5044/tcp, :::5044->5044/tcp, 0.0.0.0:25->25/udp, :::25->25/udp, 0.0.0.0:5055->5055/tcp, :::5055->5055/tcp, 0.0.0.0:8443->8443/tcp, 0.0.0.0:5555->5555/udp, :::8443->8443/tcp, :::5555->5555/udp, 0.0.0.0:8514->8514/tcp, :::8514->8514/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp, 0.0.0.0:9200->9200/tcp, :::9200->9200/tcp, 0.0.0.0:9300->9300/tcp, 0.0.0.0:8514->8514/udp, :::9300->9300/tcp, :::8514->8514/udp, 0.0.0.0:9515->9515/udp, :::9515->9515/udp, 0.0.0.0:9515->9515/tcp, :::9515->9515/tcp, 0.0.0.0:13301-13302->13301-13302/tcp, :::13301-13302->13301-13302/tcp, 0.0.0.0:12201->12201/udp, :::12201->12201/udp, 0.0.0.0:51420->51420/tcp, :::51420->51420/tcp, 0.0.0.0:49184->1281/tcp, :::49184->1281/tcp, 0.0.0.0:49183->1525/tcp, :::49183->1525/tcp bin_graylog_1
faf618e2ca8c docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2-amd64 "/tini -- /usr/local…" 52 minutes ago Up 52 minutes 9200/tcp, 9300/tcp bin_elasticsearch_1
root@ansible:/usr/local/bin#
Logs found
root@ansible:/usr/local/bin# docker-compose logs -f | grep mongo | more
graylog_1 | 2022-12-07 20:19:37,899 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:4, serverValue:4}] to mongo:27017
mongodb_1 |
mongodb_1 | WARNING: MongoDB 5.0+ requires a CPU with AVX support, and your current system does not appear to have that!
mongodb_1 | see https://jira.mongodb.org/browse/SERVER-54407
mongodb_1 | see also https://www.mongodb.com/community/forums/t/mongodb-5-0-cpu-intel-g4650-compatibility/116610/2
mongodb_1 | see also https://github.com/docker-library/mongo/issues/485#issuecomment-891991814
mongodb_1 |
graylog_1 | 2022-12-07 20:19:41,517 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:5, serverValue:5}] to mongo:27017
graylog_1 | 2022-12-07 20:19:45,432 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:6, serverValue:6}] to mongo:27017
graylog_1 | 2022-12-07 20:19:45,438 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:7, serverValue:7}] to mongo:27017
graylog_1 | 2022-12-07 20:19:45,445 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:9, serverValue:9}] to mongo:27017
graylog_1 | 2022-12-07 20:19:45,452 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:8, serverValue:8}] to mongo:27017
graylog_1 | 2022-12-07 20:19:45,523 INFO : org.graylog2.periodical.Periodicals - Starting [org.graylog.plugins.auditlog.mongodb.MongoAuditLogPeriodical] periodical in [0s], polling every [3600s].
graylog_1 | 2022-12-07 20:19:45,532 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:10, serverValue:10}] to mongo:27017
graylog_1 | 2022-12-07 20:19:45,846 INFO : org.graylog2.lookup.LookupTableService - Data Adapter watchlist-mongo/627330d4e1c2a911d774918d [@4609788a] STARTING
graylog_1 | 2022-12-07 20:19:45,850 INFO : org.graylog2.lookup.LookupTableService - Data Adapter watchlist-mongo/627330d4e1c2a911d774918d [@4609788a] RUNNING
bin_mongodb_1 exited with code 132
graylog_1 | 2022-12-07 20:19:47,866 INFO : org.graylog2.lookup.LookupTableService - Starting lookup table watchlist/627330d4e1c2a911d7749191 [@1f7c51b6] using cache watchlist-cache/627330d4e1c2a911d774918f [@7c33192c], data adapter watchlist-mongo/627330d4e1c2a911d774918d [@4609788a]
graylog_1 | 2022-12-07 20:31:21,025 INFO : org.mongodb.driver.connection - Closed connection [connectionId{localValue:5, serverValue:5}] to mongo:27017 because there was a socket exception raised on another connection from this pool.
graylog_1 | java.lang.RuntimeException: com.mongodb.MongoNodeIsRecoveringException: Command failed with error 11600 (InterruptedAtShutdown): 'interrupted at shutdown' on server mongo:27017. The full response is {"ok": 0.0, "errmsg": "interrupted at shutdown", "code": 11600, "codeName": "InterruptedAtShutdown"}
graylog_1 | Caused by: com.mongodb.MongoNodeIsRecoveringException: Command failed with error 11600 (InterruptedAtShutdown): 'interrupted at shutdown' on server mongo:27017. The full response is {"ok": 0.0, "errmsg": "interrupted at shutdown", "code": 11600, "codeName": "InterruptedAtShutdown"}
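For what it's worth, the "exited with code 132" line in the log above fits the AVX warning. By shell convention, an exit code above 128 means the process was killed by signal (code - 128), and 132 - 128 = 4, which is SIGILL (illegal instruction) on Linux. A small sketch to decode such codes follows; `signal_from_exit_code` is a hypothetical helper name, not a Docker command.

```shell
#!/bin/sh
# signal_from_exit_code CODE
# Prints the name of the signal implied by an exit code above 128,
# or "none" when the code does not indicate death by signal.
signal_from_exit_code() {
    code="$1"
    if [ "$code" -gt 128 ]; then
        # "kill -l N" prints the name of signal number N.
        kill -l "$((code - 128))"
    else
        echo "none"
    fi
}

# Exit code reported for bin_mongodb_1 in the log above.
signal_from_exit_code 132
```

On a Linux host this should report ILL, which matches a binary executing an instruction the CPU does not implement.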
Any advice would be appreciated.
Ubuntu 22.04 virtual machine on Windows Hyper-V
Docker Version
root@ansible:/usr/local/bin# docker version
Client:
Version: 20.10.12
API version: 1.41
Go version: go1.17.3
Git commit: 20.10.12-0ubuntu4
Built: Mon Mar 7 17:10:06 2022
OS/Arch: linux/amd64
Context: default
Experimental: true
Server:
Engine:
Version: 20.10.12
API version: 1.41 (minimum version 1.12)
Go version: go1.17.3
Git commit: 20.10.12-0ubuntu4
Built: Mon Mar 7 15:57:50 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.5.9-0ubuntu3
GitCommit:
runc:
Version: 1.1.0-0ubuntu1.1
GitCommit:
docker-init:
Version: 0.19.0
GitCommit:
root@ansible:/usr/local/bin#
@HungryHowies The MongoDB message indicates that your CPU doesn't support the AVX instructions.
Check your hypervisor settings to see whether these instructions can be enabled for the virtual machine.
WARNING: MongoDB 5.0+ requires a CPU with AVX support, and your current system does not appear to have that!
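A quick way to confirm this from inside the VM is to look for the `avx` flag in `/proc/cpuinfo`. This is only a minimal sketch, assuming a Linux guest; the `has_avx` helper name is made up for illustration.

```shell
#!/bin/sh
# has_avx [FILE]
# Succeeds (exit 0) if the first "flags" line in FILE lists the avx flag.
# FILE defaults to /proc/cpuinfo; the parameter exists so a saved copy
# of another machine's cpuinfo can be checked as well.
has_avx() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # -m1: stop after the first flags line (all cores normally report the same flags)
    # -w:  match "avx" as a whole word, so "avx2" alone does not count
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw 'avx'
}

if has_avx; then
    echo "AVX flag present"
else
    echo "AVX flag missing"
fi
```

If the flag is missing in the guest but the physical CPU supports AVX, the hypervisor is probably masking it. If the physical CPU lacks AVX, no VM setting will add it.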
@bernd
Hey,
Thank you for the reply, much appreciated.
Yeah, I did some digging. Thankfully this only affects my lab Hyper-V servers. Glad I found this issue before we upgrade.
@bernd
My fix was unchecking the tick box.
But new errors occurred.
Dec 8 17:26:43 ansible dockerd[1254]: time="2022-12-08T17:26:43.701125233-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec 8 17:26:53 ansible dockerd[1254]: time="2022-12-08T17:26:53.821076778-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/mongodb_1"
Dec 8 17:27:04 ansible dockerd[1254]: time="2022-12-08T17:27:04.004540983-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec 8 17:27:14 ansible dockerd[1254]: time="2022-12-08T17:27:14.026510867-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec 8 17:27:24 ansible dockerd[1254]: time="2022-12-08T17:27:24.092746682-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec 8 17:27:34 ansible dockerd[1254]: time="2022-12-08T17:27:34.109101675-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec 8 17:27:44 ansible dockerd[1254]: time="2022-12-08T17:27:44.126879809-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/mongo"
Dec 8 17:27:53 ansible kernel: [ 2473.152844] traps: mongod[18239] trap invalid opcode ip:56311d63fa7a sp:7ffc21b4de10 error:0 in mongod[5631195ba000+51eb000]
Dec 8 17:27:54 ansible dockerd[1254]: time="2022-12-08T17:27:54.182350966-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec 8 17:28:04 ansible dockerd[1254]: time="2022-12-08T17:28:04.426020497-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
root@ansible:/usr/local/bin#
I'm working on a resolution, but I'm starting to think that an upgrade might not be my solution; perhaps a fresh install is.
I thought I had resolved it, but it's a no-go. I'm aware that the correct CPU architecture is needed. TBH, this is the first time I have upgraded software and the service would not start without the right CPU. We're still looking into it. BTW, I tried CentOS 7 and Ubuntu 18, 20, and 22 with the latest Docker and docker-compose. Same outcome.
https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX
@HungryHowies Does the CPU on your hypervisor support the AVX instructions?
The screenshot doesn't show the exact CPU model.
Short answer: no, it doesn't.
It is unfortunate that the CPUs in our blade servers are preventing us from upgrading or using newer software. True, they may be a little old. Advanced Vector Extensions (AVX) are additions to the x86 instruction set architecture; put simply, the additional instructions allow compatible processors to perform more demanding functions when used with compatible software. So I am aware of the cause, but for now our options are to compile MongoDB ourselves, replace all the incompatible CPUs (which would be very expensive and time consuming), stay with the old version, or move on.
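For the "stay with the old version" option, the image tag could be chosen per host instead of being hard-coded. The sketch below picks between the two tags already used in this thread (`mongo:5` and `mongo:4.4.18`) based on the `avx` flag. The `pick_mongo_image` helper and the `MONGO_IMAGE` variable are hypothetical names, and falling back to 4.4.18 implies staying on a Graylog release that still accepts it.

```shell
#!/bin/sh
# pick_mongo_image [FILE]
# Prints mongo:5 when the cpuinfo FILE (default /proc/cpuinfo) lists the
# avx flag, otherwise prints the last image known to work here, mongo:4.4.18.
pick_mongo_image() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw 'avx'; then
        echo "mongo:5"
    else
        echo "mongo:4.4.18"
    fi
}

# Show which image this host would get.
MONGO_IMAGE="$(pick_mongo_image)"
echo "Selected image: $MONGO_IMAGE"
```

Assuming the compose file were changed to read `image: ${MONGO_IMAGE}`, exporting the variable before running `docker-compose up -d` would let the same file work on hosts with and without AVX.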
@bernd Here are my test GL server specs. Notice we do have "sse" but not AVX.
[root@graylog graylog]# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
microcode : 0xffffffff
cpu MHz : 2400.083
cache size : 12288 KB
physical id : 0
siblings : 6
core id : 0
cpu cores : 6
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nopl xtopology eagerfpu pni cx16 hypervisor lahf_lm ibrs ibpb spec_ctrl arch_capabilities
bogomips : 4800.16
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
I appreciate your reply, and thank you.