Uptime Kuma is very slow when deployed on NFS
sudoexec opened this issue · comments
I have found these related issues/pull requests
No related issues.
Security Policy
- I agree to have read this project Security Policy
Description
Deployed in k8s.
I have 54 monitors, all organized into groups.
The homepage often loads as a blank page with no monitors (they only appear after a long wait), and the settings page is very slow.
In the browser devtools, the WebSocket is sometimes pending and sometimes responds very slowly (for example, the getTags request).
Reproduction steps
At first, with only a few monitors, everything worked fine.
Now that I have 54, it no longer works well.
In my case, the problem appears as more and more monitors are added.
Expected behavior
When opening the homepage, show all monitors immediately.
Actual Behavior
WebSocket communication is very slow and sometimes fails.
Uptime-Kuma Version
1.23.11
Operating System and Arch
k8s
Browser
Chromium 123.0.6312.105/Firefox 124.0.2
Deployment Environment
- Runtime: k8s
- Database: sqlite
- Filesystem used to store the database on: nfs
- number of monitors: 54
Relevant log output
Monitor #23 'Group': Failing: Child inaccessible | Interval: 60 seconds | Type: group | Down Count: 0 | Resend Interval: 0
Trace: KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
at Client_SQLite3.acquireConnection (/app/node_modules/knex/lib/client.js:312:26)
at async Runner.ensureConnection (/app/node_modules/knex/lib/execution/runner.js:287:28)
at async Runner.run (/app/node_modules/knex/lib/execution/runner.js:30:19)
at async RedBeanNode.normalizeRaw (/app/node_modules/redbean-node/dist/redbean-node.js:572:22)
at async RedBeanNode.getRow (/app/node_modules/redbean-node/dist/redbean-node.js:558:22)
at async Monitor.calcUptime (/app/server/model/monitor.js:1255:22)
at async Monitor.sendUptime (/app/server/model/monitor.js:1321:24)
at async Monitor.sendStats (/app/server/model/monitor.js:1189:13) {
sql: '\n' +
' SELECT\n' +
' -- SUM all duration, also trim off the beat out of time window\n' +
' SUM(\n' +
' CASE\n' +
' WHEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400 < duration\n' +
' THEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400\n' +
' ELSE duration\n' +
' END\n' +
' ) AS total_duration,\n' +
'\n' +
' -- SUM all uptime duration, also trim off the beat out of time window\n' +
' SUM(\n' +
' CASE\n' +
' WHEN (status = 1 OR status = 3)\n' +
' THEN\n' +
' CASE\n' +
' WHEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400 < duration\n' +
' THEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400\n' +
' ELSE duration\n' +
' END\n' +
' END\n' +
' ) AS uptime_duration\n' +
' FROM heartbeat\n' +
' WHERE time > ?\n' +
' AND monitor_id = ?\n' +
' ',
bindings: [
'2024-03-08 09:49:42',
'2024-03-08 09:49:42',
'2024-03-08 09:49:42',
'2024-03-08 09:49:42',
'2024-03-08 09:49:42',
48
]
}
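For anyone trying to follow what the timed-out query actually computes, here is a small sketch that runs the same SQL against an in-memory SQLite database using Python's standard sqlite3 module. Only the query text is taken from the trace above; the table schema, the sample rows, and the helper names `make_sample_db` and `uptime_durations` are made up for illustration and are not Uptime Kuma's real schema or code.

```python
import sqlite3

# Query text taken from the trace above (whitespace condensed).
UPTIME_SQL = """
SELECT
    SUM(CASE WHEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400 < duration
             THEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400
             ELSE duration END) AS total_duration,
    SUM(CASE WHEN (status = 1 OR status = 3) THEN
             CASE WHEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400 < duration
                  THEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400
                  ELSE duration END
        END) AS uptime_duration
FROM heartbeat
WHERE time > ?
  AND monitor_id = ?
"""


def make_sample_db():
    """Build an in-memory database with a hypothetical, minimal heartbeat table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE heartbeat ("
        "monitor_id INTEGER, status INTEGER, time TEXT, duration INTEGER)"
    )
    rows = [
        # (monitor_id, status, time, duration in seconds)
        (48, 1, "2024-03-08 09:50:12", 90),  # up, reaches back before the window
        (48, 0, "2024-03-08 09:51:12", 60),  # down
        (48, 1, "2024-03-08 09:52:12", 60),  # up
        (1, 1, "2024-03-08 09:52:12", 60),   # other monitor, filtered out
    ]
    conn.executemany("INSERT INTO heartbeat VALUES (?, ?, ?, ?)", rows)
    return conn


def uptime_durations(conn, monitor_id, window_start):
    """Return (total_duration, uptime_duration) in seconds for one monitor."""
    # Same binding order as in the log: five timestamps, then the monitor id.
    bindings = [window_start] * 5 + [monitor_id]
    return conn.execute(UPTIME_SQL, bindings).fetchone()


if __name__ == "__main__":
    conn = make_sample_db()
    total, up = uptime_durations(conn, 48, "2024-03-08 09:49:42")
    print(f"total={total:.1f}s uptime={up:.1f}s ratio={up / total:.2f}")
```

With these sample rows the first beat is trimmed to the 30 seconds that fall inside the window, so the totals come out at roughly 150 seconds overall and 90 seconds up. The query has to visit every heartbeat row of the monitor inside the window each time stats are sent, which is why slow storage hurts so much here.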
Filesystem used to store the database on: nfs
Please refer to the wiki for why running a database on NFS is a bad idea, both for performance and because of the risk of database corruption.
This issue might also be related to the performance problems of V1 (having to read the entire table), which are resolved in the upcoming V2.0 release. Please see #4500
=> Have you checked your retention? What is the size of the database?
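One way to get both numbers is sketched below, using only Python's standard library. The table name `heartbeat` comes from the query in the log; the function name `heartbeat_stats` is hypothetical, and the path to the database file is whatever your deployment uses. Running this against a copy of the file, rather than the live database, is probably the safer choice.

```python
import os
import sqlite3


def heartbeat_stats(db_path):
    """Return (file size in bytes, number of rows in the heartbeat table)."""
    size_bytes = os.path.getsize(db_path)
    conn = sqlite3.connect(db_path)
    try:
        (row_count,) = conn.execute("SELECT COUNT(*) FROM heartbeat").fetchone()
    finally:
        conn.close()
    return size_bytes, row_count
```

Usage would look like `size, rows = heartbeat_stats("/path/to/copy-of-database.db")`, where the path is a placeholder.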
The database size is 228M, and there are 1531440 records in the heartbeat table.
Okay, so the cause is simply that NFS is unsuitable for running a database.
Please migrate to local storage instead as suggested in the installation guide.
Thanks for your help. I migrated the storage to hostPath, and it works now.
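For anyone landing here later and wanting to confirm which filesystem their data directory is actually on, here is a minimal sketch that looks a path up in the contents of `/proc/mounts` on Linux. The function name `fs_type_for` and the example paths are hypothetical, and the parsing is simplified: it assumes mount points without spaces (which `/proc/mounts` escapes as `\040`).

```python
import os


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the most specific mount containing path."""
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on ties the later entry wins,
            # because a later mount shadows an earlier one at the same path.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type
```

Inside the container this could be called as `fs_type_for("/app/data", open("/proc/mounts").read())`, with `/app/data` standing in for wherever the database lives. A result such as `nfs` or `nfs4` means the database is still on network storage.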