louislam / uptime-kuma

A fancy self-hosted monitoring tool

Home Page: https://uptime.kuma.pet

Uptime Kuma is very slow when deployed on NFS

sudoexec opened this issue · comments

📑 I have found these related issues/pull requests

No related issues.

🛡️ Security Policy

Description

Deployed in k8s.
I have 54 monitors, all organized into groups.
The homepage often returns a blank page without any monitors (the monitors appear only after a long wait), and the settings page is very slow.
In devtools, the WebSocket is sometimes pending and sometimes responds very slowly (the getTags request).

👟 Reproduction steps

At first there were only a few monitors, and it worked fine.
Now that I have 54, it no longer works well.
In my setup, the problem reproduces as more and more monitors are added.

👀 Expected behavior

When opening the homepage, show all monitors immediately.

😓 Actual Behavior

The WebSocket communication is very slow and sometimes fails.

🐻 Uptime-Kuma Version

1.23.11

💻 Operating System and Arch

k8s

🌐 Browser

Chromium 123.0.6312.105/Firefox 124.0.2

🖥️ Deployment Environment

  • Runtime: k8s
  • Database: sqlite
  • Filesystem used to store the database on: nfs
  • number of monitors: 54

📝 Relevant log output

Monitor #23 'Group': Failing: Child inaccessible | Interval: 60 seconds | Type: group | Down Count: 0 | Resend Interval: 0



Trace: KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
    at Client_SQLite3.acquireConnection (/app/node_modules/knex/lib/client.js:312:26)
    at async Runner.ensureConnection (/app/node_modules/knex/lib/execution/runner.js:287:28)
    at async Runner.run (/app/node_modules/knex/lib/execution/runner.js:30:19)
    at async RedBeanNode.normalizeRaw (/app/node_modules/redbean-node/dist/redbean-node.js:572:22)
    at async RedBeanNode.getRow (/app/node_modules/redbean-node/dist/redbean-node.js:558:22)
    at async Monitor.calcUptime (/app/server/model/monitor.js:1255:22)
    at async Monitor.sendUptime (/app/server/model/monitor.js:1321:24)
    at async Monitor.sendStats (/app/server/model/monitor.js:1189:13) {
  sql: '\n' +
    '            SELECT\n' +
    '               -- SUM all duration, also trim off the beat out of time window\n' +
    '                SUM(\n' +
    '                    CASE\n' +
    '                        WHEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400 < duration\n' +
    '                        THEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400\n' +
    '                        ELSE duration\n' +
    '                    END\n' +
    '                ) AS total_duration,\n' +
    '\n' +
    '               -- SUM all uptime duration, also trim off the beat out of time window\n' +
    '                SUM(\n' +
    '                    CASE\n' +
    '                        WHEN (status = 1 OR status = 3)\n' +
    '                        THEN\n' +
    '                            CASE\n' +
    '                                WHEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400 < duration\n' +
    '                                    THEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400\n' +
    '                                ELSE duration\n' +
    '                            END\n' +
    '                        END\n' +
    '                ) AS uptime_duration\n' +
    '            FROM heartbeat\n' +
    '            WHERE time > ?\n' +
    '            AND monitor_id = ?\n' +
    '        ',
  bindings: [
    '2024-03-08 09:49:42',
    '2024-03-08 09:49:42',
    '2024-03-08 09:49:42',
    '2024-03-08 09:49:42',
    '2024-03-08 09:49:42',
    48
  ]
}
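
For readers who want to see what the timed-out query computes, here is a minimal sketch that runs the same SQL from the trace against a throwaway in-memory SQLite database. The table layout, the sample rows, and the `calc_uptime` helper are assumptions made for illustration; only the SQL text and the binding order (five timestamps followed by the monitor id) come from the log above. How Uptime Kuma itself turns the two sums into a percentage is not shown in the trace, so the division below is also an assumption.

```python
import sqlite3

# SQL copied from the trace above (comments dropped, whitespace condensed).
UPTIME_SQL = """
SELECT
    SUM(CASE
            WHEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400 < duration
            THEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400
            ELSE duration
        END) AS total_duration,
    SUM(CASE
            WHEN (status = 1 OR status = 3)
            THEN CASE
                     WHEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400 < duration
                     THEN (JULIANDAY(`time`) - JULIANDAY(?)) * 86400
                     ELSE duration
                 END
        END) AS uptime_duration
FROM heartbeat
WHERE time > ?
AND monitor_id = ?
"""


def calc_uptime(conn, monitor_id, window_start):
    """Hypothetical helper: return uptime as a fraction between 0 and 1."""
    # Same binding order as in the log: five timestamps, then the monitor id.
    params = (window_start,) * 5 + (monitor_id,)
    total, up = conn.execute(UPTIME_SQL, params).fetchone()
    if not total:
        return 0.0
    return (up or 0) / total


# Assumed minimal schema; the real heartbeat table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE heartbeat ("
    "monitor_id INTEGER, status INTEGER, time TEXT, duration INTEGER)"
)
conn.executemany(
    "INSERT INTO heartbeat VALUES (?, ?, ?, ?)",
    [
        (48, 1, "2024-03-08 10:00:00", 60),  # up
        (48, 0, "2024-03-08 10:01:00", 60),  # down
        (48, 1, "2024-03-08 10:02:00", 60),  # up
        (48, 1, "2024-03-08 10:03:00", 60),  # up
    ],
)

uptime = calc_uptime(conn, 48, "2024-03-08 09:49:42")
print(f"uptime fraction: {uptime:.2f}")
```

The query has to visit every heartbeat row for the monitor inside the time window, so with many monitors sending stats at once the work adds up quickly when each page read goes over the network.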

Filesystem used to store the database on: nfs

Please refer to the wiki for why using NFS is not a good idea for a database, both to prevent database corruption and for performance reasons.

This issue might also be related to the performance problems of V1 (having to read the entire table), which are resolved in the upcoming V2.0 release. Please see #4500
=> Have you checked your retention? What is the size of the database?

The database size is 228M and there are 1531440 records in the heartbeat table.
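
The two figures above could be gathered with a short script along these lines. This is only a sketch: `db_stats` is a hypothetical helper, and the demo runs against a temporary database so it can be executed anywhere. For a real check you would pass the path of your own database file (commonly `/app/data/kuma.db` in the container, which is an assumption about your deployment).

```python
import os
import sqlite3
import tempfile


def db_stats(path):
    """Return (file size in bytes, number of rows in the heartbeat table)."""
    size = os.path.getsize(path)
    conn = sqlite3.connect(path)
    try:
        (rows,) = conn.execute("SELECT COUNT(*) FROM heartbeat").fetchone()
    finally:
        conn.close()
    return size, rows


# Demo on a throwaway database with an assumed, minimal heartbeat table.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(demo_path)
    conn.execute("CREATE TABLE heartbeat (monitor_id INTEGER, time TEXT)")
    conn.executemany(
        "INSERT INTO heartbeat VALUES (?, ?)",
        [(48, "2024-03-08 10:00:00")] * 3,
    )
    conn.commit()
    conn.close()

    demo_size, demo_rows = db_stats(demo_path)
    print(f"{demo_size / 1024:.1f} KiB, {demo_rows} heartbeat rows")
```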

Okay, so this is just NFS being unsuitable for running a database.
Please migrate to local storage instead as suggested in the installation guide.

Thanks for your help. I migrated the storage to hostPath, and it worked.