canonical / microceph

Ceph for a one-rack cluster and appliances

Home Page: https://snapcraft.io/microceph

second MDS fails to deploy

slapcat opened this issue:

Creating one filesystem works fine, but when a second is created, it remains in a degraded/offline state because an MDS fails to come up for it:

# ceph -s
  cluster:
    id:     05e0f755-cb2a-410e-b565-98793ae95192
    health: HEALTH_ERR
            1 filesystem is offline
            1 filesystem is online with fewer MDS than max_mds
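
For reference, a minimal reproduction along these lines might look like the following. The filesystem names are placeholders rather than taken from the issue; MicroCeph exposes the standard Ceph CLI, so ceph fs volume create is one way each filesystem could have been created:

# ceph fs volume create cephfs1    # first filesystem: an MDS comes up and it goes active
# ceph fs volume create cephfs2    # second filesystem: no MDS is assigned, it stays offline
# ceph fs status                   # lists cephfs2 without an active MDS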

After updating both filesystems to max_mds = 2, the status changes and the MDS count increases to 2, but one daemon is reported as failed:

  cluster:
    id:     05e0f755-cb2a-410e-b565-98793ae95192
    health: HEALTH_ERR
            1 filesystem is degraded
            1 filesystem has a failed mds daemon
            1 filesystem is offline
            2 filesystems are online with fewer MDS than max_mds
 
  services:
    mon: 1 daemons, quorum meatwad (age 6m)
    mgr: meatwad(active, since 6m)
    mds: 1/2 daemons up (1 failed)
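
For completeness, the max_mds change described above corresponds to the standard ceph fs set command, shown here with the same placeholder filesystem names:

# ceph fs set cephfs1 max_mds 2
# ceph fs set cephfs2 max_mds 2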

What is the correct way to increase the MDS count and/or use multiple filesystems with MicroCeph?

Hey @slapcat! It appears you only have 1 MicroCeph node in the Ceph cluster. Adding one more node would automatically spin up extra instances of the mon/mgr/mds trio.

Thanks for the explanation! I don't plan on adding more nodes at the moment, so I guess I can use one cephfs with subvolumes instead.

Feel free to close this issue if you don't need it.
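
For anyone taking the same route, a rough sketch of the single-filesystem workaround using the standard ceph fs subvolume interface (the volume, group, and subvolume names below are placeholders):

# ceph fs subvolumegroup create cephfs1 projects
# ceph fs subvolume create cephfs1 data1 --group_name projects
# ceph fs subvolume getpath cephfs1 data1 --group_name projects

The path printed by getpath is the directory inside the single CephFS that clients would mount in place of a separate filesystem.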

@slapcat you can also spin up an LXD container to add as another MicroCeph node. 😁
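
A rough outline of that approach, assuming a container named microceph2 and the Ubuntu 22.04 image (LXD containers may need extra configuration for snapd and Ceph device access, so treat this as a sketch rather than a tested recipe):

# lxc launch ubuntu:22.04 microceph2
# lxc exec microceph2 -- snap install microceph
# microceph cluster add microceph2                  (run on the existing node; prints a join token)
# lxc exec microceph2 -- microceph cluster join <token>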