pmem / ndctl

A "device memory" enabling project encompassing tools and libraries for CXL, NVDIMMs, DAX, memory tiering and other platform memory device topics.

fail to create devdax namespace

XuLiDown opened this issue · comments

Environment

  • kernel version: 5.1.0
  • cpu: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
  • device information:
    I have already created an fsdax namespace in each of region 0 and region 1, 128 GB apiece.
    [images]

Problem

I want to create an interleaved layout on a region, like this:
[image]

I want to use part of the capacity for devdax and the rest for fsdax. So I first created an fsdax namespace in each of region 0 and region 1 with:

ndctl create-namespace --mode=fsdax --size=128G --region=region0 --force 
ndctl create-namespace --mode=fsdax --size=128G --region=region1 --force 

After running ndctl list --namespaces --regions, the output is:

{
  "regions":[
    {
      "dev":"region1",
      "size":270582939648,
      "available_size":133143986176,
      "max_available_extent":133143986176,
      "type":"pmem",
      "iset_id":-3527463406870649788,
      "persistence_domain":"memory_controller",
      "namespaces":[
        {
          "dev":"namespace1.0",
          "mode":"fsdax",
          "map":"dev",
          "size":135289372672,
          "uuid":"cafa287e-f91d-43b6-8082-e7574527aaa8",
          "sector_size":512,
          "align":2097152,
          "blockdev":"pmem1"
        }
      ]
    },
    {
      "dev":"region0",
      "size":270582939648,
      "available_size":133143986176,
      "max_available_extent":133143986176,
      "type":"pmem",
      "iset_id":-6768929236689279932,
      "persistence_domain":"memory_controller",
      "namespaces":[
        {
          "dev":"namespace0.0",
          "mode":"fsdax",
          "map":"dev",
          "size":135289372672,
          "uuid":"2742c237-1d96-4084-9242-908a2817d815",
          "sector_size":512,
          "align":2097152,
          "blockdev":"pmem0"
        }
      ]
    }
  ]
}
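
As a sanity check on the numbers in this listing, the sketch below redoes the size arithmetic in plain shell. All byte values are copied from the listing. The one assumption is that ndctl reads the G suffix as GiB (2^30 bytes), which is how its man page describes size suffixes.

```shell
#!/bin/sh
# Byte counts copied from the region entries in the listing above
# (region0 and region1 report identical values).
region_size=270582939648
available_size=133143986176
namespace_size=135289372672

gib=$((1024 * 1024 * 1024))
mib=$((1024 * 1024))

# Capacity the region has handed out to the existing fsdax namespace.
consumed=$((region_size - available_size))
# Part of that capacity that is not usable namespace space
# ("map":"dev" keeps the page metadata on the device itself).
overhead=$((consumed - namespace_size))
# Size requested by --size=100G, assuming G means GiB.
requested=$((100 * gib))

echo "consumed by namespace: $((consumed / gib)) GiB"
echo "metadata overhead:     $((overhead / mib)) MiB"

if [ "$requested" -le "$available_size" ]; then
    fits=yes
else
    fits=no
fi
echo "100G request fits in available_size: $fits"
```

Going only by this listing, each region has handed out exactly 128 GiB and still reports more free space than a 100G request needs, so the later failures do not look like a plain out-of-capacity condition.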

Then I tried to create a devdax namespace, but it failed. The steps were:

[root@.....]# ndctl create-namespace --mode=devdax --size=100G --region=region0 --align=2m --force
libndctl: ndctl_dax_enable: dax0.0: failed to enable
Error: namespace0.1: failed to enable
failed to create namespace: No such device or address

[root@.....]# ndctl create-namespace --mode=devdax --size=100G --region=region1 --align=2m --force
Error: create namespace: namespace1.2: set_size failed: No such device or address

(By the way, I don't understand why the error message for region 0 differs from the one for region 1.)

Could anyone give me some suggestions for solving this problem? Thank you all!

Some/most of these related issues have been fixed in newer Kernel releases. Kernel 5.1 is quite old, so if you have an option to try a newer Linux distro or a newer Kernel, that will likely resolve the problem. Notably, the incorrect 'FreeCapacity' value for Region1 was fixed in a newer Kernel (IIRC around 5.4 or 5.5).

  • What do you see in dmesg when the failures occur?
  • What do you see in the debug logs? (add -v to the ndctl command)

Here is the message after adding -v to the ndctl command:

 [root@.....]# ndctl -v create-namespace --mode=devdax --size=100G --region=region0 --force
71.2.gea014c0

And here is what I see in dmesg when the failures occur:

[19853.414211] kauditd_printk_skb: 38 callbacks suppressed
[19853.414213] audit: type=1400 audit(1659968753.236:50): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/snapd/snap-confine" pid=1701 comm="apparmor_parser"
[19853.427361] audit: type=1400 audit(1659968753.252:51): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=1701 comm="apparmor_parser"
[19854.803994] audit: type=1400 audit(1659968754.628:52): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="snap.clion.clion" pid=1767 comm="apparmor_parser"
[19854.805064] audit: type=1400 audit(1659968754.628:53): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="snap-update-ns.clion" pid=1766 comm="apparmor_parser"
[19854.806298] audit: type=1400 audit(1659968754.628:54): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="/snap/snapd/15177/usr/lib/snapd/snap-confine" pid=1763 comm="apparmor_parser"
[19854.806304] audit: type=1400 audit(1659968754.628:55): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="/snap/snapd/15177/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=1763 comm="apparmor_parser"
[19854.806617] audit: type=1400 audit(1659968754.628:56): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="/snap/snapd/15534/usr/lib/snapd/snap-confine" pid=1764 comm="apparmor_parser"
[19854.806623] audit: type=1400 audit(1659968754.628:57): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="/snap/snapd/15534/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=1764 comm="apparmor_parser"
[19854.807543] audit: type=1400 audit(1659968754.632:58): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="/snap/snapd/16292/usr/lib/snapd/snap-confine" pid=1765 comm="apparmor_parser"
[19854.807548] audit: type=1400 audit(1659968754.632:59): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="/snap/snapd/16292/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=1765 comm="apparmor_parser"

It seems I haven't enabled the debug log, but I am new to ndctl and don't know how to enable it.

Thanks. I'd expect to see something in dmesg. You could look for the following to see if any errors are reported.

dmesg | grep -Ei "pmem|nvdimm|nd_|dax|namespace|nfit"
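
One thing to watch with this pattern: "namespace" also matches the AppArmor audit records posted earlier, because their profile names contain "mount-namespace-capture-helper". A minimal sketch of the same search with audit records dropped is below. The first sample line is copied from the dmesg output in this thread; the second is a made-up placeholder, not real kernel output, included only so that something survives the filter.

```shell
#!/bin/sh
# Line 1: copied from the dmesg output posted earlier in this thread.
# Line 2: a made-up placeholder, NOT real kernel output.
sample='[19853.427361] audit: type=1400 audit(1659968753.252:51): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=1701 comm="apparmor_parser"
[    0.000000] EXAMPLE nd_pmem placeholder line'

# Same pattern as the command above, followed by a second grep
# that removes audit records.
filtered=$(printf '%s\n' "$sample" |
    grep -Ei 'pmem|nvdimm|nd_|dax|namespace|nfit' |
    grep -v ' audit: ')

printf '%s\n' "$filtered"
```

On a live system the equivalent would be dmesg piped through the same two grep commands.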

Regarding ndctl debug logs, you'd need to build ndctl with debugging enabled using the source code. It's not too difficult. See Installing NDCTL, DAXCTL, and CXL-CLI from Source on Linux.

Personally, I'd look towards updating the OS and Kernel since that's likely what you'll need to do anyway when root cause is identified.

I have updated the kernel to 5.19, but the issue still occurs and the 'FreeCapacity' value for the region is still incorrect.

root@ubuntu:/# uname -r
5.19.0

So I want to reinstall ndctl, but other issues come up. The steps were:

root@ubuntu:/usr/src# sudo apt install -y git gcc g++ autoconf automake asciidoc asciidoctor bash-completion xmlto libtool pkg-config libglib2.0-0 libglib2.0-dev libfabric1 libfabric-dev doxygen graphviz pandoc libncurses5 libkmod2 libkmod-dev libudev-dev uuid-dev libjson-c-dev libkeyutils-dev libiniparser libiniparser-dev bc meson
Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package libiniparser

root@ubuntu:/usr/src# mkdir ndctl
root@ubuntu:/usr/src# cd ndctl/

root@ubuntu:/usr/src/ndctl# git clone https://github.com/pmem/ndctl
Cloning into 'ndctl'...
remote: Enumerating objects: 9852, done.
remote: Counting objects: 100% (1780/1780), done.
remote: Compressing objects: 100% (601/601), done.
remote: Total 9852 (delta 1214), reused 1692 (delta 1176), pack-reused 8072
Receiving objects: 100% (9852/9852), 3.20 MiB | 89.00 KiB/s, done.
Resolving deltas: 100% (7179/7179), done.

root@ubuntu:/usr/src/ndctl# cd ndctl/

root@ubuntu:/usr/src/ndctl/ndctl# meson setup build buildtype[debug]
Error during basic setup:
Neither directory contains a build file meson.build.

root@ubuntu:/usr/src/ndctl/ndctl# meson setup buildtype[debug]
The Meson build system
Version: 0.45.1
Source dir: /usr/src/ndctl/ndctl
Build dir: /usr/src/ndctl/ndctl/buildtype[debug]
Build type: native build
meson_options.txt:3:0: ERROR: Unknown type feature.

How can I fix this? Could you offer me some suggestions?
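
On the first error: meson setup takes option values in the form --buildtype=debug (or -Dbuildtype=debug), so buildtype[debug] was read as a directory name. That is why the first command complained that neither of two directories had a meson.build, and why the second created a build directory literally named buildtype[debug]. A minimal sketch of the intended invocation follows, assuming it is run from the top of the cloned source tree and that build is the desired build directory name.

```shell
#!/bin/sh
# Intended command: build directory "build", debug build type.
setup_cmd='meson setup build --buildtype=debug'

# Only run it where it can work: meson is installed and the current
# directory contains a meson.build file.
if command -v meson >/dev/null 2>&1 && [ -f meson.build ]; then
    $setup_cmd
else
    echo "not in a meson source tree; would run: $setup_cmd"
fi
```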

What Ubuntu release are you using? The libiniparser package is available in Ubuntu 18.04 or later from the 'Universe' package repo. If you have the option of going to Ubuntu 20.04 or later, you should end up with Kernel 5.8 (or newer) which should resolve your current problems.

I was using Ubuntu 16.04, but after updating to Ubuntu 20.04 the libiniparser package is still unavailable. I found a libiniparser1 package on https://packages.ubuntu.com/, but there is no libiniparser package.

The "meson_options.txt:3:0: ERROR: Unknown type feature" error has been fixed by updating meson.
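
The "Unknown type feature" error points at the feature option type used in meson_options.txt, which Meson 0.45.1 does not know. The sketch below compares a version string against a minimum using sort -V. The 0.47.0 threshold is an assumption about when feature options were introduced, not something stated in this thread; the meson_version requirement in the project's meson.build is the authoritative value. The helper name version_ge is made up for this sketch.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to
# version B. Relies on sort -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

have="0.45.1"   # from the log above; on a real system: have=$(meson --version)
need="0.47.0"   # assumption: first release with 'feature' options

if version_ge "$have" "$need"; then
    result="new enough"
else
    result="too old"
fi
echo "meson $have is $result for 'feature' options (needs >= $need)"
```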

root@xulidang:~# cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.3 LTS"

root@xulidang:~# sudo apt install -y git gcc g++ autoconf automake asciidoc asciidoctor bash-completion xmlto libtool pkg-config libglib2.0-0 libglib2.0-dev libfabric1 libfabric-dev doxygen graphviz pandoc libncurses5 libkmod2 libkmod-dev libudev-dev uuid-dev libjson-c-dev libkeyutils-dev libiniparser libiniparser-dev bc meson
Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package libiniparser

Thanks for the update. They must have changed the package name after I wrote the docs. I'll update the docs accordingly. Looks like they kept the libiniparser-dev and libiniparser-doc names though.

Q) With the updated OS and Kernel, is the original problem still present? Make sure you're using the latest ndctl and ipmctl packages available for that distro.

When I ran the command sudo meson install -C build, a new problem occurred:

root@xulidang:/usr/src/ndctl# sudo meson install -C build
ninja: Entering directory `build'
[1/194] Compiling C object 'util/9342af2@@util@sta/parse-configs.c.o'.
FAILED: util/9342af2@@util@sta/parse-configs.c.o 
cc -Iutil/9342af2@@util@sta -Iutil -I../util -I. -I../ -Indctl -I../ndctl -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -std=gnu99 -g -Wall -Wchar-subscripts -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wshadow -Wsign-compare -Wstrict-prototypes -Wtype-limits -Wmaybe-uninitialized -Wdeclaration-after-statement -Wunused-result -D_FORTIFY_SOURCE=2 -O2 -include config.h -fPIC -MD -MQ 'util/9342af2@@util@sta/parse-configs.c.o' -MF 'util/9342af2@@util@sta/parse-configs.c.o.d' -o 'util/9342af2@@util@sta/parse-configs.c.o' -c ../util/parse-configs.c
../util/parse-configs.c:7:10: fatal error: iniparser.h: No such file or directory
    7 | #include <iniparser.h>
      |          ^~~~~~~~~~~~~
compilation terminated.
[10/194] Generating asciidoctor-extensions.rb with a meson_exe.py custom command.
ninja: build stopped: subcommand failed.
Could not rebuild build

It seems that libiniparser1 is different from libiniparser.
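
The compile error itself only says that iniparser.h is not on the compiler's default include path. A quick way to see whether the header is installed at all, and where, is to search for it. The helper name locate_header is made up for this sketch.

```shell
#!/bin/sh
# locate_header NAME [DIR]: print every path under DIR (default
# /usr/include) whose file name is NAME. Prints nothing if there is
# no match or DIR does not exist.
locate_header() {
    find "${2:-/usr/include}" -name "$1" -print 2>/dev/null || true
}

locate_header iniparser.h
```

If the header turns up in a subdirectory rather than directly in /usr/include, #include <iniparser.h> will not find it without an extra include path; it is worth checking meson_options.txt in the checkout for an option that points the build at that directory. If nothing is printed, the -dev package is probably not installed.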

I ran into a similar problem. It was solved after I upgraded to kernel version 5.15.