altlinux / linux-arm

Linux kernel support patches for Russian ARM SoCs (BE-M1000, MCom-03, etc.)

Second NVMe disk is not initialized in a 2 x M.2 disk configuration

int3264 opened this issue

On a Baikal-M board with two M.2 PCIe disks, the second disk is not initialized.
Linux version 5.10.82-std-def-alt1 (builder@localhost.localdomain) (gcc-10 (GCC) 10.3.1 20210703 (ALT Sisyphus 10.3.1-alt2), GNU ld (GNU Binutils) 2.35.2.20210110) #1 SMP Fri Dec 3 14:50:06 UTC 2021
Here is the dmesg output for the case:

[    1.763823] ------------[ cut here ]------------
[    1.763886] WARNING: CPU: 0 PID: 134 at kernel/irq/manage.c:2036 request_threaded_irq+0x160/0x1b0
[    1.763961] Modules linked in:
[    1.763998] CPU: 0 PID: 134 Comm: kworker/u16:2 Not tainted 5.10.82-std-def-alt1 #1
[    1.764063] Hardware name: Delta Computers Bober/Rhodeola, BIOS 5.3 01/11/2022
[    1.764079] arm-ccn 9000000.ccn: No access to interrupts, using timer.
[    1.764137] Workqueue: nvme-reset-wq nvme_reset_work
[    1.764230] pstate: a0400005 (NzCv daif +PAN -UAO -TCO BTYPE=--)
[    1.764283] pc : request_threaded_irq+0x160/0x1b0
[    1.764327] lr : request_threaded_irq+0x80/0x1b0
[    1.764368] sp : ffff800012d1bb60
[    1.764400] x29: ffff800012d1bb60 x28: 0000000000000000 
[    1.764451] x27: ffff000800ed0500 x26: 0000000000000000 
[    1.764501] x25: 0000000000000000 x24: 0000000000000002 
[    1.764551] x23: ffff0008031b6800 x22: ffff80001093b8b0 
[    1.764601] x21: ffff000800107400 x20: 0000000000000080 
[    1.764707] NET: Registered protocol family 10
[    1.766975] x19: ffff80001093b8b0 x18: 0000000000000020 
[    1.766981] x17: 0000000000000001 x16: 0000000000000019 
[    1.766986] x15: ffffffffffffffff x14: ffff000800ed0508 
[    1.767821] nvme nvme0: 1/0/0 default/read/poll queues
[    1.769998] Segment Routing with IPv6

[    1.771707] x13: ffffffffffffffff x12: 0000000000000040 
[    1.771712] x11: ffff000800400240 x10: ffff000800400242 
[    1.771721] x9 : ffff800011d88410 
[    1.774063] RPL Segment Routing with IPv6
[    1.776351] x8 : ffff000800400268 
[    1.779041] registered taskstats version 1
[    1.780901] x7 : 0000000000000000 x6 : ffff000800400270 
[    1.780906] x5 : ffff000800400240 x4 : ffff000800400278 
[    1.780910] x3 : 0000000000000000 x2 : 0000000000000000 
[    1.783190] Loading compiled-in X.509 certificates

[    1.785446] x1 : 0000000000000002 x0 : 0000000000131600 
[    1.789262] Loaded X.509 cert 'Build time autogenerated kernel key: 4e78bc91b859ec08082639ec20b1616089bf8910'

[    1.789999] Call trace:
[    1.790009]  request_threaded_irq+0x160/0x1b0
[    1.796434] zswap: loaded using pool zstd/zbud
[    1.796732]  pci_request_irq+0xc0/0x110
[    1.796744]  queue_request_irq+0x78/0x8c
[    1.799392] Key type ._fscrypt registered
[    1.801240]  nvme_reset_work+0x488/0x1580
[    1.801247]  process_one_work+0x1e4/0x4ac
[    1.801250]  worker_thread+0x170/0x524
[    1.801255]  kthread+0x130/0x13c
[    1.801265]  ret_from_fork+0x10/0x38
[    1.803524] Key type .fscrypt registered
[    1.805733] ---[ end trace fed6bc90951fb949 ]---
[    1.805798] nvme nvme1: Removing after probe failure status: -22
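
For reference, the WARN at kernel/irq/manage.c:2036 is raised from request_threaded_irq() itself (reached here via nvme_reset_work -> queue_request_irq -> pci_request_irq), and the early sanity checks in that function all fail with -EINVAL, which matches the "-22" in the "Removing after probe failure" line above. Below is an abridged excerpt of those checks as they appear in mainline v5.10; the ALT/Baikal-patched kernel may differ, so exactly which check corresponds to line 2036 there is an assumption.

	int request_threaded_irq(unsigned int irq, irq_handler_t handler,
				 irq_handler_t thread_fn, unsigned long irqflags,
				 const char *devname, void *dev_id)
	{
		struct irq_desc *desc;
		...
		/* Shared IRQs must pass a real dev_id; IRQF_COND_SUSPEND only makes
		 * sense for shared IRQs and cannot be combined with IRQF_NO_SUSPEND. */
		if (((irqflags & IRQF_SHARED) && !dev_id) ||
		    (!(irqflags & IRQF_SHARED) && (irqflags & IRQF_COND_SUSPEND)) ||
		    ((irqflags & IRQF_NO_SUSPEND) && (irqflags & IRQF_COND_SUSPEND)))
			return -EINVAL;

		/* The virq handed in by pci_request_irq() (here: the NVMe MSI vector)
		 * must resolve to a valid, requestable, non-per-CPU descriptor. */
		desc = irq_to_desc(irq);
		if (!desc)
			return -EINVAL;

		if (!irq_settings_can_request(desc) ||
		    WARN_ON(irq_settings_is_per_cpu_devid(desc)))
			return -EINVAL;
		...
	}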

The following dts configuration for PCIe was used:

pcie0: pcie@2200000 { /* PCIe x4 #0 */
			compatible = "baikal,pcie-m", "snps,dw-pcie";
			reg = <0x0 0x02200000 0x0 0x1000>,   /* RC config space */
			      <0x0 0x40100000 0x0 0x100000>; /* PCI config space */
			reg-names = "dbi", "config";
			interrupts = <GIC_SPI 426 IRQ_TYPE_LEVEL_HIGH>, /* AER */
				     <GIC_SPI 429 IRQ_TYPE_LEVEL_HIGH>; /* MSI */
			#interrupt-cells = <1>;
			baikal,pcie-lcru = <&pcie_lcru 0>;
			#address-cells = <3>;
			#size-cells = <2>;
			device_type = "pci";
			ranges = <0x81000000 0x0 0x00000000 0x0 0x40200000 0x0 0x100000>,   /* I/O */
				 <0x82000000 0x0 0x40000000 0x4 0x00000000 0x0 0x40000000>; /* 32b non-prefetchable memory */
			msi-parent = <&its 0x0>;
			msi-map = <0x0 &its 0x0 0x10000>;
			num-lanes = <4>;
			num-viewport = <4>;
			bus-range = <0x0 0xff>;
			status = "disabled";
		};

		pcie1: pcie@2210000 { /* PCIe x4 #1 */
			compatible = "baikal,pcie-m", "snps,dw-pcie";
			reg = <0x0 0x02210000 0x0 0x1000>,   /* RC config space */
			      <0x0 0x50100000 0x0 0x100000>; /* PCI config space */
			reg-names = "dbi", "config";
			interrupts = <GIC_SPI 402 IRQ_TYPE_LEVEL_HIGH>,	/* AER */
				     <GIC_SPI 405 IRQ_TYPE_LEVEL_HIGH>;	/* MSI */
			#interrupt-cells = <1>;
			baikal,pcie-lcru = <&pcie_lcru 1>;
			#address-cells = <3>;
			#size-cells = <2>;
			device_type = "pci";
			ranges = <0x81000000 0x0 0x00100000 0x0 0x50200000 0x0 0x100000>,   /* I/O */
				 <0x82000000 0x0 0x40000000 0x5 0x00000000 0x0 0x40000000>; /* 32b non-prefetchable memory */
			msi-parent = <&its 0x0>;
			msi-map = <0x0 &its 0x0 0x10000>;
			num-lanes = <4>;
			num-viewport = <4>;
			bus-range = <0x0 0xff>;
			status = "disabled";
		};

		pcie2: pcie@2220000 { /* PCIe x8 */
			compatible = "baikal,pcie-m", "snps,dw-pcie";
			reg = <0x0 0x02220000 0x0 0x1000>,   /* RC config space */
			      <0x0 0x60000000 0x0 0x100000>; /* PCI config space */
			reg-names = "dbi", "config";
			interrupts = <GIC_SPI 378 IRQ_TYPE_LEVEL_HIGH>, /* AER */
				     <GIC_SPI 381 IRQ_TYPE_LEVEL_HIGH>; /* MSI */
			#interrupt-cells = <1>;
			baikal,pcie-lcru = <&pcie_lcru 2>;
			#address-cells = <3>;
			#size-cells = <2>;
			device_type = "pci";
			ranges = <0x81000000 0x0 0x00200000 0x0 0x60100000 0x0 0x100000>,   /* I/O */
				 <0x82000000 0x0 0x80000000 0x6 0x00000000 0x0 0x80000000>; /* 32b non-prefetchable memory */
			msi-parent = <&its 0x0>;
			msi-map = <0x0 &its 0x0 0x10000>;
			num-lanes = <8>;
			num-viewport = <4>;
			bus-range = <0x0 0xff>;
			status = "disabled";
		};

The M.2 disks use pcie0 and pcie1 (both x4).
When only one M.2 disk is installed, it works in either PCIe slot (pcie0 or pcie1), regardless of the M.2 disk vendor.
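
A hedged observation on the snippet above, not a conclusion from this thread: all three root complexes carry the identical msi-map = <0x0 &its 0x0 0x10000>. In the generic PCI msi-map binding each entry is <rid-base msi-controller msi-base length>, so this maps requester IDs 0x0-0xFFFF of every controller onto the same ITS device-ID window starting at 0; whether that matches the Baikal-M ITS stream-ID wiring is SoC-specific and cannot be confirmed from this thread. Purely to illustrate the binding format, per-controller windows would look like the sketch below (the 0x10000 and 0x20000 bases are hypothetical, not taken from any Baikal documentation):

		/* msi-map = <rid-base  msi-parent  msi-base  length>; */
		&pcie0 { msi-map = <0x0 &its 0x00000 0x10000>; };
		&pcie1 { msi-map = <0x0 &its 0x10000 0x10000>; };
		&pcie2 { msi-map = <0x0 &its 0x20000 0x10000>; };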

BR, Ilya.

Hardware name: Delta Computers Bober/Rhodeola, BIOS 5.3 01/11/2022

At the moment, baikalm kernels support only the TF307 board.
The TF307 has a single M.2 slot, so a configuration with two or more NVMe drives has never been tested.

The following dts configuration for PCIe was used:

I have no idea whether that describes the hardware correctly. For one, I'm not a hardware engineer,
and I have neither the datasheets nor the board available.

Hello Alexey,

Hardware name: Delta Computers Bober/Rhodeola, BIOS 5.3 01/11/2022

At the moment, baikalm kernels support only the TF307 board. The TF307 has a single M.2 slot, so a configuration with two or more NVMe drives has never been tested.

Could you please clarify whether it is the kernel or the baikalm driver that is incompatible with two M.2 PCIe devices? From the logs it looks like a kernel error.
Moreover, the EDK2 UEFI NVMe driver (src/uefi/MdeModulePkg/Bus/Pci/NvmExpressDxe) correctly detects both NVMe M.2 PCIe devices. See the log below:

NvmExpressDriverBindingStart: start
Cc.En: 0
Cc.Css: 0
Cc.Mps: 0
Cc.Ams: 0
Cc.Shn: 0
Cc.Iosqes: 0
Cc.Iocqes: 0
NVMe controller is disabled with status [Success].
Private->Buffer = [00000000F9CC6000]
Admin     Submission Queue size (Aqa.Asqs) = [00000001]
Admin     Completion Queue size (Aqa.Acqs) = [00000001]
Admin     Submission Queue (SqBuffer[0]) = [00000000F9CC6000]
Admin     Completion Queue (CqBuffer[0]) = [00000000F9CC7000]
Sync  I/O Submission Queue (SqBuffer[1]) = [00000000F9CC8000]
Sync  I/O Completion Queue (CqBuffer[1]) = [00000000F9CC9000]
Async I/O Submission Queue (SqBuffer[2]) = [00000000F9CCA000]
Async I/O Completion Queue (CqBuffer[2]) = [00000000F9CCB000]
Aqa.Asqs: 1
Aqa.Acqs: 1
Asq: F9CC6000
Acq: F9CC7000h
Cc.En: 1
Cc.Css: 0
Cc.Mps: 0
Cc.Ams: 0
Cc.Shn: 0
Cc.Iosqes: 6
Cc.Iocqes: 4
NVMe controller is enabled with status [Success].
 == NVME IDENTIFY CONTROLLER DATA ==
    PCI VID   : 0x8086
    PCI SSVID : 0x8086
    SN        : BTPY7425080F256D    
    MN        : INTEL SSDPEKKA256G7                     
    FR        : 0x46535020
    RAB       : 0x6
    IEEE      : 0x5CD2E4
    AERL      : 0x7
    SQES      : 0x66
    CQES      : 0x44
    NN        : 0x1
 == NVME IDENTIFY NAMESPACE [1] DATA ==
    NSZE        : 0x1DCF32B0
    NCAP        : 0x1DCF32B0
    NUSE        : 0x1DCF32B0
    LBAF0.LBADS : 0x9
NvmExpressDriverBindingStart: end successfully
 BlockSize : 512 
 LastBlock : 1DCF32AF 
 Valid efi partition table header
 Valid efi partition table header
 Valid primary and Valid backup partition table
 Partition entries read block success
 Number of partition entries: 128
 start check partition entries
 End check partition entries
 Index : 0
 Start LBA : 800
 End LBA : 1007FF
 Partition size: 100000
 Start : 100000 End : 200FFE00
 Index : 1
 Start LBA : 100800
 End LBA : 1DCF2FFF
 Partition size: 1DBF2800
 Start : 20100000 End : 3B9E5FFE00
Prepare to Free Pool
 BlockSize : 512 
 LastBlock : FFFFF 
Installed Fat filesystem on F9CC2398
 BlockSize : 512 
 LastBlock : 1DBF27FF 
NvmExpressDriverBindingStart: start
Cc.En: 0
Cc.Css: 0
Cc.Mps: 0
Cc.Ams: 0
Cc.Shn: 0
Cc.Iosqes: 0
Cc.Iocqes: 0
NVMe controller is disabled with status [Success].
Private->Buffer = [00000000F981A000]
Admin     Submission Queue size (Aqa.Asqs) = [00000001]
Admin     Completion Queue size (Aqa.Acqs) = [00000001]
Admin     Submission Queue (SqBuffer[0]) = [00000000F981A000]
Admin     Completion Queue (CqBuffer[0]) = [00000000F981B000]
Sync  I/O Submission Queue (SqBuffer[1]) = [00000000F981C000]
Sync  I/O Completion Queue (CqBuffer[1]) = [00000000F981D000]
Async I/O Submission Queue (SqBuffer[2]) = [00000000F981E000]
Async I/O Completion Queue (CqBuffer[2]) = [00000000F981F000]
Aqa.Asqs: 1
Aqa.Acqs: 1
Asq: F981A000
Acq: F981B000h
Cc.En: 1
Cc.Css: 0
Cc.Mps: 0
Cc.Ams: 0
Cc.Shn: 0
Cc.Iosqes: 6
Cc.Iocqes: 4
NVMe controller is enabled with status [Success].
 == NVME IDENTIFY CONTROLLER DATA ==
    PCI VID   : 0x8086
    PCI SSVID : 0x8086
    SN        : BTPY742509BD256D    
    MN        : INTEL SSDPEKKA256G7                     
    FR        : 0x46535020
    RAB       : 0x6
    IEEE      : 0x5CD2E4
    AERL      : 0x7
    SQES      : 0x66
    CQES      : 0x44
    NN        : 0x1
 == NVME IDENTIFY NAMESPACE [1] DATA ==
    NSZE        : 0x1DCF32B0
    NCAP        : 0x1DCF32B0
    NUSE        : 0x1DCF32B0
    LBAF0.LBADS : 0x9
NvmExpressDriverBindingStart: end successfully
 BlockSize : 512 
 LastBlock : 1DCF32AF 
 BlockSize : 512 
 LastBlock : FFFFF 
Installed Fat filesystem on F9CB2518
 BlockSize : 512 
 LastBlock : 1DBF2001 
 BlockSize : 512 
 LastBlock : 1DBF1FFF 
NvmExpressDriverBindingStart: start
NvmExpressDriverBindingStart: end successfully

The following dts configuration for PCIe was used:

I have no idea whether that describes the hardware correctly. For one, I'm not a hardware engineer, and I have neither the datasheets nor the board available.

This is the general configuration from Baikal SDK 5.3. The fact that an M.2 drive works in either M.2 slot when only one is installed suggests that the issue is on the software/OS side.
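
One way to narrow this down further on the Linux side (a hedged suggestion, assuming pciutils is present on the image; these commands were not run as part of this report) is to check whether both endpoints are enumerated and whether MSI vectors were actually allocated for each controller:

		lspci -nn                      # are both NVMe endpoints visible behind their root ports?
		ls /sys/class/nvme             # which controllers the kernel bound (nvme0, nvme1, ...)
		grep -i nvme /proc/interrupts  # whether MSI/ITS vectors were allocated per controller
		dmesg | grep -iE 'pcie|nvme'   # probe order and any PCIe link or MSI errors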

BR, Ilya Smirnov.