seL4 boot fails on riscv64 when higher address values are used

Question

vqs0214 opened this issue 8 months ago · comments

A simple 'Hello World' app runs on a riscv64 system when the memory node in the dts file uses a low base address, e.g. 0x8000_0000. However, seL4 crashes with a cap fault during boot if the memory is located at a higher address, e.g. 0x18_0000_0000.

Memory node in dts for working use case:

        L141: memory@80000000 {
                compatible = "sifive,axi4-mem-port", "sifive,axi4-port", "sifive,mem-port";
                device_type = "memory";
                reg = <0x00000000 0x80000000 0x00000000 0x00400000>;
                sifive,port-width-bytes = <16>;
        };

Memory node in dts for failing use case:

        L141: memory@1800000000 {
                compatible = "sifive,axi4-mem-port", "sifive,axi4-port", "sifive,mem-port";
                device_type = "memory";
                reg = <0x00000018 0x00000000 0x00000000 0x00400000>;
                sifive,port-width-bytes = <16>;
        };
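
For reference, the two `reg` properties above can be decoded by hand. The sketch below assumes `#address-cells = <2>` and `#size-cells = <2>` in the parent node (not shown in the excerpt, but consistent with the four-cell layout). `decode_reg` is a hypothetical helper for illustration, not part of any device tree tooling.

```python
def decode_reg(cells, address_cells=2, size_cells=2):
    """Combine 32-bit device tree cells into a (base, size) pair.

    Handles a single (address, size) entry. The cell counts are an
    assumption, since the parent node is not shown in the issue.
    """
    if len(cells) != address_cells + size_cells:
        raise ValueError("unexpected number of cells")

    def join(parts):
        # Cells are big-endian: the first cell holds the upper 32 bits.
        value = 0
        for part in parts:
            value = (value << 32) | (part & 0xFFFFFFFF)
        return value

    base = join(cells[:address_cells])
    size = join(cells[address_cells:])
    return base, size


working = decode_reg([0x00000000, 0x80000000, 0x00000000, 0x00400000])
failing = decode_reg([0x00000018, 0x00000000, 0x00000000, 0x00400000])

print(hex(working[0]), hex(working[1]))  # 0x80000000 0x400000
print(hex(failing[0]), hex(failing[1]))  # 0x1800000000 0x400000
```

Both nodes describe 4 MiB of RAM; only the base address differs.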

UART output for the failing use case:

paddr=[1800000000..1800059027]
Looking for DTB in CPIO archive...found at 180001d728.
Loaded DTB from 180001d728.
   paddr=[180021f000..1800224fff]
ELF-loading image 'kernel' to 1800200000
  paddr=[1800200000..180021efff]
  vaddr=[ffffffff80200000..ffffffff8021efff]
  virt_entry=ffffffff80200000
ELF-loading image 'hello_world' to 1800225000
  paddr=[1800225000..180035efff]
  vaddr=[10000..149fff]
  virt_entry=1032e
Kernel location: ffffffff80200000
Branching to kernel
Init local IRQ
Bootstrapping kernel
Initialing PLIC...
Booting all finished, dropped to user space
Caught cap fault in send phase at address 0
while trying to handle:
vm fault on code at address 0x1032e with status 0x1at address 0x1032e
With stack:
0x0: INVALID
0x8: INVALID
0x10: INVALID
0x18: INVALID
0x20: INVALID
0x28: INVALID
0x30: INVALID
0x38: INVALID
0x40: INVALID
0x48: INVALID
0x50: INVALID
0x58: INVALID
0x60: INVALID
0x68: INVALID
0x70: INVALID
0x78: INVALID

Ivan Velickovic · Answer 1 · Thu Nov 09 2023 18:37:37 GMT+0800 (China Standard Time)

How do you know it is happening in seL4 and not in the initial task? Can you point to a line of code you believe is failing? (although it seems unlikely for the hello world to depend on where it is being placed...)

Ivan Velickovic · Answer 2 · Thu Nov 09 2023 18:38:32 GMT+0800 (China Standard Time)

Or is it as soon as the initial task starts, a cap fault occurs?

vqs0214 · Answer 3 · Thu Nov 09 2023 18:50:56 GMT+0800 (China Standard Time)

Can you elaborate on what you mean by initial task? Do you mean the application code? I traced execution until it runs schedule() and Arch_switchToThread(). However, I still haven't managed to trace it to the exact code location that causes the cap fault. I know for a fact that it does not start running the application code, as I put an infinite while loop at the start of it. Hence, I figured it must have faulted somewhere in the seL4 code.

To repeat my previous statement, this works if I use a memory address < 0x1_0000_0000.

Ivan Velickovic · Answer 4 · Thu Nov 09 2023 19:26:33 GMT+0800 (China Standard Time)

By initial task I just meant the first user-space thread that runs; seL4 typically refers to the first application/user code as the "root task" or "initial task".

I notice that 0x1_0000_0000 is 2^32, so perhaps there's some integer overflow happening (given the boot code is not verified). It may be that there's not a single line causing the cap fault, but that the initial task is not being set up properly by seL4, which means that the initial task doesn't execute successfully.
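
To make the overflow hypothesis concrete, here is a minimal sketch of what happens if a 64-bit physical address is ever stored in a 32-bit variable. It only illustrates the suspicion; whether any such truncation actually happens in the boot code is not established here. `truncate_to_u32` is a hypothetical name.

```python
MASK32 = 0xFFFFFFFF


def truncate_to_u32(paddr):
    """Model storing a 64-bit physical address in a 32-bit variable."""
    return paddr & MASK32


for paddr in (0x8000_0000, 0x1_0000_0000, 0x18_0000_0000):
    print(hex(paddr), "->", hex(truncate_to_u32(paddr)))
# 0x80000000 -> 0x80000000
# 0x100000000 -> 0x0
# 0x1800000000 -> 0x0
```

Addresses below 2^32 survive unchanged, which would match the observation that only bases at or above 0x1_0000_0000 fail.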

Indan Zupancic · Answer 5 · Thu Nov 09 2023 20:01:44 GMT+0800 (China Standard Time)

You could disassemble address 0x1032e in hello_world and see what it does there.

vqs0214 · Answer 6 · Thu Nov 09 2023 20:04:55 GMT+0800 (China Standard Time)

> You could disassemble address 0x1032e in hello_world and see what it does there.

Yes I did. It is the start of the application and runs the function prologue. The first instruction subtracts an offset from sp.

Indan Zupancic · Answer 7 · Thu Nov 09 2023 20:39:20 GMT+0800 (China Standard Time)

> Yes I did. It is the start of the application and runs the function prologue. The first instruction subtracts an offset from sp.

My apologies, I should have spotted virt_entry=1032e before asking.

The cap fault is because the root task has no fault handler and should be ignored. The original fault was the vm one. The very first instruction executed faults, which seems to imply that hello_world wasn't mapped properly. This could be a bug in seL4's bootup code.

Does increasing KernelPTLevels in your kernel configuration help? (Assuming you can change it, it seems a bit unclear.)

vqs0214 · Answer 8 · Thu Nov 09 2023 20:44:50 GMT+0800 (China Standard Time)

>> Yes I did. It is the start of the application and runs the function prologue. The first instruction subtracts an offset from sp.
>
> My apologies, I should have spotted virt_entry=1032e before asking.
>
> The cap fault is because the root task has no fault handler and should be ignored. The original fault was the vm one. The very first instruction executed faults, which seems to imply that hello_world wasn't mapped properly. This could be a bug in seL4's bootup code.
>
> Does increasing KernelPTLevels in your kernel configuration help? (Assuming you can change it, it seems a bit unclear.)

I can try increasing KernelPTLevels, but I doubt I can do it directly. I think it will automatically increase if I change the MMU setting from sv39 to sv48. From what I understand, seL4 does not yet support sv48 either.

Indan Zupancic · Answer 9 · Thu Nov 09 2023 20:54:25 GMT+0800 (China Standard Time)

Could you double check that the current value is 3? Unlikely it isn't and 39 bits should be enough, but I didn't find anything else suspicious in the code.
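
As a sanity check on the "39 bits should be enough" remark, the arithmetic for RV64 paging can be sketched as below. It assumes KernelPTLevels counts page-table levels, as its name suggests; the constants come from the RISC-V privileged specification (4 KiB pages, 9 index bits per level, 44-bit physical page numbers in Sv39).

```python
PAGE_OFFSET_BITS = 12   # 4 KiB pages
VPN_BITS_PER_LEVEL = 9  # 512 entries per page table on RV64
SV39_PPN_BITS = 44      # physical page number width in an Sv39 PTE


def virtual_address_bits(levels):
    """Virtual address width for a given number of page-table levels."""
    return PAGE_OFFSET_BITS + VPN_BITS_PER_LEVEL * levels


def fits_in_sv39_physical(paddr):
    """True if paddr fits in the 56-bit Sv39 physical address space."""
    return paddr < (1 << (SV39_PPN_BITS + PAGE_OFFSET_BITS))


last_ram_byte = 0x18_0040_0000 - 1  # end of the 4 MiB region at 0x18_0000_0000

print(virtual_address_bits(3))               # 39
print(virtual_address_bits(4))               # 48
print(last_ram_byte.bit_length())            # 37
print(fits_in_sv39_physical(last_ram_byte))  # True
```

So the failing RAM region needs only 37 physical address bits, well within what Sv39 page-table entries can describe.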

vqs0214 · Answer 10 · Thu Nov 09 2023 21:01:21 GMT+0800 (China Standard Time)

> Could you double check that the current value is 3? Unlikely it isn't and 39 bits should be enough, but I didn't find anything else suspicious in the code.

Yes, the current value is 3.

Indan Zupancic · Answer 11 · Thu Nov 09 2023 21:36:16 GMT+0800 (China Standard Time)

There is a bug in the handleDoubleFault function: it omits a space in the output, so the status is actually 0x1, which means RISCVInstructionAccessFault. See src/arch/riscv/kernel/vspace.c:432. That confirms what I suspected, but doesn't give new information.
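
Because of the missing space, scanning the log line for a hex number after "status" would read `0x1a`, since `a` is a valid hex digit. A minimal sketch of a tolerant parser follows; `parse_vm_fault` and the regular expression are illustrative only and are not part of seL4.

```python
import re

# The status group is lazy so that the "a" of "at address" is not
# swallowed as a hex digit when the separating space is missing.
FAULT_RE = re.compile(
    r"at address (0x[0-9a-fA-F]+) with status (0x[0-9a-fA-F]+?)\s*"
    r"at address (0x[0-9a-fA-F]+)"
)


def parse_vm_fault(line):
    """Return (fault_address, status) from a 'vm fault' log line."""
    match = FAULT_RE.search(line)
    if match is None:
        raise ValueError("not a vm fault line")
    fault_addr, status, _ = (int(group, 16) for group in match.groups())
    return fault_addr, status


line = "vm fault on code at address 0x1032e with status 0x1at address 0x1032e"
print([hex(v) for v in parse_vm_fault(line)])  # ['0x1032e', '0x1']
```

The same pattern also accepts the line with the space present.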

Can you test with an up-to-date kernel and see if you still get the fault? The "Initialing PLIC" typo was fixed in 2021.

Gerwin Klein · Answer 12 · Fri Nov 10 2023 05:27:19 GMT+0800 (China Standard Time)

Not that this will help, but I wanted to point out that there is no expectation that any arbitrary start addresses will work. The verification checks and validates the one that is supplied for the specific verified platform, but there are a bunch of conditions that need to be satisfied with respect to other memory regions (alignment, non-overlap, relationship between kernel regions and user regions) that are not necessarily checked in the code.

That said, off-hand I don't see anything wrong with the one supplied here.
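
A rough sketch of the kind of conditions mentioned (alignment, non-overlap, containment in RAM), applied to the physical ranges from the failing UART log. This is not seL4's actual set of requirements; `check_regions` is a hypothetical helper, and the range printed after "Loaded DTB" is assumed to be the DTB's destination.

```python
PAGE_SIZE = 0x1000


def check_regions(memory, regions):
    """Return a list of problems found; an empty list means none.

    memory and each region are (name, start, end) with an exclusive end.
    """
    problems = []
    _, mem_start, mem_end = memory
    for name, start, end in regions:
        if start % PAGE_SIZE or end % PAGE_SIZE:
            problems.append(f"{name}: not page aligned")
        if start < mem_start or end > mem_end:
            problems.append(f"{name}: outside RAM")
    # Compare each region with its successor in address order.
    ordered = sorted(regions, key=lambda r: r[1])
    for (a, _, a_end), (b, b_start, _) in zip(ordered, ordered[1:]):
        if b_start < a_end:
            problems.append(f"{a} overlaps {b}")
    return problems


ram = ("ram", 0x18_0000_0000, 0x18_0040_0000)
loaded = [
    ("kernel", 0x18_0020_0000, 0x18_0021_F000),
    ("dtb", 0x18_0021_F000, 0x18_0022_5000),
    ("hello_world", 0x18_0022_5000, 0x18_0035_F000),
]
print(check_regions(ram, loaded))  # []
```

Under these simple checks the layout in the log looks consistent, which agrees with the remark above.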

Ivan Velickovic · Answer 13 · Fri Nov 10 2023 06:01:33 GMT+0800 (China Standard Time)

Haven't thought about it too hard but given this 3392a3f patch is after the typo fix, maybe this is the issue. I didn't realise that an old kernel version was being used.

Kent McLeod · Answer 14 · Fri Nov 10 2023 06:01:40 GMT+0800 (China Standard Time)

> RISCVInstructionAccessFault

I believe this means that it's an issue with the physical memory protection tables not having access mappings for the high addresses. (See here for a description of the difference between Instruction access fault and Instruction page fault).

PMP access control settings are platform-specific and need to be correctly set up by the m-mode implementation. The seL4 kernel doesn't access them from s-mode.
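
The distinction can be summarised with the exception codes from the RISC-V privileged specification. The sketch below lists only the memory-related codes; `likely_source` is a hypothetical helper, and its wording is a rule of thumb rather than a diagnosis.

```python
# scause exception codes (interrupt bit clear), memory-related subset,
# as defined in the RISC-V privileged specification.
SCAUSE_CODES = {
    1: "instruction access fault",
    5: "load access fault",
    7: "store/AMO access fault",
    12: "instruction page fault",
    13: "load page fault",
    15: "store/AMO page fault",
}


def likely_source(code):
    """Give a rough hint about where to look for a given exception code."""
    name = SCAUSE_CODES.get(code)
    if name is None:
        return "not a memory fault covered here"
    if name.endswith("access fault"):
        return f"{name}: physical access check (PMP/PMA) or bad physical address"
    return f"{name}: page-table translation (missing or insufficient PTE)"


print(likely_source(0x1))
print(likely_source(12))
```

With status 0x1, the first line applies: the fault is raised by the physical access checks rather than by a missing page-table entry.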

Kent McLeod · Answer 15 · Fri Nov 10 2023 06:06:41 GMT+0800 (China Standard Time)

> Haven't thought about it too hard but given this 3392a3f patch is after the typo fix, maybe this is the issue. I didn't realise that an old kernel version was being used.

(Marked my comment as outdated, because 3392a3f seems to quite likely be the issue.)

Ivan Velickovic · Answer 16 · Fri Nov 10 2023 06:16:40 GMT+0800 (China Standard Time)

> Haven't thought about it too hard but given this 3392a3f patch is after the typo fix, maybe this is the issue. I didn't realise that an old kernel version was being used.

(actually I shouldn't say 'old' kernel since the latest seL4 release would not contain this patch).

Ivan Velickovic · Answer 17 · Fri Nov 10 2023 08:00:11 GMT+0800 (China Standard Time)

I applied the following patch to seL4:

diff --git a/src/plat/qemu-riscv-virt/overlay-qemu-riscv-virt.dts b/src/plat/qemu-riscv-virt/overlay-qemu-riscv-virt.dts
index 34a43eaf2..3e71d4a97 100644
--- a/src/plat/qemu-riscv-virt/overlay-qemu-riscv-virt.dts
+++ b/src/plat/qemu-riscv-virt/overlay-qemu-riscv-virt.dts
@@ -26,4 +26,11 @@
             reg = <0x00000000 0x2000000 0x00000000 0x000010000>;
         };
     };
+
+    /delete-node/ memory@80000000;
+
+    memory {
+        device_type = "memory";
+        reg = <0x1 0x00 0x00 0x10000000>;
+    };
 };

in order to get the QEMU RISC-V virt platform to have RAM start at 0x1_0000_0000. I could not reproduce the issue with the latest kernel, but removing 3392a3f and rebuilding causes a cap fault to occur. Could you please upgrade the kernel version (as @Indanz suggested) or just apply 3392a3f and report the result? Thanks!

Ivan Velickovic · Answer 18 · Fri Nov 10 2023 08:44:02 GMT+0800 (China Standard Time)

> From what I understand, seL4 does not yet support sv48 either.

I do remember some discussion on Mattermost about this, and that there were issues with running seL4 configured for Sv39 on an Sv48 core. I can't reproduce this unfortunately since I don't have access to any non-Sv39 hardware. If it's still a problem for you, feel free to post another GitHub issue and we can investigate there.

Indan Zupancic · Answer 19 · Fri Nov 10 2023 21:00:36 GMT+0800 (China Standard Time)

Closing this as it's most likely resolved already. Feel free to re-open if you can reproduce the problem with an up-to-date kernel.

vqs0214 · Answer 20 · Mon Nov 13 2023 17:52:32 GMT+0800 (China Standard Time)

I applied the patch 3392a3f but I still see the same issue. Let me try with an updated kernel version, which will take me a bit of time.

vqs0214 · Answer 21 · Fri Dec 08 2023 21:03:14 GMT+0800 (China Standard Time)

Hello. I upgraded the kernel version to the latest. I still see the same issue. Hence, I am re-opening this issue.

Actually, the upgraded kernel version also caused a failure on lower address regions. This use case worked on the older kernel version. Given below is the UART output for a physical address space starting at 0x8000_0000. Please note that the fault occurs while trying to branch to the hello world app at 0x1032e. I can confirm that the corresponding physical address is loaded with the correct instructions.

Entering Op paddr=[1800000000..1800059027]
Looking for DTB in CPIO archive...found at 180001d878.
Loaded DTB from 180001d878.
paddr=[8001e000..80023fff]
ELF-loading image 'kernel' to 80000000
paddr=[80000000..8001dfff]
vaddr=[ffffffff80000000..ffffffff8001dfff]
virt_entry=ffffffff80000000
ELF-loading image 'hello_world' to 80024000
paddr=[80024000..8015dfff]
vaddr=[10000..149fff]
virt_entry=1032e
Kernel location: ffffffff80000000
Branching to kernel
Init local IRQ
Bootstrapping kernel
Initializing PLIC...
available phys memory regions: 1
[80000000..80400000]
reserved virt address space regions: 3
[ffffffc080000000..ffffffc08001e000]
[ffffffc08001e000..ffffffc0800233d6]
[ffffffc080024000..ffffffc08015e000]
Booting all finished, dropped to user space
Caught cap fault in send phase at address 0
while trying to handle:
vm fault on code at address 0x1032e with status 0x1at address 0x1032e
With stack:
0x0: INVALID
0x8: INVALID
0x10: INVALID
0x18: INVALID
0x20: INVALID
0x28: INVALID
0x30: INVALID
0x38: INVALID
0x40: INVALID
0x48: INVALID
0x50: INVALID
0x58: INVALID
0x60: INVALID
0x68: INVALID
0x70: INVALID
0x78: INVALID
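
One thing that can be read off this log is how the kernel appears to view physical memory: each reserved virtual region matches a loaded physical range shifted by a constant. The sketch below derives that constant from the log alone; treating it as a fixed-offset window for all of RAM is an assumption, not something confirmed against the kernel sources here.

```python
# (physical start, virtual start) pairs read off the UART log above.
PAIRS = [
    (0x8000_0000, 0xFFFF_FFC0_8000_0000),  # kernel image
    (0x8001_E000, 0xFFFF_FFC0_8001_E000),  # range printed after "Loaded DTB"
    (0x8002_4000, 0xFFFF_FFC0_8002_4000),  # hello_world image
]

# All pairs should agree on a single physical-to-virtual offset.
offsets = {virt - phys for phys, virt in PAIRS}
assert len(offsets) == 1
offset = offsets.pop()
print(hex(offset))  # 0xffffffc000000000


def window_vaddr(paddr):
    """Map a physical address through the assumed fixed-offset window."""
    return paddr + offset


print(hex(window_vaddr(0x18_0000_0000)))  # 0xffffffd800000000
```

Under that assumption, RAM at 0x18_0000_0000 would still land below the kernel image's virtual address ffffffff80000000 printed by the elfloader.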

vqs0214 · Answer 22 · Fri Dec 08 2023 21:08:02 GMT+0800 (China Standard Time)

It looks like the cap fault is due to exception code 0x1, which is an instruction access fault. I will try to see if this is a RISC-V PTE issue. PMP is set by OpenSBI and I can confirm that the region that is loaded with the app code has execute permissions.

Indan Zupancic · Answer 23 · Fri Dec 08 2023 22:57:17 GMT+0800 (China Standard Time)

Could you do a git bisect between the commit that works and main to find which commit broke this for you? Try to keep everything else the same.

vqs0214 · Answer 24 · Wed Dec 13 2023 22:21:14 GMT+0800 (China Standard Time)

Hey. Please close this issue. The elfloader tool used a different approach to computing vptr and once I made the necessary modifications to the elfloader tool, I was able to get this to work even for higher addresses. Thanks for your help.

Axel Heider · Answer 25 · Wed Dec 13 2023 22:23:50 GMT+0800 (China Standard Time)

So, does this mean there is a patch for the ELF loader we should consider?

vqs0214 · Answer 26 · Wed Dec 13 2023 22:25:32 GMT+0800 (China Standard Time)

No, I just took the changes that were made to elfloader when the high address issue was resolved and just applied them to my local version. I think we are good here.