google / gvisor

Application Kernel for Containers

Home Page: https://gvisor.dev


arm64: kvm_test is blocked on TestSafecopySigbus.

avagin opened this issue · comments

Description

#6573 (comment)

Steps to reproduce

  • Remove the amd64 build tag from pkg/sentry/platform/kvm/kvm_safecopy_test.go
  • Run kvm_test on arm64
commented

Is it possible to run the test under arm64 if I don't have a machine with that architecture?

You can try to create a QEMU arm64 VM: https://wiki.ubuntu.com/ARM64/QEMU

but I am not sure that the KVM tests will work there.

cc: @zhlhahaha

Yes, as @avagin commented, you can create an arm64 VM on x86 via QEMU, but it does not support KVM.
For an arm64 machine, you can try running gVisor on a Raspberry Pi 4 if you have one.

# bazel-out/aarch64-fastbuild-ST-4c64f0b3d5c7/bin/pkg/sentry/platform/kvm/kvm_test_/kvm_test -test.v -test.run=TestSafecopySigbus
=== RUN   TestSafecopySigbus
I1015 17:53:31.660636    5123 physical_map.go:124] region: virtual [fef367c000,ffff7367c000)
I1015 17:53:31.660820    5123 physical_map.go:176] physicalRegion: virtual [1000,10000) => physical [1000,10000)
I1015 17:53:31.660831    5123 physical_map.go:176] physicalRegion: virtual [10000,295000) => physical [10000,295000)
I1015 17:53:31.660839    5123 physical_map.go:176] physicalRegion: virtual [295000,2a0000) => physical [295000,2a0000)
I1015 17:53:31.660846    5123 physical_map.go:176] physicalRegion: virtual [2a0000,578000) => physical [2a0000,578000)
I1015 17:53:31.660853    5123 physical_map.go:176] physicalRegion: virtual [578000,fef367c000) => physical [578000,fef367c000)
I1015 17:53:31.660861    5123 physical_map.go:176] physicalRegion: virtual [ffff7367c000,ffff75a2d000) => physical [fef367c000,fef5a2d000)
I1015 17:53:31.660868    5123 physical_map.go:176] physicalRegion: virtual [ffff75a2d000,ffff75aad000) => physical [fef5a2d000,fef5aad000)
I1015 17:53:31.660876    5123 physical_map.go:176] physicalRegion: virtual [ffff75aad000,ffff75aae000) => physical [fef5aad000,fef5aae000)
I1015 17:53:31.660883    5123 physical_map.go:176] physicalRegion: virtual [ffff75aae000,ffff95a3d000) => physical [fef5aae000,ff15a3d000)
I1015 17:53:31.660890    5123 physical_map.go:176] physicalRegion: virtual [ffff95a3d000,ffff95a3e000) => physical [ff15a3d000,ff15a3e000)
I1015 17:53:31.660897    5123 physical_map.go:176] physicalRegion: virtual [ffff95a3e000,ffff99a2f000) => physical [ff15a3e000,ff19a2f000)
I1015 17:53:31.660905    5123 physical_map.go:176] physicalRegion: virtual [ffff99a2f000,ffff99a30000) => physical [ff19a2f000,ff19a30000)
I1015 17:53:31.660912    5123 physical_map.go:176] physicalRegion: virtual [ffff99a30000,ffff9a22d000) => physical [ff19a30000,ff1a22d000)
I1015 17:53:31.660919    5123 physical_map.go:176] physicalRegion: virtual [ffff9a22d000,ffff9a22e000) => physical [ff1a22d000,ff1a22e000)
I1015 17:53:31.660926    5123 physical_map.go:176] physicalRegion: virtual [ffff9a22e000,ffff9a32d000) => physical [ff1a22e000,ff1a32d000)
I1015 17:53:31.660934    5123 physical_map.go:176] physicalRegion: virtual [ffff9a32d000,ffff9a38d000) => physical [ff1a32d000,ff1a38d000)
I1015 17:53:31.660941    5123 physical_map.go:176] physicalRegion: virtual [ffff9a38d000,ffff9a38f000) => physical [ff1a38d000,ff1a38f000)
I1015 17:53:31.660948    5123 physical_map.go:176] physicalRegion: virtual [ffff9a38f000,ffff9a390000) => physical [ff1a38f000,ff1a390000)
I1015 17:53:31.660955    5123 physical_map.go:176] physicalRegion: virtual [ffff9a390000,fffffffff000) => physical [ff1a390000,ff7ffff000)
root@gviosr-ci-arm64-01:~# cat /proc/5123/maps  | grep memfd:kvm_test
fe735de000-fef35de000 rw-s 00000000 00:01 5122                           /memfd:kvm_test_5123 (deleted)
root@gviosr-ci-arm64-01:~# strace -fp 5123 2>&1 | head -n 30
strace: Process 5123 attached with 6 threads
[pid  5128] futex(0x5d50d8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  5127] futex(0x40000f6950, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  5126] futex(0x4000180150, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  5125] futex(0x40000f6550, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  5124] restart_syscall(<... resuming interrupted io_setup ...> <unfinished ...>
[pid  5123] rt_sigtimedwait([CHLD], NULL, {tv_sec=0, tv_nsec=0}, 8) = -1 EAGAIN (Resource temporarily unavailable)
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
[pid  5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
[pid  5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
[pid  5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
[pid  5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
[pid  5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
[pid  5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
[pid  5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
[pid  5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
[pid  5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
[pid  5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
[pid  5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0
[pid  5123] ioctl(12, KVM_RUN, 0)       = -1 EFAULT (Bad address)
root@gviosr-ci-arm64-01:~# cat /sys/kernel/debug/tracing/trace_pipe  | head -n 50
        kvm_test-4936    [023] d... 34452.451795: kvm_timer_update_irq: VCPU: 32, IRQ 27, level 0
        kvm_test-4936    [023] .... 34452.451796: kvm_exit: TRAP: HSR_EC: 0x0024 (DABT_LOW), PC: 0x00000000001afca0
        kvm_test-4936    [023] .... 34452.451796: kvm_guest_fault: ipa 0xfee198c000, hsr 0x92000005, hxfar 0xfee198c000, pc 0x000000001afca0
        kvm_test-4936    [023] .... 34452.451797: kvm_get_timer_map: VCPU: 32, dv: 1, dp: 0, ep: -1
        kvm_test-4936    [023] d... 34452.451797: kvm_timer_save_state:    CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 1
        kvm_test-4936    [023] d... 34452.451797: kvm_timer_save_state:    CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 0
        kvm_test-4936    [023] .... 34452.451798: kvm_userspace_exit: reason error (14)
        kvm_test-4936    [023] .... 34452.451798: kvm_get_timer_map: VCPU: 32, dv: 1, dp: 0, ep: -1
        kvm_test-4936    [023] .... 34452.451798: kvm_timer_update_irq: VCPU: 32, IRQ 27, level 0
        kvm_test-4936    [023] .... 34452.451799: kvm_timer_update_irq: VCPU: 32, IRQ 30, level 0
        kvm_test-4936    [023] d... 34452.451799: kvm_timer_restore_state: CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 1
        kvm_test-4936    [023] d... 34452.451799: kvm_timer_restore_state: CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 0
        kvm_test-4936    [023] d... 34452.451799: kvm_arm_setup_debug: vcpu: 00000000f33fa138, flags: 0x00000000
        kvm_test-4936    [023] d... 34452.451800: kvm_arm_set_dreg32: MDCR_EL2: 0x00084e66
        kvm_test-4936    [023] d... 34452.451800: kvm_arm_set_dreg32: MDSCR_EL1: 0x00001000
        kvm_test-4936    [023] d... 34452.451800: kvm_entry: PC: 0x00000000001afca0
        kvm_test-4936    [023] d... 34452.451800: kvm_arm_clear_debug: flags: 0x00000000
        kvm_test-4936    [023] d... 34452.451800: kvm_timer_update_irq: VCPU: 32, IRQ 27, level 0
        kvm_test-4936    [023] .... 34452.451801: kvm_exit: TRAP: HSR_EC: 0x0024 (DABT_LOW), PC: 0x00000000001afca0
        kvm_test-4936    [023] .... 34452.451801: kvm_guest_fault: ipa 0xfee198c000, hsr 0x92000005, hxfar 0xfee198c000, pc 0x000000001afca0
        kvm_test-4936    [023] .... 34452.451802: kvm_get_timer_map: VCPU: 32, dv: 1, dp: 0, ep: -1
        kvm_test-4936    [023] d... 34452.451802: kvm_timer_save_state:    CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 1
        kvm_test-4936    [023] d... 34452.451802: kvm_timer_save_state:    CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 0
        kvm_test-4936    [023] .... 34452.451803: kvm_userspace_exit: reason error (14)
        kvm_test-4936    [023] .... 34452.451803: kvm_get_timer_map: VCPU: 32, dv: 1, dp: 0, ep: -1
        kvm_test-4936    [023] .... 34452.451803: kvm_timer_update_irq: VCPU: 32, IRQ 27, level 0
        kvm_test-4936    [023] .... 34452.451804: kvm_timer_update_irq: VCPU: 32, IRQ 30, level 0
        kvm_test-4936    [023] d... 34452.451804: kvm_timer_restore_state: CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 1
        kvm_test-4936    [023] d... 34452.451804: kvm_timer_restore_state: CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 0
        kvm_test-4936    [023] d... 34452.451804: kvm_arm_setup_debug: vcpu: 00000000f33fa138, flags: 0x00000000
        kvm_test-4936    [023] d... 34452.451804: kvm_arm_set_dreg32: MDCR_EL2: 0x00084e66
        kvm_test-4936    [023] d... 34452.451805: kvm_arm_set_dreg32: MDSCR_EL1: 0x00001000
        kvm_test-4936    [023] d... 34452.451805: kvm_entry: PC: 0x00000000001afca0
        kvm_test-4936    [023] d... 34452.451805: kvm_arm_clear_debug: flags: 0x00000000
        kvm_test-4936    [023] d... 34452.451805: kvm_timer_update_irq: VCPU: 32, IRQ 27, level 0
        kvm_test-4936    [023] .... 34452.451805: kvm_exit: TRAP: HSR_EC: 0x0024 (DABT_LOW), PC: 0x00000000001afca0
        kvm_test-4936    [023] .... 34452.451806: kvm_guest_fault: ipa 0xfee198c000, hsr 0x92000005, hxfar 0xfee198c000, pc 0x000000001afca0
        kvm_test-4936    [023] .... 34452.451807: kvm_get_timer_map: VCPU: 32, dv: 1, dp: 0, ep: -1
        kvm_test-4936    [023] d... 34452.451807: kvm_timer_save_state:    CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 1
        kvm_test-4936    [023] d... 34452.451807: kvm_timer_save_state:    CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 0
        kvm_test-4936    [023] .... 34452.451807: kvm_userspace_exit: reason error (14)
        kvm_test-4936    [023] .... 34452.451808: kvm_get_timer_map: VCPU: 32, dv: 1, dp: 0, ep: -1
        kvm_test-4936    [023] .... 34452.451808: kvm_timer_update_irq: VCPU: 32, IRQ 27, level 0
        kvm_test-4936    [023] .... 34452.451808: kvm_timer_update_irq: VCPU: 32, IRQ 30, level 0
        kvm_test-4936    [023] d... 34452.451809: kvm_timer_restore_state: CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 1
        kvm_test-4936    [023] d... 34452.451809: kvm_timer_restore_state: CTL: 0x000000 CVAL:              0x0 arch_timer_ctx_index: 0
        kvm_test-4936    [023] d... 34452.451809: kvm_arm_setup_debug: vcpu: 00000000f33fa138, flags: 0x00000000
        kvm_test-4936    [023] d... 34452.451809: kvm_arm_set_dreg32: MDCR_EL2: 0x00084e66
        kvm_test-4936    [023] d... 34452.451809: kvm_arm_set_dreg32: MDSCR_EL1: 0x00001000
        kvm_test-4936    [023] d... 34452.451810: kvm_entry: PC: 0x00000000001afca0

[pid 5123] ioctl(12, _IOC(_IOC_WRITE, 0xae, 0xa0, 0x40), 0x4000008ca0) = 0

    // Host must support ARM64_HAS_RAS_EXTN.
    if _, _, errno := unix.RawSyscall( // escapes: no.
            unix.SYS_IOCTL,
            uintptr(c.fd),
            _KVM_SET_VCPU_EVENTS,
            uintptr(unsafe.Pointer(vcpuSErrNMI))); errno != 0 {
            if errno == unix.EINVAL {
                    throw("No ARM64_HAS_RAS_EXTN feature in host.")
            }
            throw("nmi sErr injection failed")
    }

@zhlhahaha could you look at this issue? I think it is quite critical. We can see that the NMI is queued in a loop. Is the NMI exception handler executed in this case? If the answer is yes, why doesn't it trigger an exit to the host?

I think I found the root cause of this issue. We queue an "NMI" interrupt, but it is blocked in the guest. With this patch avagin@ed5c754, the test passes...

CC: @lubinszARM
I am also looking into it.

@lubinszARM @zhlhahaha have you had a chance to look at this issue?

@lubinszARM @zhlhahaha have you had a chance to look at this issue?

Hi Avagin,
I discussed this issue with Lubin several weeks ago, but we have not found a good solution for it. Sorry about the delayed reply. I will keep looking into this and put it on my task list.

A friendly reminder that this issue has had no activity for 120 days.