Possible NULL pointer dereference in VC4 driver
PotatoMania opened this issue
Describe the bug
It seems commit 63c0bcc introduced a bug: when a DSI panel is re-enabled, the driver may crash with a NULL pointer dereference.
linux/drivers/gpu/drm/vc4/vc4_crtc.c
Lines 826 to 841 in dda85fd
Here, vc4_encoder may be NULL in some cases.
Steps to reproduce the behaviour
While running sway, run the following commands to turn the display off and on; it may take a few attempts:
swaymsg "output * power off"
swaymsg "output * power on"
Device(s)
Raspberry Pi CM3 Lite, Raspberry Pi CM4
System
- Not relevant.
- Not relevant.
- Latest rpi-6.6.y branch (dda85fda5b2dda7c4e2ba18770bd2033313006d2)
Logs
When the error occurs:
[ 77.177339] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000078
[ 77.177373] Mem abort info:
[ 77.177382] ESR = 0x0000000096000005
[ 77.177393] EC = 0x25: DABT (current EL), IL = 32 bits
[ 77.177406] SET = 0, FnV = 0
[ 77.177416] EA = 0, S1PTW = 0
[ 77.177425] FSC = 0x05: level 1 translation fault
[ 77.177436] Data abort info:
[ 77.177443] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 77.177453] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 77.177463] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 77.177475] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000005a7d000
[ 77.177488] [0000000000000078] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 77.177520] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
[ 77.177535] Modules linked in: rfcomm cmac algif_hash aes_arm64 aes_generic algif_skcipher af_alg snd_seq_dummy snd_hrtimer snd_seq snd_seq_device joydev cdc_acm bnep brcmfmac_wcc brcmfmac brcmutil cfg80211 axp20x_battery axp20x_ac_power axp20x_adc industrialio axp20x_regulator axp20x_pek axp20x_i2c rtc_ds1307 axp20x regmap_i2c hci_uart btbcm bluetooth bcm2835_codec(C) bcm2835_isp(C) rpivid_hevc(C) bcm2835_v4l2(C) raspberrypi_hwmon bcm2835_mmal_vchiq(C) v4l2_mem2mem i2c_mux_pinctrl videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops i2c_brcmstb i2c_mux dwc2 videobuf2_v4l2 ecdh_generic videodev ecc rfkill libaes i2c_bcm2835 videobuf2_common raspberrypi_gpiomem vc_sm_cma(C) mc snd_bcm2835(C) simple_amplifier_switch(C) nvmem_rmem ocp8178_bl uio_pdrv_genirq uio sch_fq_codel crypto_user fuse dm_mod nfnetlink ip_tables x_tables ipv6 panel_clockwork_cwu50 vc4 snd_soc_hdmi_codec v3d snd_soc_core drm_shmem_helper snd_compress snd_pcm_dmaengine gpu_sched snd_pcm snd_timer snd drm_display_helper drm_dma_helper drm_kms_helper
[ 77.177994] cec drm drm_panel_orientation_quirks backlight
[ 77.178031] CPU: 1 PID: 382 Comm: sway Tainted: G C 6.6.30-1-uconsole-cm3-rpi64-gdda85fda5b2d #1
[ 77.178051] Hardware name: Raspberry Pi Compute Module 4S Rev 1.0 (DT)
[ 77.178061] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 77.178077] pc : vc4_disable_vblank+0x54/0xb0 [vc4]
[ 77.178195] lr : vc4_disable_vblank+0x50/0xb0 [vc4]
[ 77.178296] sp : ffffffc0812fb830
[ 77.178306] x29: ffffffc0812fb830 x28: 0000000000000003 x27: ffffff800415bc00
[ 77.178334] x26: ffffffd809354bc0 x25: ffffffd809161970 x24: ffffff8006848c00
[ 77.178360] x23: ffffff8004e27148 x22: 00000000000000c0 x21: ffffff8004e27000
[ 77.178386] x20: 0000000000000000 x19: ffffff8002a7a080 x18: ffffffc0812fb5e8
[ 77.178411] x17: 0000000000000000 x16: ffffffd8203281f8 x15: 0000000000000082
[ 77.178435] x14: 0000000000000000 x13: 000003dd561b52ee x12: 000f79c987158984
[ 77.178460] x11: 00000000fa83b2da x10: 0000000000001a90 x9 : ffffffd809027fec
[ 77.178486] x8 : ffffffd80905c000 x7 : 0000000000000000 x6 : 0000000000000000
[ 77.178510] x5 : 00000000000031bd x4 : 0000000000000000 x3 : 0000000000000001
[ 77.178534] x2 : ffffff8002b60000 x1 : 0000000000000000 x0 : 0000000000000001
[ 77.178558] Call trace:
[ 77.178567] vc4_disable_vblank+0x54/0xb0 [vc4]
[ 77.178668] drm_vblank_disable_and_save+0xc4/0x118 [drm]
[ 77.178987] drm_crtc_vblank_off+0xc8/0x288 [drm]
[ 77.179267] vc4_crtc_atomic_disable+0x8c/0xd8 [vc4]
[ 77.179367] disable_outputs+0x21c/0x340 [drm_kms_helper]
[ 77.179508] drm_atomic_helper_commit_modeset_disables+0x24/0x60 [drm_kms_helper]
[ 77.179634] vc4_atomic_commit_tail+0xe4/0x870 [vc4]
[ 77.179736] commit_tail+0xac/0x1a0 [drm_kms_helper]
[ 77.179862] drm_atomic_helper_commit+0x184/0x1a8 [drm_kms_helper]
[ 77.179968] drm_atomic_commit+0xb0/0xf0 [drm]
[ 77.180089] drm_mode_atomic_ioctl+0x988/0xc18 [drm]
[ 77.180200] drm_ioctl_kernel+0xd4/0x180 [drm]
[ 77.180312] drm_ioctl+0x220/0x4a8 [drm]
[ 77.180423] __arm64_sys_ioctl+0xb4/0x100
[ 77.180437] invoke_syscall+0x50/0x128
[ 77.180447] el0_svc_common.constprop.0+0x48/0xf0
[ 77.180454] do_el0_svc+0x24/0x38
[ 77.180462] el0_svc+0x48/0xf8
[ 77.180470] el0t_64_sync_handler+0x120/0x130
[ 77.180478] el0t_64_sync+0x190/0x198
[ 77.180487] Code: aa1503e0 b90037ff 97f462bd 36000140 (b9407a80)
[ 77.180493] ---[ end trace 0000000000000000 ]---
[ 77.180499] note: sway[382] exited with irqs disabled
[ 77.180533] note: sway[382] exited with preempt_count 3
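The word in parentheses on the Code: line of the trace above is the faulting instruction. As a minimal sketch (the struct and function names are hypothetical, not part of the kernel), the following C decodes an AArch64 32-bit load with an unsigned immediate offset. Applied to 0xb9407a80 it gives a load from x20 at offset 0x78; x20 is 0 in the register dump, which matches the reported fault address 0000000000000078.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Fields of an AArch64 "LDR Wt, [Xn, #imm]" (32-bit load, unsigned offset).
 * Hypothetical helper type for illustration only. */
struct ldr_w_uimm {
    unsigned rt;     /* destination register Wt */
    unsigned rn;     /* base register Xn */
    unsigned offset; /* byte offset: imm12 scaled by the 4-byte access size */
};

/* Returns 1 and fills *out if insn is "LDR Wt, [Xn, #imm]", else returns 0. */
static int decode_ldr_w_uimm(uint32_t insn, struct ldr_w_uimm *out)
{
    /* Bits 31:22 are fixed at 1011100101 for this encoding. */
    if ((insn & 0xFFC00000u) != 0xB9400000u)
        return 0;
    out->rt = insn & 0x1Fu;
    out->rn = (insn >> 5) & 0x1Fu;
    out->offset = ((insn >> 10) & 0xFFFu) * 4u;
    return 1;
}

/* Usage sketch: decode the instruction word taken from the oops. */
static void print_faulting_insn(void)
{
    struct ldr_w_uimm d;

    if (decode_ldr_w_uimm(0xb9407a80u, &d)) {
        assert(d.offset == 0x78);
        /* prints "ldr w0, [x20, #0x78]" */
        printf("ldr w%u, [x%u, #0x%x]\n", d.rt, d.rn, d.offset);
    }
}
```

A NULL base register plus a small constant offset is the usual signature of reading a struct member through a NULL pointer, which is consistent with vc4_encoder being NULL when its type field is read.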
Additional context
I changed if (vc4_encoder->type != VC4_ENCODER_TYPE_DSI0)
to if (vc4_encoder && vc4_encoder->type != VC4_ENCODER_TYPE_DSI0)
and the error no longer shows up. However, I don't think this is the desired behavior.
Thanks for the report, and you're totally correct that we need to handle the condition.
I had been stopping and restarting my display numerous times through kmstest, but that obviously didn't hit the same situation you have.
I think the correct logic is actually
if (!vc4_encoder || vc4_encoder->type != VC4_ENCODER_TYPE_DSI0)
	CRTC_WRITE(PV_INTEN, 0);
because if the CRTC isn't connected to an encoder, we still want to disable the interrupts. They will always be enabled again in vc4_enable_vblank.
I'll give it a test and create a PR.
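To make the difference between the two proposed conditions concrete, here is a minimal user-space sketch. The mock_ types and both function names are hypothetical stand-ins, not the driver's definitions; only the boolean conditions are taken from this thread. Each function returns whether PV_INTEN would be cleared for a given encoder pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the driver's encoder types. */
enum mock_encoder_type {
    MOCK_ENCODER_TYPE_HDMI0,
    MOCK_ENCODER_TYPE_DSI0,
};

struct mock_encoder {
    enum mock_encoder_type type;
};

/* Reporter's workaround: a NULL encoder skips the write entirely. */
static bool clears_inten_workaround(const struct mock_encoder *enc)
{
    return enc && enc->type != MOCK_ENCODER_TYPE_DSI0;
}

/* Maintainer's proposal: a NULL encoder still clears the interrupts. */
static bool clears_inten_proposed(const struct mock_encoder *enc)
{
    return !enc || enc->type != MOCK_ENCODER_TYPE_DSI0;
}

/* Usage sketch: the two conditions differ only for a NULL encoder. */
static void compare_guards(void)
{
    struct mock_encoder hdmi = { MOCK_ENCODER_TYPE_HDMI0 };
    struct mock_encoder dsi = { MOCK_ENCODER_TYPE_DSI0 };

    assert(clears_inten_workaround(&hdmi) == clears_inten_proposed(&hdmi));
    assert(clears_inten_workaround(&dsi) == clears_inten_proposed(&dsi));
    assert(clears_inten_workaround(NULL) != clears_inten_proposed(NULL));
}
```

Both versions avoid the dereference. They differ only when no encoder is attached: the workaround leaves the interrupt enable register untouched, while the proposed logic clears it.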
I tested the updated logic and the bug looks fixed to me :D
More context:
- the panel's controller is JD9365DA-H3 which uses 4 DSI lanes and the driver patch is here: https://github.com/PotatoMania/uconsole-cm3/blob/37362bbaa82c7306fe21149713e7229efc560269/PKGBUILDs/linux-uconsole-cm3-rpi64/0002-drm-panel-add-clockwork-cwu50.patch
- the panel requires prepare_prev_first to work
- I bisected the VC4 driver changes between stable_20240423 and latest 6.6.y, and can confirm that the bug was introduced by 63c0bcc.