free5gc / gtp5g

GTP-U Linux Kernel Module

[Bugs] Some GTP-U packets cause an infinite loop in gtp5g

ShouheiNishi opened this issue

Describe the bug

When gtp5g receives a GTP-U packet whose extension header length is 0, it gets stuck in an infinite loop.

To Reproduce

  1. Apply this patch to the free5gc code:
diff --git a/test/registration_test.go b/test/registration_test.go
index 19a75ef..26a57ca 100644
--- a/test/registration_test.go
+++ b/test/registration_test.go
@@ -230,7 +230,7 @@ func TestRegistration(t *testing.T) {

        // Send the dummy packet
        // ping IP(tunnel IP) from 10.60.0.2(127.0.0.1) to 10.60.0.20(127.0.0.8)
-       gtpHdr, err := hex.DecodeString("32ff00340000000100000000")
+       gtpHdr, err := hex.DecodeString("36ff003400000001000000ff00")
        assert.Nil(t, err)
        icmpData, err := hex.DecodeString("8c870d0000000000101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f3031323334353637")
        assert.Nil(t, err)
  2. Run ./test.sh TestRegistration

Expected behavior

gtp5g should not get stuck.

Environment (please complete the following information):

  • gtp5g Version: a9dc486
  • OS: Ubuntu 20.04
  • Kernel version: 5.4.0-147-generic

Trace File

[341332.419921] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [test.test:159349]
[341332.420101] Modules linked in: veth gtp5g(OE) udp_diag tcp_diag inet_diag udp_tunnel ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs sctp vmw_vsock_vmci_transport vsock nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua binfmt_misc intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit vmw_balloon input_leds joydev serio_raw rapl vmw_vmci mac_hid sch_fq_codel msr ramoops reed_solomon efi_pstore ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear vmwgfx ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel aesni_intel syscopyarea sysfillrect sysimgblt crypto_simd fb_sys_fops cryptd drm psmouse vmxnet3 glue_helper vmw_pvscsi i2c_piix4 pata_acpi [last unloaded: gtp5g]
[341332.420144] CPU: 2 PID: 159349 Comm: test.test Tainted: G           OE     5.4.0-147-generic #164-Ubuntu
[341332.420145] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.18227214.B64.2106252220 06/25/2021
[341332.420152] RIP: 0010:gtp5g_encap_recv+0x156/0xef0 [gtp5g]
[341332.420154] Code: 95 00 44 39 e8 0f 82 b6 01 00 00 44 89 ea 41 80 7c 16 ff 00 0f 84 11 01 00 00 41 8b 7c 24 70 41 8b 74 24 74 41 8d 4d 01 89 f8 <29> f0 39 c1 76 c8 48 89 55 88 39 f9 0f 87 f2 03 00 00 29 fe 4c 89
[341332.420155] RSP: 0018:ffffb65f80108be0 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
[341332.420156] RAX: 0000000000000061 RBX: ffff94aeb17cf8c0 RCX: 0000000000000015
[341332.420157] RDX: 0000000000000014 RSI: 0000000000000000 RDI: 0000000000000061
[341332.420157] RBP: ffffb65f80108c60 R08: 0000000000006808 R09: 0000000000000011
[341332.420158] R10: 0000000000000061 R11: 0000000000000000 R12: ffff94aeb1503700
[341332.420158] R13: 0000000000000014 R14: ffff94aeaa958c24 R15: 00000000000000ff
[341332.420159] FS:  00007f4959adb700(0000) GS:ffff94aeb7b00000(0000) knlGS:0000000000000000
[341332.420160] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[341332.420161] CR2: 0000000001ef6310 CR3: 000000022b490001 CR4: 00000000007606e0
[341332.420190] PKRU: 55555554
[341332.420191] Call Trace:
[341332.420193]  <IRQ>
[341332.420198]  ? fib_validate_source+0x47/0xf0
[341332.420201]  ? check_urr+0x4d0/0x4d0 [gtp5g]
[341332.420203]  udp_queue_rcv_one_skb+0x1fc/0x520
[341332.420205]  udp_queue_rcv_skb+0x3f/0x1a0
[341332.420206]  udp_unicast_rcv_skb.isra.0+0x76/0x90
[341332.420207]  __udp4_lib_rcv+0x582/0xbe0
[341332.420216]  ? __wake_up_common+0x7e/0x140
[341332.420217]  udp_rcv+0x1a/0x20
[341332.420221]  ip_protocol_deliver_rcu+0xe9/0x1b0
[341332.420222]  ip_local_deliver_finish+0x48/0x50
[341332.420223]  ip_local_deliver+0x73/0xf0
[341332.420224]  ? ip_rcv_finish_core.isra.0+0x69/0x3b0
[341332.420225]  ip_rcv_finish+0x85/0xa0
[341332.420226]  ip_rcv+0xbc/0xd0
[341332.420227]  ? trigger_load_balance+0xad/0x210
[341332.420231]  __netif_receive_skb_one_core+0x88/0xa0
[341332.420232]  __netif_receive_skb+0x18/0x60
[341332.420233]  process_backlog+0xa9/0x160
[341332.420234]  net_rx_action+0x142/0x390
[341332.420240]  __do_softirq+0xd1/0x2c1
[341332.420241]  do_softirq_own_stack+0x2a/0x40
[341332.420242]  </IRQ>
[341332.420245]  do_softirq.part.0+0x46/0x50
[341332.420246]  __local_bh_enable_ip+0x50/0x60
[341332.420248]  ip_finish_output2+0x192/0x580
[341332.420249]  __ip_finish_output+0xf3/0x270
[341332.420250]  ip_finish_output+0x2d/0xb0
[341332.420251]  ip_output+0x75/0xf0
[341332.420252]  ? __ip_make_skb+0x31e/0x430
[341332.420253]  ip_local_out+0x3d/0x50
[341332.420254]  ip_send_skb+0x19/0x40
[341332.420255]  udp_send_skb.isra.0+0x161/0x380
[341332.420256]  udp_sendmsg+0xb09/0xd50
[341332.420257]  ? ip_reply_glue_bits+0x50/0x50
[341332.420261]  ? x2apic_send_IPI+0x4d/0x60
[341332.420264]  ? native_smp_send_reschedule+0x2a/0x40
[341332.420266]  ? ttwu_do_wakeup+0x1e/0x150
[341332.420268]  ? _cond_resched+0x19/0x30
[341332.420271]  ? aa_sk_perm+0x43/0x1b0
[341332.420273]  inet_sendmsg+0x65/0x70
[341332.420275]  ? security_socket_sendmsg+0x35/0x50
[341332.420276]  ? inet_sendmsg+0x65/0x70
[341332.420279]  sock_sendmsg+0x5e/0x70
[341332.420280]  sock_write_iter+0x93/0xf0
[341332.420286]  new_sync_write+0x125/0x1c0
[341332.420288]  __vfs_write+0x29/0x40
[341332.420289]  vfs_write+0xb9/0x1a0
[341332.420290]  ksys_write+0x67/0xe0
[341332.420291]  __x64_sys_write+0x1a/0x20
[341332.420293]  do_syscall_64+0x57/0x190
[341332.420294]  entry_SYSCALL_64_after_hwframe+0x5c/0xc1
[341332.420295] RIP: 0033:0x40484e
[341332.420297] Code: 48 89 6c 24 38 48 8d 6c 24 38 e8 0d 00 00 00 48 8b 6c 24 38 48 83 c4 40 c3 cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
[341332.420297] RSP: 002b:000000c000adf660 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[341332.420304] RAX: ffffffffffffffda RBX: 000000000000001f RCX: 000000000040484e
[341332.420304] RDX: 0000000000000059 RSI: 000000c0000ac800 RDI: 000000000000001f
[341332.420305] RBP: 000000c000adf6a0 R08: 0000000000000000 R09: 0000000000000000
[341332.420306] R10: 0000000000000000 R11: 0000000000000202 R12: 000000c000adf7e0
[341332.420306] R13: 0000000000000000 R14: 000000c0006036c0 R15: 000000c00005e800

@ShouheiNishi

What is the test result of #73 when using gtpHdr, err := hex.DecodeString("36ff003400000001000000ff") as the test data?

Test case 1: hex.DecodeString("36ff003400000001000000ff00")
#73 Result:

[ 5437.244362] upfgtp:[gtp5g] gtp5g_newlink: Registered a new 5G GTP interface
[ 5443.442281] upfgtp:[gtp5g] gtp1u_udp_encap_recv: Invalid extention header length
[ 5443.442283] upfgtp:[gtp5g] gtp5g_encap_recv: GTP packet has been dropped
[ 5454.547652] upfgtp:[gtp5g] gtp5g_dellink: De-registered 5G GTP interface

Test case 2: hex.DecodeString("36ff003400000001000000ff")
#73 Result:

[10773.933075] upfgtp:[gtp5g] gtp1u_udp_encap_recv: Failed to pull skb length 0x128
[10773.933079] upfgtp:[gtp5g] gtp5g_encap_recv: GTP packet has been dropped
[10785.058537] upfgtp:[gtp5g] gtp5g_dellink: De-registered 5G GTP interface

Thanks for the PR.

I apologize for the late reply; I was on vacation.

Test case 1: hex.DecodeString("36ff003400000001000000ff00") #73 Result:

[ 5437.244362] upfgtp:[gtp5g] gtp5g_newlink: Registered a new 5G GTP interface
[ 5443.442281] upfgtp:[gtp5g] gtp1u_udp_encap_recv: Invalid extention header length
[ 5443.442283] upfgtp:[gtp5g] gtp5g_encap_recv: GTP packet has been dropped
[ 5454.547652] upfgtp:[gtp5g] gtp5g_dellink: De-registered 5G GTP interface

This result is the expected behavior.

Test case 2: hex.DecodeString("36ff003400000001000000ff") #73 Result:

[10773.933075] upfgtp:[gtp5g] gtp1u_udp_encap_recv: Failed to pull skb length 0x128
[10773.933079] upfgtp:[gtp5g] gtp5g_encap_recv: GTP packet has been dropped
[10785.058537] upfgtp:[gtp5g] gtp5g_dellink: De-registered 5G GTP interface

In this case, the first byte of the inner IP header that follows the GTP-U header, whose value is 0x54, is treated as the extension header length.
Because this value is too large, pulling the skb fails.
I think this result is also the expected behavior.