awesometic / realtek-r8125-dkms

A DKMS package for easy use of Realtek r8125 driver, which supports 2.5 GbE.

Throughput monitoring issue

mschirrmeister opened this issue

Hello,

I am running the latest version, 9.011.00-NAPI, with kernel 6.1, and it reports wrong throughput values. The NIC is connected to a 1 Gbit/s switch.
With the kernel's default r8169 driver, throughput monitoring tools typically show around 115 MB/s. With the r8125 driver, they show several hundred gigabytes per second, fluctuating between 300 and 700 GB/s.

Driver

root@nightowl ~# ethtool -i ens4
driver: r8125
version: 9.011.00-NAPI
firmware-version:
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

pci device

root@nightowl ~# lspci -s 01:00.0 -k
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
	Subsystem: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
	Kernel driver in use: r8125
	Kernel modules: r8169, r8125

Example of a wrong value:

  bwm-ng v0.6.3 (probing every 0.500s), press 'h' for help
  input: /proc/net/dev type: rate
  -         iface                   Rx                   Tx                Total
  ==============================================================================
               lo:           0.00  B/s            0.00  B/s            0.00  B/s
             ens4:         635.39 GB/s          546.15 KB/s          635.39 GB/s
             ens5:           0.00  B/s            0.00  B/s            0.00  B/s
  ------------------------------------------------------------------------------
            total:         635.39 GB/s          546.15 KB/s          635.39 GB/s

Any idea if I am doing something wrong, or is this a known issue?
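For context, rate monitors of this kind generally derive throughput from the difference between two readings of the byte counters in /proc/net/dev, so the displayed rate is only as trustworthy as the counters the driver reports. The following is a minimal Python sketch of that general technique, not bwm-ng's actual code; the function names `parse_proc_net_dev`, `rates`, and `sample_live` are made up for illustration.

```python
import time


def parse_proc_net_dev(text):
    """Return {iface: (rx_bytes, tx_bytes)} from the contents of /proc/net/dev."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rates(prev, cur, interval_s):
    """Return {iface: (rx_bytes_per_s, tx_bytes_per_s)} between two snapshots."""
    result = {}
    for iface, (rx, tx) in cur.items():
        if iface not in prev:
            continue  # interface appeared between the two samples
        prev_rx, prev_tx = prev[iface]
        result[iface] = ((rx - prev_rx) / interval_s, (tx - prev_tx) / interval_s)
    return result


def sample_live(interval_s=0.5, path="/proc/net/dev"):
    """Take two live snapshots interval_s apart and return the rates (Linux only)."""
    with open(path) as f:
        first = parse_proc_net_dev(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        second = parse_proc_net_dev(f.read())
    return rates(first, second, interval_s)
```

Calling `sample_live()` on the affected machine and printing the entry for `ens4` would show whether the raw counter deltas themselves are inflated, independent of any particular monitoring tool.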

Hello,

Can you try this Debian package?
I reverted some changes that were introduced in version 9.011.00.

Please remove the .zip extension from the attached file; GitHub restricts which file types can be uploaded.

realtek-r8125-dkms_9.011.00-2_amd64.deb.zip

The package does not install. The error is below.

DKMS make.log for realtek-r8125-9.011.00 for kernel 6.1.0-7-amd64 (amd64)
Wed Apr 12 08:50:18 AM CEST 2023
/bin/sh: 1: VER: not found
make -C src/ KVER=6.1.0-7-amd64 BASEDIR=/lib/modules/6.1.0-7-amd64 modules
make[1]: Entering directory '/var/lib/dkms/realtek-r8125/9.011.00/build/src'
make -C /lib/modules/6.1.0-7-amd64/build M=/var/lib/dkms/realtek-r8125/9.011.00/build/src modules
make[2]: Entering directory '/usr/src/linux-headers-6.1.0-7-amd64'
  CC [M]  /var/lib/dkms/realtek-r8125/9.011.00/build/src/r8125_n.o
  CC [M]  /var/lib/dkms/realtek-r8125/9.011.00/build/src/rtl_eeprom.o
  CC [M]  /var/lib/dkms/realtek-r8125/9.011.00/build/src/rtltool.o
/var/lib/dkms/realtek-r8125/9.011.00/build/src/r8125_n.c:13512:31: error: ‘rtl8125_get_stats’ undeclared here (not in a function); did you mean ‘rtl8125_get_stats64’?
13512 |         .ndo_get_stats      = rtl8125_get_stats,
      |                               ^~~~~~~~~~~~~~~~~
      |                               rtl8125_get_stats64
/var/lib/dkms/realtek-r8125/9.011.00/build/src/r8125_n.c:13468:1: warning: ‘rtl8125_get_stats64’ defined but not used [-Wunused-function]
13468 | rtl8125_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
      | ^~~~~~~~~~~~~~~~~~~
make[3]: *** [/usr/src/linux-headers-6.1.0-7-common/scripts/Makefile.build:255: /var/lib/dkms/realtek-r8125/9.011.00/build/src/r8125_n.o] Error 1
make[2]: *** [/usr/src/linux-headers-6.1.0-7-common/Makefile:2037: /var/lib/dkms/realtek-r8125/9.011.00/build/src] Error 2
make[2]: Leaving directory '/usr/src/linux-headers-6.1.0-7-amd64'
make[1]: *** [Makefile:188: modules] Error 2
make[1]: Leaving directory '/var/lib/dkms/realtek-r8125/9.011.00/build/src'
make: *** [Makefile:42: modules] Error 2

I will be unavailable for the next 3 weeks. I can most likely run the next test at the beginning of May at the earliest.

Looks like the compiler options caused that.

Here is the new file: realtek-r8125-dkms_9.011.00-2_amd64.deb.zip

I checked it compiles normally. Sorry for the inconvenience 😅

Thanks. That one installs fine, but the problem is still there: it still shows GB/s. The numbers might be a little better, but they still go too high and fluctuate more than with the r8169 driver.

Then we should check whether it happens on other kernel versions too.

I neutralized some conditions in the network statistics code that check for kernel version 5.11.0 or above; these were not present in the previous version.
There may be another spot I should look at, but in any case Realtek should be made aware of this error and release a new version if it also occurs on other systems.

It looks like it is to some extent kernel dependent. I tested the following two kernels; both have the problem as well.

  • 5.19.0-0.deb11.2-amd64
  • 5.15.94-x86

On 5.15.94-x86 it looks worse: the numbers again go up to 400 GB/s.
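To compare kernels more systematically, one option is to record raw receive-counter samples and flag every interval whose implied rate exceeds what the link can physically carry. A 1 Gbit/s link tops out at 1,000,000,000 / 8 = 125,000,000 bytes per second, so anything in the GB/s range must come from the counters rather than from real traffic. The sketch below is hypothetical helper code in Python (`line_rate_bytes_per_s` and `implausible_samples` are illustrative names), and the sample values are invented to mirror the numbers reported in this thread.

```python
def line_rate_bytes_per_s(link_mbit):
    """Raw capacity of a link in bytes per second (decimal megabits)."""
    return link_mbit * 1_000_000 / 8


def implausible_samples(rx_byte_samples, interval_s, link_mbit):
    """Return (index, rate) for each interval whose rate is negative or above line rate."""
    limit = line_rate_bytes_per_s(link_mbit)
    flagged = []
    for i in range(1, len(rx_byte_samples)):
        rate = (rx_byte_samples[i] - rx_byte_samples[i - 1]) / interval_s
        if rate < 0 or rate > limit:
            flagged.append((i, rate))
    return flagged


# Invented counter readings taken every 0.5 s on a 1 Gbit/s link:
# two normal intervals at 115 MB/s, then one jump.
samples = [0, 57_500_000, 115_000_000, 317_810_000_000]
print(implausible_samples(samples, 0.5, 1000))
```

A negative rate would indicate a counter that went backwards, while a rate above the limit indicates a jump; real measurements may need a small tolerance for sampling jitter, which this sketch leaves out.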

I thought about reporting it to Realtek too, but have not found a good way to do so yet. The only thing I found is a support email address for network cards. I might send a mail there, and let's hope they can fix it.

Thank you for the test and for reporting it to Realtek. Let's wait for the new version.

I have the same problem, and after running for a while, dmesg shows these errors:

[  993.729605] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G    B      OE      6.2.10-arch1-1 #1 3b64a9154b84a23b8badf9e10678249884a952c6
[  993.729609] Hardware name: Default string Default string/Default string, BIOS 1.010 09/27/2021
[  993.729611] ==================================================================
[ 1002.885181] ==================================================================
[ 1002.885190] BUG: KFENCE: use-after-free read in rtl8125_rx_interrupt+0x347/0x5c0 [r8125]

[ 1002.885207] Use-after-free read at 0x0000000024c7079d (in kfence-#222):
[ 1002.885210]  rtl8125_rx_interrupt+0x347/0x5c0 [r8125]
[ 1002.885221]  rtl8125_poll_msix_rx+0x45/0x90 [r8125]
[ 1002.885231]  __napi_poll+0x28/0x1b0
[ 1002.885238]  net_rx_action+0x2a2/0x360
[ 1002.885241]  __do_softirq+0xd1/0x2c8
[ 1002.885245]  __irq_exit_rcu+0xb7/0xe0
[ 1002.885250]  common_interrupt+0x86/0xa0
[ 1002.885252]  asm_common_interrupt+0x26/0x40
[ 1002.885257]  cpuidle_enter_state+0xe2/0x420
[ 1002.885261]  cpuidle_enter+0x2d/0x40
[ 1002.885263]  do_idle+0x1ed/0x270
[ 1002.885266]  cpu_startup_entry+0x1d/0x20
[ 1002.885269]  rest_init+0xc8/0xd0
[ 1002.885272]  arch_call_rest_init+0xe/0x30
[ 1002.885277]  start_kernel+0x734/0xb30
[ 1002.885280]  secondary_startup_64_no_verify+0xe5/0xeb

[ 1002.885286] kfence-#222: 0x000000001096ce9d-0x00000000efff5d14, size=232, cache=skbuff_head_cache

[ 1002.885289] allocated by task 412 on cpu 2 at 1002.503026s:
[ 1002.885295]  __alloc_skb+0x167/0x1d0
[ 1002.885299]  alloc_skb_with_frags+0x50/0x200
[ 1002.885301]  sock_alloc_send_pskb+0x203/0x250
[ 1002.885304]  __ip_append_data+0x998/0x1070
[ 1002.885308]  ip_make_skb+0x105/0x140
[ 1002.885310]  udp_sendmsg+0xacf/0xe90
[ 1002.885314]  udpv6_sendmsg+0x469/0x1050
[ 1002.885317]  sock_sendmsg+0x46/0x70
[ 1002.885319]  ____sys_sendmsg+0x17f/0x2f0
[ 1002.885321]  ___sys_sendmsg+0x9a/0xe0
[ 1002.885323]  __sys_sendmmsg+0xe3/0x210
[ 1002.885326]  __x64_sys_sendmmsg+0x21/0x30
[ 1002.885329]  do_syscall_64+0x5c/0x90
[ 1002.885332]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

[ 1002.885336] freed by task 0 on cpu 0 at 1002.885162s:
[ 1002.885372]  tcp_data_queue+0x5a6/0xec0
[ 1002.885375]  tcp_rcv_established+0x210/0x730
[ 1002.885378]  tcp_v6_do_rcv+0xde/0x4c0
[ 1002.885380]  tcp_v6_rcv+0xc88/0xd00
[ 1002.885383]  ip6_protocol_deliver_rcu+0x6c/0x480
[ 1002.885385]  ip6_input_finish+0x43/0x60
[ 1002.885386]  ip6_sublist_rcv_finish+0x59/0x90
[ 1002.885388]  ip6_sublist_rcv+0x22f/0x2f0
[ 1002.885390]  ipv6_list_rcv+0x13f/0x170
[ 1002.885392]  __netif_receive_skb_list_core+0x1f6/0x2c0
[ 1002.885395]  netif_receive_skb_list_internal+0x1d1/0x310
[ 1002.885398]  napi_gro_receive+0xd0/0x210
[ 1002.885400]  rtl8125_rx_interrupt+0x33d/0x5c0 [r8125]
[ 1002.885410]  rtl8125_poll_msix_rx+0x45/0x90 [r8125]
[ 1002.885420]  __napi_poll+0x28/0x1b0
[ 1002.885423]  net_rx_action+0x2a2/0x360
[ 1002.885426]  __do_softirq+0xd1/0x2c8
[ 1002.885428]  __irq_exit_rcu+0xb7/0xe0
[ 1002.885430]  common_interrupt+0x86/0xa0
[ 1002.885432]  asm_common_interrupt+0x26/0x40
[ 1002.885435]  cpuidle_enter_state+0xe2/0x420
[ 1002.885438]  cpuidle_enter+0x2d/0x40
[ 1002.885440]  do_idle+0x1ed/0x270
[ 1002.885442]  cpu_startup_entry+0x1d/0x20
[ 1002.885444]  rest_init+0xc8/0xd0
[ 1002.885447]  arch_call_rest_init+0xe/0x30
[ 1002.885450]  start_kernel+0x734/0xb30
[ 1002.885452]  secondary_startup_64_no_verify+0xe5/0xeb

@mschirrmeister
This is the official test version provided by Realtek. Maybe you can try it; I'm not near the device and can't test it.
r8125-9.011.01_20230412_b1.zip

@awesometic After a period of testing, the problem did not reproduce.

@dream10201

Thank you for your effort.

Can we merge that beta version into our repository? It will be open-sourced anyway, but I don't know if we can use the unpublished version 🤔

@awesometic
Create a patch file and apply it before compiling. Maybe that would be better?

What @dream10201 posted here is also what Linda sent me in response to my question to Realtek. She mentioned that I can share the version here, since I am on vacation until early May right now. But @dream10201 shared the driver already. :-)

Linda also mentioned to me that they will apply the change in their next driver releases as well.

Fixed in version 9.011.01 :)