raspberrypi / utils

A collection of scripts and simple applications

GPIOLIB much slower than WiringPi on Pi 5

rafael2k opened this issue

This is not a bug, but I'd like to understand why the very recent Pi 5 support in WiringPi gives so much faster GPIO access than GPIOLIB. For example, toggling the drive low and high reaches 20 MHz with WiringPi, but not even 0.5 MHz with GPIOLIB.

The sample code I'm running for the comparison is here:
https://github.com/rafael2k/Benchmark

I don't know for sure - I can tell you that gpiolib has not been optimised for speed - it wasn't really intended for bulk usage like that.

One explanation would be that the new WiringPi might be using the atomic SET/CLR alias feature to avoid the read/modify/write that rp1_gpio_set_drive does for every change.

Does this make a difference? It compiles, but I've not tested it.

diff --git a/pinctrl/gpiochip_rp1.c b/pinctrl/gpiochip_rp1.c
index 37ff92b..5594277 100644
--- a/pinctrl/gpiochip_rp1.c
+++ b/pinctrl/gpiochip_rp1.c
@@ -355,16 +355,13 @@ static int rp1_gpio_get_level(void *priv, unsigned gpio)
 static void rp1_gpio_set_drive(void *priv, unsigned gpio, GPIO_DRIVE_T drv)
 {
     volatile uint32_t *base = priv;
-    uint32_t reg;
     int bank, offset;
 
     rp1_gpio_get_bank(gpio, &bank, &offset);
-    reg = rp1_gpio_sys_rio_out_read(base, bank, offset);
     if (drv == DRIVE_HIGH)
-        reg |= (1U << offset);
+        rp1_gpio_sys_rio_out_write(base + RP1_SET_OFFSET/4, bank, offset, 1U << offset);
     else if (drv == DRIVE_LOW)
-        reg &= ~(1U << offset);
-    rp1_gpio_sys_rio_out_write(base, bank, offset, reg);
+        rp1_gpio_sys_rio_out_write(base + RP1_CLR_OFFSET/4, bank, offset, 1U << offset);
 }
 
 static void rp1_gpio_set_pull(void *priv, unsigned gpio, GPIO_PULL_T pull)

Wow, what a difference!! Thanks @pelwell!
Now I get:
240000000 toggle took 15.519 s, Time per toggle 0.065 us, Freq 15.465 MHz
basically, 30x faster.
: )

Cool - I may tidy it up a bit, but I'll make sure that logic becomes standard.

#74 is a slightly neater implementation. Would you mind testing it as well?

Sure. Will come back soon with results.

It still works, at approximately the same speed as the patch you posted.
240000000 toggle took 15.267 s, Time per toggle 0.064 us, Freq 15.721 MHz

Thanks - that's merged now.

Cheers!