jgarff / rpi_ws281x

Userspace Raspberry Pi PWM library for WS281X LEDs

Raspberry Pi 4 SPI DMA broken, produces zeros between bytes

Deadolus opened this issue

I use a Raspberry Pi 4 with SPI on GPIO10.
It seems that the DMA mode on (at least) the Raspberry Pi 4 is broken.
During my research I repeatedly landed on raspberrypi/linux#3570, so this might be related to that issue.

Faulty behaviour from the user's point of view:
LEDs don't show the correct color, and multiple LEDs cannot be written at the same time.
Writing a single LED with some color sometimes works.
Writing brightness 0xff never works; the LEDs show strange colors.
This issue possibly causes #482, #469 and similar issues.

Expected Behaviour:
LEDs show the proper values.
DMA mode works and produces correct output.

Analysis:
On the Raspberry Pi 4, the DMA mode inserts a delay one bit long after every byte, during which it pulls the data line low.
I found the error by writing 9 bytes with the value 0xff directly into the DMA buffer.
This should produce one continuous "high" line.
The actual output of the SPI driver is: [d0,d1,d2,d3,d4,d5,d6,d7,zero,d8, ....]

So the DMA mode on the Raspberry Pi 4 is broken, and the library as it stands does not work on the Raspberry Pi 4.
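
For anyone who wants to reason about this without a scope, here is a minimal sketch in C of the behaviour described above. It is only a model based on the observed output, not on the driver source: it assumes MSB-first shifting and one extra low bit time after every byte. The function name `simulate_faulty_spi` is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the reported line behaviour (an assumption drawn from the
 * observed output, not from the driver): after the 8 bits of every byte
 * the data line is held low for one extra bit time.
 *
 * Writes one entry (0 or 1) per bit time into `line` and returns the
 * number of bit times produced. `line` needs room for 9 * len entries.
 */
size_t simulate_faulty_spi(const uint8_t *buf, size_t len, uint8_t *line)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {   /* assumed MSB first */
            line[n++] = (uint8_t)((buf[i] >> bit) & 1u);
        }
        line[n++] = 0;                         /* the unwanted gap */
    }
    return n;
}
```

Feeding it nine 0xff bytes yields eight high bit times followed by one low, repeated nine times, rather than 72 consecutive highs.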

The problem lies within the SPI driver, which doesn't seem to respect the DMA mode and inserts zeros between bytes.
This may be configurable, but I have not found a solution in the driver's ioctl interface.
A similar issue seems to have been solved by setting DLEN directly in the Broadcom SPI registers:
https://forums.raspberrypi.com/viewtopic.php?t=181154
Possibly pigpio changes this flag and would work correctly.

Workaround:
We can work around this behaviour thanks to a lucky coincidence in non-inverted SPI mode:
The symbols we use are always either "110" (bit is set) or "100" (bit is not set).
Notice that both symbols end in zero.
The (faulty) output of the SPI driver is: [d0,d1,d2,d3,d4,d5,d6,d7,zero,d8, ....]
d0-d2 are one symbol, e.g. 110
d3-d5 are another symbol
d6-d8 are the third symbol
As d8 is always zero, we can let the zero inserted by the SPI driver stand in for it and skip writing d8 (every 9th bit of the symbol stream) into the DMA buffer.
With this change the output drives the LEDs correctly in this case.
This can be achieved with a simple patch:

diff --git a/ws2811.c b/ws2811.c
index 6482796..598217b 100644
--- a/ws2811.c
+++ b/ws2811.c
@@ -1200,6 +1200,9 @@ ws2811_return_t  ws2811_render(ws2811_t *ws2811)
                     {
                         uint32_t *wordptr = &((uint32_t *)pxl_raw)[wordpos];   // PWM & PCM
                         volatile uint8_t  *byteptr = &pxl_raw[bytepos];    // SPI
+                        if((bitpos == 7) && (l == 0)) {
+                          break;
+                        }
 
                         if (driver_mode == SPI)
                         {
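
To make the idea behind the patch easier to follow outside of ws2811_render, here is a standalone sketch in C. It is not the library's code: the function names and the one-entry-per-bit-time representation are assumptions chosen for clarity, and the symbol macros just mirror the "110" / "100" patterns mentioned above. It expands color bytes into symbols, then packs them into SPI bytes while leaving out every 9th bit time, on the assumption that the driver's gap fills that slot with a low.

```c
#include <stddef.h>
#include <stdint.h>

#define SYMBOL_HIGH 0x6u  /* "110": bit is set     (illustrative name) */
#define SYMBOL_LOW  0x4u  /* "100": bit is not set (illustrative name) */

/* Expand color bytes into the intended line signal, one entry per bit
 * time. `line` needs room for 24 * len entries. Returns the bit count. */
size_t expand_symbols(const uint8_t *color, size_t len, uint8_t *line)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            unsigned sym = ((color[i] >> bit) & 1u) ? SYMBOL_HIGH : SYMBOL_LOW;
            line[n++] = (uint8_t)((sym >> 2) & 1u);
            line[n++] = (uint8_t)((sym >> 1) & 1u);
            line[n++] = (uint8_t)(sym & 1u);
        }
    }
    return n;
}

/* Pack the line signal into SPI bytes, skipping every 9th bit time.
 * Each group of 9 bit times is exactly three symbols, so the skipped
 * bit is the last bit of a symbol and therefore always 0.
 * Returns the number of SPI bytes written (nbits / 9). */
size_t pack_skipping_ninth(const uint8_t *line, size_t nbits, uint8_t *spi)
{
    size_t nbytes = 0;

    for (size_t k = 0; k + 9 <= nbits; k += 9) {
        uint8_t b = 0;
        for (int j = 0; j < 8; j++) {
            b = (uint8_t)((b << 1) | line[k + j]);
        }
        /* line[k + 8] is deliberately not written: the inserted gap
         * is assumed to supply that low bit time instead. */
        spi[nbytes++] = b;
    }
    return nbytes;
}
```

With this packing, one LED (three color bytes, 72 bit times) takes 8 SPI bytes instead of 9. If the gap ever disappears, for example after a driver fix, the same buffer would produce a stream that is one bit short per byte, which is the platform-quirk concern raised in the reply below.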

I might have stomped on your logic here with the merge of #468 but I don't think it's impossible to achieve the same workaround.

This is the sort of thing I'd rather confirm as an upstream bug and get a fix for in the driver layer, than attempt to work around it.

Assuming only the Pi 4 has this bug, we'd have to incorporate some way of managing this quirk for just one platform, and since this library is used in multiple board-agnostic language bindings, it could not be a compile-time flag.

Does this behaviour still exist, or has it been fixed in the intervening almost-a-year?

I believe I see the same bug on a Raspberry Pi 3B+ Rev. 1.3. In my investigation with the oscilloscope, I noticed that whenever I write a 1 to the 3rd LS bit of any color, I get an erroneous LOW. Is it possible that the conversion table has some incorrect value that doesn't correspond to the color described?

Also, side note, can you please document what is inside this conversion table and how it is generated? Maybe take advantage of optimizations to compute it at compile-time so no mistakes are made there?

can you please document what is inside this conversion table

I asked for docs in #468 (comment), but I don't think they ever made it in.

Maybe take advantage of optimizations to compute it at compile-time so no mistakes are made there?

Happy to review a PR. This whole project is currently living and dying by contributions since I don't have the time for changes of this scope.

I honestly get that. I have no time either. I will try though. I think I understand the logic now. If I hit trouble I will follow up.

I think I understand the logic now

I'm still not totally sure I do, which is why I never added the comment 😬

I'm still not totally sure I do, which is why I never added the comment

If you are referring to the conversion table, it is constructed as follows. For each value 0..255, expand its bits using the WS2811's bit pattern for each 0 and 1. The result is three bytes long, so each row of the 3x256 conversion table holds one of those bytes, at the column given by the value. For example, the value 5, written in binary as 0b00000101, is expanded to 0b100.100.100.100.100.110.100.110 (the dots separate the original bits). Split into bytes, that is 0b10010010 (0x92), 0b01001001 (0x49) and 0b10100110 (0xA6), which are the values of conversion_table[0][5], conversion_table[1][5] and conversion_table[2][5].
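
The construction described above can be sketched in a few lines of C. This is a reconstruction from the description, not the library's actual generator: `build_conversion_table` and the symbol macro names are assumptions, and only the row for value 5 has been compared against the values quoted in this thread.

```c
#include <stdint.h>

#define SYMBOL_HIGH 0x6u  /* "110" for a 1 bit (illustrative name) */
#define SYMBOL_LOW  0x4u  /* "100" for a 0 bit (illustrative name) */

/* Fill a 3x256 table so that table[k][v] is the k-th byte (k = 0 is the
 * first byte sent) of the 24-bit symbol expansion of the value v. */
void build_conversion_table(uint8_t table[3][256])
{
    for (unsigned v = 0; v < 256; v++) {
        uint32_t expanded = 0;

        /* Walk the value MSB first, appending one 3-bit symbol per bit. */
        for (int bit = 7; bit >= 0; bit--) {
            unsigned sym = ((v >> bit) & 1u) ? SYMBOL_HIGH : SYMBOL_LOW;
            expanded = (expanded << 3) | sym;
        }

        table[0][v] = (uint8_t)(expanded >> 16);
        table[1][v] = (uint8_t)(expanded >> 8);
        table[2][v] = (uint8_t)expanded;
    }
}
```

For v = 5 this gives 0x92, 0x49 and 0xA6, matching the example. Generating the table this way at build time, or comparing a generated copy against the hand-written one in a test, would be one way to rule out a typo in the table.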

What doesn't make any sense to me though is that the table is 3x256. It would be much more intuitive (at least for me) if the table were transposed to 256x3, because the individual bytes of each value would then be grouped together in the code. I would like to see what the downsides are, if any.

By the way, is it possible (and OK with you) to establish a different channel of communication, so we don't pollute this thread and ping everyone registered? I also have some concerns on how the color is calculated. Thanks

I'm probably far from the best person to ask, since my involvement with the juicy internals is pretty minimal, but you can find me on Mastodon (https://fosstodon.org/@gadgetoid) or Discord (https://discord.gg/NRFKTtJPv) or just throw up another issue here.

So, I think I found what is going on, with my problem at least. Because the word size is 8 bits, when the 3rd LS bit is set, its symbol happens to fall exactly on the boundary between two words. This causes the line to be deasserted for half a bit between words. Modular arithmetic confirms this is the only case where this happens. Way to drive you crazy. I am working on a pull request to see if I can fix this by increasing the word size to 24 bits, but it will take some time because I lack an oscilloscope. I am also wondering if the fix will mess up the rest of the transmission on multi-LED strands.
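
The modular-arithmetic claim can be checked with a few lines of C. This is a minimal sketch, assuming the 3-bit symbols described earlier in the thread ("110" for a set bit, "100" for a clear one) sent MSB first, so that input bit i (counted from the MSB) occupies bit times 3i to 3i+2 of the 24-bit expansion. The function name is made up for illustration.

```c
#include <stdint.h>

/*
 * Returns a mask of the input bits (bit 0 = LSB) whose high pulse would be
 * split by a word boundary when the 24-bit expansion of one color byte is
 * sent in words of `word_bits` bits.
 *
 * A set bit is sent as "110": bit times 3i and 3i+1 are both high. The
 * pulse is split when the second high bit time is the first bit of a new
 * word, i.e. when (3i + 1) is a multiple of word_bits. A clear bit ("100")
 * has a single high bit time, so it cannot be split.
 */
uint8_t bits_straddling_word_boundary(unsigned word_bits)
{
    uint8_t mask = 0;

    for (unsigned i = 0; i < 8; i++) {          /* i counts from the MSB */
        unsigned second_high = 3u * i + 1u;
        if (second_high % word_bits == 0) {
            mask |= (uint8_t)(1u << (7u - i));  /* convert to LSB-based bit */
        }
    }
    return mask;
}
```

Under this model `bits_straddling_word_boundary(8)` returns 0x04, i.e. only the 3rd LS bit, and with 24-bit words it returns 0, since the last bit time of every color byte is the trailing zero of a symbol. It says nothing about whether the driver would still insert a gap between 24-bit words, or how the LEDs would tolerate one there.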