rp-rs / pio-rs

Support crate for Raspberry Pi's PIO architecture.

Configuring PIOBuilder

JonasFocke01 opened this issue

Hello!

I'm asking this question here because I don't know where else to ask. Please point me to the correct place if this is the wrong one!

I have a PIO asm script that should read DMX512 data from a data line. The script works with C, as it is copied directly from a C library (full credit to jostlowe). I want to do the same thing in Rust, but I'm currently failing to do so, and I suspect that my PIOBuilder is not correctly configured to handle the incoming data.
This is the PIO asm file I'm talking about:

; Author: Jostein Løwer, github: jostlowe
; SPDX-License-Identifier: BSD-3-Clause
;
; PIO program for reading the DMX lighting protocol.
; Compliant with ANSI E1.11-2008 (R2018)
; The program assumes a PIO clock frequency of exactly 500KHz

.program DmxInput
.define dmx_bit 4                     ; As DMX has a baudrate of 250.000kBaud, a single bit is 4us

break_reset:
    set x, 29                         ; Setup a counter to count the iterations on break_loop

break_loop:                           ; Break loop lasts for 8us. The entire break must be minimum 30*3us = 90us
    jmp pin break_reset               ; Go back to start if pin goes high during the break
    jmp x-- break_loop   [1]          ; Decrease the counter and go back to break loop if x>0 so that the break is not done
    wait 1 pin 0                      ; Stall until line goes high for the Mark-After-Break (MAB)

.wrap_target
    wait 0 pin 0                      ; Stall until start bit is asserted
    set x, 7             [dmx_bit]    ; Preload bit counter, then delay until halfway through

bitloop:
    in pins, 1                        ; Shift data bit into ISR
    jmp x-- bitloop      [dmx_bit-2]  ; Loop 8 times, each loop iteration is 4us
    wait 1 pin 0                      ; Wait for pin to go high for stop bits
    in null, 24                       ; Push 24 more bits into the ISR so that our one byte is at the position where the DMA expects it
    push                              ; Should probably do error checking on the stop bits some time in the future....

.wrap

The code to read all of that in Rust is quite simple but, I assume, wrong:

use rp_pico as bsp;

use bsp::hal::clocks::{ClockSource, SystemClock};

use bsp::hal::pio::{
    PIOBuilder, PIOExt, PinDir, Running, Rx, ShiftDirection, StateMachine, StateMachineIndex,
    UninitStateMachine, PIO,
};

use crate::PIO_CLOCK_FREQ;

/// The main struct representing the DMX input hardware
pub struct DmxInput<P: PIOExt, SM: StateMachineIndex> {
    pub sm: StateMachine<(P, SM), Running>,
    pub rx: Rx<(P, SM)>,
}

impl<P, SM> DmxInput<P, SM>
where
    P: PIOExt,
    SM: StateMachineIndex,
{
    /// Create a new DMX instance. Returns `None` if there is not enough room to install the DMX
    /// input program in the PIO
    pub fn new(
        pio: &mut PIO<P>,
        sm: UninitStateMachine<(P, SM)>,
        pin_id: u8,
        system_clock: &SystemClock,
    ) -> Option<Self> {
        let uninstalled_program = pio_proc::pio_file!("src/dmx_input.pio").program;
        let program = pio.install(&uninstalled_program).ok()?;
        let system_clock_freq = system_clock.get_freq().to_Hz();

        let (mut stopped_sm, rx, _tx) = PIOBuilder::from_program(program)
            .side_set_pin_base(pin_id)
            .in_pin_base(pin_id)
            .jmp_pin(pin_id)
            .clock_divisor_fixed_point((system_clock_freq / PIO_CLOCK_FREQ) as u16, 0)
            .push_threshold(8)
            .autopush(false)
            .in_shift_direction(ShiftDirection::Right)
            .build(sm);

        stopped_sm.set_pindirs([(pin_id, PinDir::Input)]);
        let sm = stopped_sm.start();

        Some(Self { sm, rx })
    }

    pub fn read_blocking(&mut self, buf: &mut [u8]) {
        let mut buf = buf.iter_mut();
        self.sm.restart();
        while self.rx.is_empty() {}
        if let Some(channels) = self.rx.read() {
            for channel in channels.to_be_bytes() {
                if let Some(buf_channel) = buf.next() {
                    *buf_channel = channel;
                }
            }
        }
    }
}

And this is where I call all of that (stripped down to the parts I assume you need):

    let (mut pio, _, _, sm3, _) = pac.PIO1.split(&mut pac.RESETS);

    let Some(mut dmx) = DmxInput::new(&mut pio, sm3, dmx_pin_id, &clocks.system_clock) else {
        panic!("not able to create 'DmxInput'");
    };

    loop {
        let mut buf: [u8; 512] = [0; 512];
        dmx.read_blocking(&mut buf);
        serial.write(&[buf[0] + 48]);

Can you spot anything I'm doing wrong?

Hi, thank you for reaching out.
I have a few questions/comments before being able to pinpoint what's going wrong:

  • What's the state of the pin? I would recommend adding an extra member to your struct to keep the pin in use.
    You can define it as In: AnyPin<Function = P::PinFunction, Pull = PullUp>.
  • You are missing a call to .buffers(Buffers::OnlyRx), but I don't think that would be causing a big issue.
  • What behaviour are you expecting and where does it stop matching your expectations?
    E.g.: do you get a panic? Does it run but do nothing? Do you get unexpected output?

Thank you for answering!

  • I made sure that the pin in my test case is now a pull-up input: let dmx_pin = pins.gpio0.into_pull_up_input();
  • I added the call to buffers(bsp::hal::pio::Buffers::OnlyRx)

The behavior is still this: I don't get a panic. When no signal is coming in on the DMX line, the function does what it should do: it blocks at while self.rx.is_empty() {}.
When there is an incoming data stream, it won't block, but the buffer stays at [0, 0, ...] when it really should reflect the incoming DMX512 data.

Regarding the pull-up input (let dmx_pin = pins.gpio0.into_pull_up_input();): the pin should also be changed to the FunctionPIO function. Otherwise, the PIO block cannot control it.

Now, from what I understand of the PIO program, the bitloop pushes each byte as a single word of the form 0x0000_00bb (the byte in the least significant 8 bits).
The function you define (fn read_blocking(&mut self, buf: &mut [u8])) seems to suggest that you expect to read multiple bytes.
Here is the code annotated with my understanding:

pub fn read_blocking(&mut self, buf: &mut [u8]) {
    let mut buf = buf.iter_mut();
    self.sm.restart();

    // waits for at least a byte to be pushed in the rx fifo
    while self.rx.is_empty() {}

    // reads 1 word (u32) from the fifo
    if let Some(channels) = self.rx.read() {

        // converts that word into a big-endian byte array.
        // So this loop will iterate over [0, 0, 0, 0xbb].
        for channel in channels.to_be_bytes() {

            // This fills the first 4 bytes of `buf` with the content of the array returned by `to_be_bytes()`,
            // so `buf[0]` is always 0. Then the function returns.
            if let Some(buf_channel) = buf.next() {
                *buf_channel = channel;
            }
        }
    }
}
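
To make the word layout described above concrete, here is a small host-side sketch that replays the `bitloop` section in plain Rust. The helper names `isr_shift_in_right` and `receive_slot` are made up for illustration and are not part of any HAL; the sketch assumes the ISR is configured with `ShiftDirection::Right`, as in the builder call from the question, and that the byte arrives LSB first as on a UART line.

```rust
/// Models one `in` instruction with the ISR shifting right:
/// the ISR moves right by `count` bits and the new bits enter at the top.
fn isr_shift_in_right(isr: u32, data: u32, count: u32) -> u32 {
    assert!((1..=32).contains(&count));
    if count == 32 {
        return data;
    }
    let mask = (1u32 << count) - 1;
    (isr >> count) | ((data & mask) << (32 - count))
}

/// Replays `bitloop` for one DMX slot. `byte` is sampled LSB first.
fn receive_slot(byte: u8) -> u32 {
    let mut isr = 0u32;
    for bit in 0..8 {
        let level = u32::from((byte >> bit) & 1);
        isr = isr_shift_in_right(isr, level, 1); // in pins, 1
    }
    // After 8 bits the byte sits in bits 31..24; 24 zero bits move it to bits 7..0.
    isr_shift_in_right(isr, 0, 24) // in null, 24
}
```

With this model, `receive_slot(0xA5)` yields `0x0000_00A5`, and `to_be_bytes()` on that word gives `[0, 0, 0, 0xA5]`. That is consistent with the symptom in the question: `buf[0]` receives the most significant byte of the word, which is always zero, while `word & 0xFF` recovers the received byte.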

If your intent is to replicate this feature, then it should probably be something like:

// `channels`' length must be 1 + the number of channels.
pub fn read_blocking(&mut self, channels: &mut [u8]) {
    // We need to deconstruct self because the state machine needs to be stopped.
    // sm could be made an Option<_> which would allow to extract it from Self, change its state/type,
    // and restore it and move it back to self at the end of the function.
    let Self { sm, rx } = *self;

    // stop state machine
    let mut stopped = sm.stop();

    // TODO: !! We don't have a way to restart/reset the state machine à la `pio_sm_restart(_pio, _sm);` :o
    // There is a function for that but it is currently private :S

    // go back to the program's entry point.
    stopped.exec_instruction(Instruction {
        operands: InstructionOperands::JMP {
            condition: JmpCondition::Always,
            address: //TODO:  the address at which the program is loaded.
                     // It can be extracted from the installed program with `installed.offset()`.
        },
        delay: 0,
        side_set: None,
    });

    // clear fifos
    stopped.clear_fifos();

    let sm = stopped.start();
    
    // for each expected channel
    for channel in channels.iter_mut() {
        // block until a byte is received & fill the channel with it.
        *channel = loop {
            if let Some(word) = rx.read() {
               break (word & 0xFF) as u8;
            }
        };
    }

    *self = Self { sm, rx };
}
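
The `Option<_>` idea mentioned in the comments of the snippet above can be sketched on its own, without the HAL. Everything here is a stand-in invented for illustration: `Machine`, `Stopped`, `Running`, and `Driver` only mimic the shape of a type-state handle and are not rp2040-hal types. The point is that `Option::take` lets a `&mut self` method move the handle out, change its type, and put it back.

```rust
use core::marker::PhantomData;

struct Stopped;
struct Running;

/// Stand-in for a state machine handle; the type parameter tracks its state.
struct Machine<State> {
    restarts: u32,
    _state: PhantomData<State>,
}

impl Machine<Running> {
    fn stop(self) -> Machine<Stopped> {
        Machine { restarts: self.restarts, _state: PhantomData }
    }
}

impl Machine<Stopped> {
    fn start(self) -> Machine<Running> {
        Machine { restarts: self.restarts + 1, _state: PhantomData }
    }
}

struct Driver {
    // `Option` so the handle can be moved out temporarily through `&mut self`.
    sm: Option<Machine<Running>>,
}

impl Driver {
    fn new() -> Self {
        Self { sm: Some(Machine { restarts: 0, _state: PhantomData }) }
    }

    fn restart(&mut self) {
        let running = self.sm.take().expect("state machine is always put back");
        let stopped = running.stop();
        // In the real driver, the jump to the entry point and the FIFO clear would go here.
        self.sm = Some(stopped.start());
    }

    fn restarts(&self) -> u32 {
        self.sm.as_ref().map_or(0, |m| m.restarts)
    }
}
```

Compared with taking `self` by value and returning a new `Self`, this keeps the caller's binding intact, at the cost of an `expect` that can only fail if a code path forgets to restore the handle.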

I hope this helps.

Edit: I'm not sure this even compiles. I just wrote it here on GitHub.

Thank you for the suggestions!

  • The function of the pin is now FunctionPIO1 (that's what I'm using).
  • You are correct, that's the feature I have in mind!
    I tried your code snippet. It seems like the Pico is reading 'something' now. It reacts to the input, and even blocks when no input is given.

But what is read is not what is coming in. If a 128 comes in as the first byte, it continuously reads 4; if the input is 200, it reads 0; and so on for all other bytes. If I change the input to something else and back to 200, it reads 0 again, so it's at least consistent there. This leads me to think that my clock is misconfigured. It should be correct with the system clock at 12_000_000 and the divider 500_000. Is that correct?

The second problem now is that there is a lot of 'jitter'. When reading one byte continuously, there is always a 0 or 255 or something similar popping up when there really shouldn't be. E.g. it looks like this, where 4 is the 'correct' value: 4, 4, 4, 4, 255, 4, 4, 4, 255, 255, 4, 4, 255, 4, 4.... Can you imagine why? The cables are correct. I always check them with an Arduino, and it reads correctly!

This is the read function now (it even compiles :D):

    pub fn read_blocking_gh(self, channels: &mut [u8]) -> Self {
        let sm = self.sm.unwrap();
        let mut rx = self.rx;

        let mut stopped = sm.stop();

        // go back to the program's entry point.
        stopped.exec_instruction(Instruction {
            operands: InstructionOperands::JMP {
                condition: JmpCondition::Always,
                address: self.sm_program_offset,
            },
            delay: 0,
            side_set: None,
        });

        // clear fifos
        stopped.clear_fifos();

        let sm = stopped.start();

        // for each expected channel
        for channel in channels.iter_mut() {
            // block until a byte is received & fill the channel with it.
            *channel = loop {
                if let Some(word) = rx.read() {
                    break (word & 0xFF) as u8;
                }
            };
        }

        Self {
            sm: Some(sm),
            rx,
            sm_program_offset: self.sm_program_offset,
        }
    }

With regards to the clock speed, your initial snippet shows:

; The program assumes a PIO clock frequency of exactly 500KHz

but the C version seems to use a 1 MHz PIO clock (see here and there).
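
As a rough illustration of why the PIO clock matters here, the sketch below computes the integer and fractional parts for a call such as `clock_divisor_fixed_point(int, frac)` and the resulting duration of one `bitloop` iteration. The function names are hypothetical, and the 125 MHz system clock in the usage note is only an assumed example value. One iteration costs 4 cycles: 1 for `in pins, 1` plus 1 + (dmx_bit - 2) = 3 for the `jmp` with its delay.

```rust
/// Splits `sys_hz / pio_hz` into a 16-bit integer part and an 8-bit
/// fractional part (in 1/256 steps). Returns `None` if `pio_hz` is zero
/// or the ratio is below 1 or does not fit into 16 bits.
fn pio_clock_divisor(sys_hz: u32, pio_hz: u32) -> Option<(u16, u8)> {
    if pio_hz == 0 {
        return None;
    }
    // Scale by 256 first so the fractional part survives integer division.
    let scaled = (u64::from(sys_hz) * 256) / u64::from(pio_hz);
    let int = scaled >> 8;
    if int == 0 || int > u64::from(u16::MAX) {
        return None;
    }
    Some((int as u16, (scaled & 0xFF) as u8))
}

/// Duration of one `bitloop` iteration in nanoseconds.
fn bit_period_ns(pio_hz: u32, cycles_per_bit: u32) -> u64 {
    u64::from(cycles_per_bit) * 1_000_000_000 / u64::from(pio_hz)
}
```

Under these assumptions, `pio_clock_divisor(125_000_000, 1_000_000)` gives `(125, 0)`, and `bit_period_ns(1_000_000, 4)` is 4000 ns, matching the 4 µs DMX bit time. At 500 kHz the same loop takes 8000 ns per bit, so the state machine samples every second bit period and the received bytes come out wrong.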

For the other issue, I don't know, sorry.

YOO IT WORKS! 🎉 🎉 🎉

Thank you so much for pointing that out! How could I not have seen that? I guess I was too overwhelmed by all of it.
That was actually a huge win for me. Thank you a thousand times!


DISCLAIMER: If this is getting to be too much, please just close this issue; you have helped me a lot already!
I have one little thing I could ask that would improve it.
Reading this blocking is actually okay, but when the input source shuts down, this blocks indefinitely, and it also gets pretty slow (upwards of 60 ms) for larger buffers. What buzzwords should I google, or what would you do about this?

On MCUs there are a few common (but basic) ways to deal with that.

Here they are in order of (subjective) complexity:

  • using nb and checking on it regularly (you have an 8-word/8-byte FIFO)
    It's close to what you are doing right now, except that instead of blocking, you only fill the buffer given as the argument with as much as is currently available in the FIFO and then return.
  • using an IRQ and having the IRQ signal when a packet or the buffer is complete.
    This is one level up. It does not give the highest performance, as each byte is still handled manually by the core of the MCU, but the "when" of the read is controlled by the HW.
  • using DMA (like the C version)
    This involves sharing a piece of memory with the HW. The compiler is unable to take that into account, and it's all on you to make sure everything remains memory safe.
  • using an async/await API
    This gives IMHO the best UX in the end (it can be implemented on top of any of the previous methods) but requires extra considerations that the others don't.
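
The first option in the list (polling without blocking) can be sketched roughly as follows. The `WordSource` trait, `FrameReader`, and `MockFifo` are hypothetical names introduced for this example; `WordSource::read` is only assumed to behave like the `rx.read()` call used earlier in the thread, returning `None` when the FIFO is empty.

```rust
use std::collections::VecDeque;

/// Minimal stand-in for the RX FIFO: `read` returns `None` when it is empty.
trait WordSource {
    fn read(&mut self) -> Option<u32>;
}

/// Collects DMX slots across several calls instead of blocking.
struct FrameReader {
    filled: usize,
}

impl FrameReader {
    fn new() -> Self {
        Self { filled: 0 }
    }

    /// Drains whatever is currently available into `frame`.
    /// Returns `true` once `frame` is completely filled.
    fn poll<S: WordSource>(&mut self, rx: &mut S, frame: &mut [u8]) -> bool {
        while self.filled < frame.len() {
            match rx.read() {
                Some(word) => {
                    frame[self.filled] = (word & 0xFF) as u8;
                    self.filled += 1;
                }
                None => return false, // nothing more right now, try again later
            }
        }
        true
    }

    /// Starts over, e.g. after a complete frame or after giving up on a stalled one.
    fn reset(&mut self) {
        self.filled = 0;
    }
}

/// Host-side test double that plays the role of the FIFO.
struct MockFifo(VecDeque<u32>);

impl WordSource for MockFifo {
    fn read(&mut self) -> Option<u32> {
        self.0.pop_front()
    }
}
```

The caller would invoke `poll` from its main loop and could count consecutive `false` results, or compare against a timer, to detect that the sender went away and call `reset`. Since the FIFO is only a few words deep, `poll` has to run often enough that it never overflows between calls.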

Thank you very much, I will consider all of them! ❤️