ole / swift-rp-pico-bare

Embedded Swift on the Raspberry Pi Pico without the Pico C/C++ SDK


CPU only runs with ~6.5 MHz because we aren't initializing the clocks

ole opened this issue

I have only added the absolute minimum amount of post-boot init code that’s required to talk to the GPIOs so far. As a consequence, the CPU runs only at ~6.5 MHz, which is the frequency the RP2040 gets from its on-chip ring oscillator by default on boot.

To ramp up the clock speed to 125 MHz, we need to switch the system clock to the external crystal oscillator (XOSC) and PLLs.
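To make the 125 MHz target concrete, here is a minimal host-side sketch of the PLL output-frequency formula from the RP2040 datasheet, output = (reference / REFDIV) × FBDIV / (POSTDIV1 × POSTDIV2). The `PLLConfig` type and its property names are hypothetical and not part of this repository. The divider values are just one combination that yields 125 MHz from the Pico's 12 MHz crystal, not necessarily what the fix will use.

```swift
// Hypothetical helper type; only the formula itself comes from the RP2040 datasheet.
struct PLLConfig {
    var refHz: UInt32     // reference clock, assumed to be the Pico's 12 MHz crystal
    var refdiv: UInt32    // reference divider
    var fbdiv: UInt32     // feedback divider (multiplies the reference up to the VCO frequency)
    var postdiv1: UInt32  // first post divider
    var postdiv2: UInt32  // second post divider

    // VCO frequency before the post dividers are applied.
    var vcoHz: UInt32 { (refHz / refdiv) * fbdiv }

    // Final PLL output frequency.
    var outputHz: UInt32 { vcoHz / (postdiv1 * postdiv2) }
}

// Assumed example values: 12 MHz * 125 = 1500 MHz VCO, then divided by 6 * 2.
let sysPLL = PLLConfig(refHz: 12_000_000, refdiv: 1, fbdiv: 125, postdiv1: 6, postdiv2: 2)
print(sysPLL.vcoHz)     // 1500000000
print(sysPLL.outputHz)  // 125000000
```

On the device, these values would still have to be written to the PLL registers after the XOSC has been started and has stabilized; the sketch only checks the arithmetic.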

You can sort of observe the slow clock speed when you flash the Pico without disconnecting it from power, e.g. via picoprobe/the RPi Debug Probe:

  1. Flash a Hello World/Blinky program to the Pico that correctly performs the on-boot initialization sequence. For example, one that comes with the Pico C SDK.
  2. Now, without disconnecting the Pico, flash our executable to the Pico.

Result: the LED will blink very fast because the correct program from step 1 performed the required steps to ramp up the CPU frequency, and this config survives a soft reset (not sure if it survives all types of resets, but it works this way with whatever type of reset probe_rs performs by default).

  3. Now disconnect the Pico from power and reconnect it. You should see the LED blink approximately 20× slower because our program doesn't perform the correct on-boot sequence, so the CPU only runs at ~6.5 MHz.
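The "approximately 20×" figure follows from the ratio of the two clock frequencies, assuming the blink delay is a busy loop that scales with the CPU clock. A quick sketch of the arithmetic (the constant names are made up for illustration):

```swift
// Ratio between the ramped-up system clock and the nominal boot-time ROSC clock.
let rampedUpMHz = 125.0
let bootROSCMHz = 6.5  // nominal value; the real ROSC frequency varies from chip to chip
let slowdown = rampedUpMHz / bootROSCMHz
print(slowdown)  // roughly 19.2, i.e. close to 20x
```

Because the ROSC frequency is only nominal, the ratio observed on a particular board can differ noticeably from this value.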

I'm working on a fix.

Relevant quotes from the RP2040 datasheet:

2.7. Boot Sequence

The Power-On State Machine (Section 2.13) is started. To summarise the sequence:

  • The Ring Oscillator (Section 2.17) is started, providing a clock source to the clock generators. clk_sys and clk_ref are now running at a relatively low frequency (typically 6.5MHz).

And:

2.17. Ring Oscillator (ROSC)

2.17.1. Overview

The Ring Oscillator (ROSC) is an on-chip oscillator built from a ring of inverters. It requires no external components and is started automatically during RP2040 power up. It provides the clock to the cores during boot. The frequency of the ROSC is programmable and it can directly provide a high speed clock to the cores, but the frequency varies with Process, Voltage and Temperature (PVT) so it cannot provide clocks for components which require an accurate frequency such as the RTC, USB and ADC. Methods for mitigating the frequency variation are discussed in Section 2.15 but these are only relevant to very low power design. For most applications requiring accurate clock frequencies it is recommended to switch to the XOSC and PLLs. During boot the ROSC runs at a nominal 6.5MHz and is guaranteed to be in the range 1.8MHz to 12MHz.

Once the chip has booted the programmer can choose to continue running from the ROSC and increase its frequency or start the Crystal Oscillator (XOSC) and PLLs. The ROSC can be disabled after the system clocks have been switched to the XOSC. Each oscillator has advantages and the programmer can switch between them to achieve the best solution for the application.

Turns out we can observe the different clock speeds on the oscilloscope. Let's write a tight infinite loop that toggles a GPIO pin as fast as possible:

while true {
    gpioSet(pin: led, high: true)
    gpioSet(pin: led, high: false)
}

Each call to gpioSet() is compiled into a single CPU instruction.
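For readers wondering how a pin change can be a single instruction: the RP2040's SIO block has separate write-only "set" and "clear" registers, so no read-modify-write is needed. The following is a hypothetical host-side sketch, not the repository's actual gpioSet(). The `SIO` enum and `gpioWrite` are invented names, pin 25 is assumed to be the Pico's on-board LED, and the register offsets are the ones given in the SIO chapter of the RP2040 datasheet.

```swift
// Hypothetical namespace for the SIO registers used here.
enum SIO {
    static let base: UInt = 0xd000_0000
    static let gpioOutSet: UInt = base + 0x014  // writing 1 bits drives those pins high
    static let gpioOutClr: UInt = base + 0x018  // writing 1 bits drives those pins low
}

// Returns the register address and the value that would be stored for a pin change.
func gpioWrite(pin: UInt32, high: Bool) -> (address: UInt, value: UInt32) {
    let mask: UInt32 = 1 << pin
    return (high ? SIO.gpioOutSet : SIO.gpioOutClr, mask)
}

let ledPin: UInt32 = 25  // assumption: on-board LED of the Pico
let on = gpioWrite(pin: ledPin, high: true)
let off = gpioWrite(pin: ledPin, high: false)
// On the device, each result would become one volatile 32-bit store of
// `value` to `address`; this sketch only computes the operands.
print(String(on.address, radix: 16), String(off.address, radix: 16))
```

With the pin and level known at compile time, the address and mask can sit in registers, which is consistent with each call compiling down to a single store.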

Hooking up the oscilloscope to the GPIO pin, we see this when running with the default system clock after boot, i.e. without ramping up the clock speed:

[Oscilloscope screenshot, 2024-02-23: tight GPIO on/off loop without clocks_init]

Note the scale (100 ns per division). Each switch from high to low or vice versa represents one instruction, so each instruction takes ~200 ns, which corresponds to ~5 MHz. This roughly fits the ~6.5 MHz spec.

Now the same code with the system clock properly initialized:

[Oscilloscope screenshot, 2024-02-23: tight GPIO on/off loop with correct clocks_init]

Note the different scale (10 ns per division). My oscilloscope isn't fast enough to see the rectangle signal, but we can discern that it takes approximately 10 ns per instruction = ~100 MHz, qed.
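The conversions for both measurements are just period-to-frequency arithmetic, assuming (as above) that each pin toggle is one instruction taking one clock cycle. A small sketch, with `frequencyMHz` being a name invented for this example:

```swift
// Converts a measured time per instruction (in nanoseconds) to a frequency in MHz.
func frequencyMHz(periodNs: Double) -> Double {
    // 1 / (period in ns) is a frequency in GHz, so multiply by 1000 for MHz.
    return 1_000.0 / periodNs
}

print(frequencyMHz(periodNs: 200))  // 5.0   (default ROSC clock after boot)
print(frequencyMHz(periodNs: 10))   // 100.0 (clocks initialized)
```

Both results are only as precise as the readings taken off the oscilloscope screen, so they are in the right ballpark for ~6.5 MHz and 125 MHz rather than exact matches.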

Interesting tidbit: the ripple on the left side of the graph is the jump back to the start of the loop, which takes longer because branches take 2 cycles on ARMv6-M (I think). Naively, we would expect one such ripple after each high/low cycle because that's how we wrote the code. But it turns out the compiler is unrolling our infinite loop, putting 4 high/low cycles into a single loop iteration. Nice!
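To illustrate what the unrolling means at the source level, here is a hypothetical host-side model. `FakePin` is a stand-in invented for this sketch that records pin states instead of touching hardware; it is not how the compiler output actually looks, only a picture of its shape.

```swift
// Hypothetical stand-in for the GPIO pin: records writes and jumps.
struct FakePin {
    var states: [Bool] = []
    var jumps = 0

    mutating func set(high: Bool) { states.append(high) }

    // One iteration of the loop as the compiler appears to emit it:
    // four high/low pairs, then a single jump back to the top.
    mutating func unrolledIteration() {
        set(high: true); set(high: false)
        set(high: true); set(high: false)
        set(high: true); set(high: false)
        set(high: true); set(high: false)
        jumps += 1
    }
}

var pin = FakePin()
pin.unrolledIteration()
pin.unrolledIteration()
print(pin.states.count, pin.jumps)  // 16 2
```

In this model there is one jump per eight pin writes instead of one per two, which matches seeing the ripple only once per four high/low cycles on the oscilloscope.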