blackmagic-debug / blackmagic

In application debugger for ARM Cortex microcontrollers.

ARM gcc 13.2.rel1 compatibility

jelson opened this issue · comments

ARM released their latest gcc-based toolchain, rev 13.2.1, about a week ago, on 2023-10-30. Compiling BMP firmware with it seems to cause a flash overflow as seen below, at HEAD of main as of today (d0c8bb7). The same commit compiles without issue using the prior ARM gcc release, 12.2.1.

$ arm-none-eabi-gcc --version
arm-none-eabi-gcc (Arm GNU Toolchain 13.2.rel1 (Build arm-13.7)) 13.2.1 20231009
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ make
  BUILD   lib/stm32/f1
  BUILD   lib/stm32/f4
  BUILD   lib/lm4f
make[2]: Nothing to be done for 'all'.
make[2]: Nothing to be done for 'all'.
make[2]: Nothing to be done for 'all'.
  LD      blackmagic.elf
/usr/local/arm-gnu-toolchain-13.2.Rel1-x86_64-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/13.2.1/../../../../arm-none-eabi/bin/ld: blackmagic.elf section `.data' will not fit in region `rom'
/usr/local/arm-gnu-toolchain-13.2.Rel1-x86_64-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/13.2.1/../../../../arm-none-eabi/bin/ld: region `rom' overflowed by 248 bytes
Memory region         Used Size  Region Size  %age Used
             rom:      131320 B       128 KB    100.19%
             ram:        3904 B        20 KB     19.06%
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:152: blackmagic.elf] Error 1
make: *** [Makefile:17: all] Error 2
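
The linker's figures can be checked by hand: a 128 KB region is 131072 bytes, so 131320 bytes used leaves 248 bytes over. Below is a minimal Python sketch of that arithmetic. The function name is illustrative and not part of any BMP tooling.

```python
def rom_overflow(used_bytes: int, region_kib: int) -> tuple:
    """Return (bytes over the region, percent used) for a linker memory region."""
    region_bytes = region_kib * 1024
    # Clamp at zero so a region that still fits reports no overflow.
    overflow = max(0, used_bytes - region_bytes)
    percent_used = round(100.0 * used_bytes / region_bytes, 2)
    return overflow, percent_used


# Values taken from the GCC 13.2.Rel1 link output above.
print(rom_overflow(131320, 128))
```

The same function applied to the `ram` row (3904 bytes of 20 KB) reports no overflow, which matches the linker complaining only about `rom`.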

Yep, this is a known issue and there's not much we can do about it right now. Part of the v2.0 roadmap includes a complete rewrite of the build system which is what we're currently preparing to start in on. This will allow you to better tune the build for your needs and the available Flash. Best we can offer is to use the 12.2.Rel1 toolchain still and sit tight for the replacement build systems to land.

You can modify the build by editing src/Makefile and removing targets from the SRC list (by deleting the line, comments do not work how you'd want them to in this case) if you have to use the GCC 13 toolchain. For example, if you're not using the EFM32 support, delete the line with efm32.c on it to claw back the space it consumes.
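
As a rough sketch of the edit described above, the snippet below deletes a source file's line from a backslash-continued list. The Makefile fragment is a made-up stand-in, not the real contents of src/Makefile, and `drop_source` is a hypothetical helper. Deleting the line matters because, in GNU make, a `#` inside a continued variable definition comments out everything after it on the joined logical line, which is presumably why commenting a single entry out does not work here.

```python
def drop_source(makefile_text: str, source_name: str) -> str:
    """Delete every line of a Makefile that names the given source file."""
    kept = [
        line
        for line in makefile_text.splitlines(keepends=True)
        # Compare whole whitespace-separated tokens so "efm32.c" does not
        # accidentally match a longer name that merely contains it.
        if source_name not in line.split()
    ]
    return "".join(kept)


# Illustrative fragment only; the real SRC list is much longer.
example = (
    "SRC = \\\n"
    "\tcortexm.c \\\n"
    "\tefm32.c \\\n"
    "\tstm32f1.c \\\n"
    "\tmain.c\n"
)
print(drop_source(example, "efm32.c"))
```

This sketch does not handle removing the final entry of the list: the line before it would then end in a dangling backslash and need fixing by hand.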

Thanks; I'm glad it's already known. I'm not blocked (I used gcc 12) but wanted to flag it just in case you weren't already aware. Feel free to close this if it's redundant.

(I am curious, though: why did this compiler increase the size of the binary so significantly? Usually it goes the other direction.)

As best as we've been able to tell, GCC 13 changes how some optimisations are applied, which results in slightly larger generated code. As we're already very close to filling the Flash, that pushes the build over the edge, as you've seen.

GCC 12 improved the codegen over previous compiler generations while also hitting a sweet spot in how the available optimisations affect code size. Given this, it might be interesting to see what happens when GCC 13 is given -Oz instead of -Os.

Interesting; I'll have to check how gcc13 binary size impacts some of my other projects where I'm flash-constrained.

A little off-topic but I'm curious what the end-game is here: if the BMP is really out of space, is the plan to ship devices that don't support all targets, and ask customers to customize their firmware? Or will future BMPs be sold with a part that has larger flash?

The overview of the idea is that there will be several standard configurations of firmware - eg, for people wanting to debug STM32s vs LPCs - and they can then use bmputil to easily control which firmware their probe is running (with that tool being significantly expanded to make downloading variants and uploading them to the probe easier).

In addition, if they want to be able to debug anything without switching firmware, they can pull a minimal firmware base that implements the debug protocols and remote protocol but no target support, and then use BMDA to run the show, with BMDA always being a complete build of the entire project's target and architecture support.

I think there's a little more to this issue than meets the eye.

Build host: macOS

Compile options: make PROBE_HOST=native ENABLE_RTT=1 "RTT_IDENT=BlackMagicProbe"

v1.9.2 with arm-gnu-toolchain-12.3.rel1 compiles fine.

However, v1.10.1 fails with "out of space" for both arm-gnu-toolchain-12.3.rel1 and arm-gnu-toolchain-13.2.Rel1

Example errors:

  LD      blackmagic.elf
/Users/andrew/Developer/Toolchains/arm-gnu-toolchain-13.2.Rel1-darwin-x86_64-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/13.2.1/../../../../arm-none-eabi/bin/ld: address 0x8021848 of blackmagic.elf section `.text' is not within region `rom'
/Users/andrew/Developer/Toolchains/arm-gnu-toolchain-13.2.Rel1-darwin-x86_64-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/13.2.1/../../../../arm-none-eabi/bin/ld: blackmagic.elf section `.ARM.exidx' will not fit in region `rom'
/Users/andrew/Developer/Toolchains/arm-gnu-toolchain-13.2.Rel1-darwin-x86_64-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/13.2.1/../../../../arm-none-eabi/bin/ld: address 0x8021848 of blackmagic.elf section `.text' is not within region `rom'
/Users/andrew/Developer/Toolchains/arm-gnu-toolchain-13.2.Rel1-darwin-x86_64-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/13.2.1/../../../../arm-none-eabi/bin/ld: region `rom' overflowed by 6796 bytes
Memory region         Used Size  Region Size  %age Used
             rom:      137868 B       128 KB    105.18%
             ram:        5704 B        20 KB     27.85%
collect2: error: ld returned 1 exit status
make[1]: *** [blackmagic.elf] Error 1
make: *** [all] Error 2
(base) [andrew@Babuji blackmagic ((v1.10.1))]$ 

and

  LD      blackmagic.elf
/Users/andrew/Developer/Toolchains/arm-gnu-toolchain-12.3.rel1-darwin-x86_64-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/12.3.1/../../../../arm-none-eabi/bin/ld: address 0x8021a0c of blackmagic.elf section `.text' is not within region `rom'
/Users/andrew/Developer/Toolchains/arm-gnu-toolchain-12.3.rel1-darwin-x86_64-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/12.3.1/../../../../arm-none-eabi/bin/ld: blackmagic.elf section `.ARM.exidx' will not fit in region `rom'
/Users/andrew/Developer/Toolchains/arm-gnu-toolchain-12.3.rel1-darwin-x86_64-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/12.3.1/../../../../arm-none-eabi/bin/ld: address 0x8021a0c of blackmagic.elf section `.text' is not within region `rom'
/Users/andrew/Developer/Toolchains/arm-gnu-toolchain-12.3.rel1-darwin-x86_64-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/12.3.1/../../../../arm-none-eabi/bin/ld: region `rom' overflowed by 7248 bytes
Memory region         Used Size  Region Size  %age Used
             rom:      138320 B       128 KB    105.53%
             ram:        5704 B        20 KB     27.85%
collect2: error: ld returned 1 exit status
make[1]: *** [blackmagic.elf] Error 1
make: *** [all] Error 2
(base) [andrew@Babuji blackmagic ((v1.10.1))]$ 

The main difference between these builds and "stock" is ENABLE_RTT=1, I think...

This is why we don't enable RTT by default: we don't have the space for it with stock options right now. If you rebuild both with RTT off, you should get the same result as the OP for GCC 13 from what we know, and a completely working build from GCC 12.

When doing these comparisons, assume that: RTT is off, debug is off, and the only options on the make line are: make PROBE_HOST=native. Likewise, you can assume that with RTT builds, to make them comparable, the identifier is stock too.
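
To compare builds like for like, it can help to pull the numbers out of the linker's memory usage table rather than eyeballing them. The sketch below assumes the table format shown in the logs in this thread. `parse_memory_usage` is a hypothetical helper, and the two sample tables are the RTT-enabled v1.10.1 results posted above, so they are not a stock comparison.

```python
import re

# Size units as printed in the linker's memory usage table.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}
# Matches rows such as "   rom:   137868 B   128 KB   105.18%".
_ROW = re.compile(r"^\s*(\S+):\s+(\d+)\s+(B|KB|MB|GB)\s+(\d+)\s+(B|KB|MB|GB)\b")


def parse_memory_usage(ld_output: str) -> dict:
    """Map each region name to (used bytes, region size in bytes)."""
    regions = {}
    for line in ld_output.splitlines():
        match = _ROW.match(line)
        if match:
            name, used, used_unit, size, size_unit = match.groups()
            regions[name] = (
                int(used) * _UNITS[used_unit],
                int(size) * _UNITS[size_unit],
            )
    return regions


GCC13_RTT = """\
Memory region         Used Size  Region Size  %age Used
             rom:      137868 B       128 KB    105.18%
             ram:        5704 B        20 KB     27.85%
"""
GCC12_RTT = """\
Memory region         Used Size  Region Size  %age Used
             rom:      138320 B       128 KB    105.53%
             ram:        5704 B        20 KB     27.85%
"""

gcc13_rom = parse_memory_usage(GCC13_RTT)["rom"][0]
gcc12_rom = parse_memory_usage(GCC12_RTT)["rom"][0]
print(f"GCC 12.3 rom minus GCC 13.2 rom: {gcc12_rom - gcc13_rom} bytes")
```

For these two RTT-enabled builds, the GCC 12.3 image is 452 bytes larger than the GCC 13.2 one, which is why a comparison is only meaningful when both sides use the same make options.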

Oh oh, I see... thank you for the explanation, @dragonmux!

That's really too bad, because RTT is ... well, kinda' magical when it comes to realtime debugging where you can't stop the system, and things like UARTs and SWO are too slow.

It's our hope that with v2.0 with the new build systems, we will be able to enable RTT by default because of this.

However, one note - RTT is not quite the magical nirvana you might think it is in one important way: It cannot be used for non-stop read on: any Cortex-A or -R parts, RISC-V parts using the memory access abstract command, RISC-V parts using the program buffer, and some Cortex-M parts with APs that will quite literally crash debug when attempting access with the core running.

As long as the part you want to use RTT with does not fall into one of those problem groups, then yes, it is very useful. Do mind that even on Cortex-M parts where it does work, RTT steals bus cycles, so it may cause issues in some hard real-time cases. If that happens, SWO or full parallel trace may be your only option, and that is the only option for Cortex-A/R parts needing full hard real-time debug.

It's our hope that with v2.0 with the new build systems, we will be able to enable RTT by default because of this.

That's awesome news, thank you!

RTT is not quite the magical nirvana you might think it is

Of course, you're completely correct. I have had the (mis)fortune of working on a lot of projects where the only debugging facilities were character-by-character over a slow UART. 🤢 Compared to that, RTT is indeed magical!

(And I've been lucky enough to only really need RTT on Nordic M4F and STM H7 parts, both of which behave rather nicely with the AHB-AP!)

For others coming up on this thread, Segger has an application note regarding the Cortex-A/R issue mentioned above.

RISC-V parts

Ooo! Thanks for that information, @dragonmux! RISC-V is one of those things that I "really want to learn about someday", but... I'm rather heavily invested in the ARM ecosystem at the moment. One day, one day...