MCU land #11: Linux toolchain for Cortex-M
The Sisyphean task of cross-compiling binaries for 32-bit ARM microcontrollers.
About a month ago, I covered the development process for a handheld game dubbed Bob the Cat and the Blocks of Doom. As a matter of practicality, I did most of the coding in Microchip Studio, a proprietary Windows-only IDE provided by the manufacturer of the MCU. That said, in the instructions released with the source code, I wanted to explain how to build the game on Linux using open-source tools.
I soon found out that while the tooling for popular 8-bit AVR and PIC microcontrollers is well-documented and easy to use, the situation for 32-bit ARM Cortex-M chips is rather bleak — so for the benefit of internet strangers searching for advice, I decided to jot down some hints. Most of it is device-agnostic, but there are several vendor-specific bits. In other words, if you’re using an MCU from a manufacturer other than Microchip, my write-up should still put you on the right track, but be prepared for a bit of extra homework.
Step 1: get the Cortex-M cross-compiler
To get the ball rolling, we need to procure a special compiler (and some companion tools) to generate code for ARM microcontrollers. This toolchain is distinct from what’s needed for “normal” ARM processors (the Cortex-A family).
Most Linux distributions have the relevant packages available for installation; look for gcc-arm-none-eabi and binutils-arm-none-eabi. Sometimes, the naming might be partly reversed: arm-none-eabi-gcc and arm-none-eabi-binutils. In the absence of suitable packages, it’s also possible to download a statically-linked toolchain directly from ARM (link).
This special compiler works much like normal gcc. Because of the multitude of Cortex-M chips with varying capabilities, you should always explicitly tell it about the target platform; for Bob the Cat, which is using the ATSAMS70J21B Cortex-M7 MCU, the appropriate compiler flag is “-mcpu=cortex-m7”.
So, are we done? Not a chance!
Step 2: download the CMSIS library
The ARM-provided CMSIS library offers a handful of basic abstractions for the Cortex-M core. This includes human-readable names and addresses for a number of important memory-mapped registers, along with simple macros and functions to spare you the effort of putting assembly instructions directly in your code. A representative example might be:
#define __WFI() __ASM volatile ("wfi":::"memory")
Although CMSIS is not strictly required, it simplifies your life, and is a dependency for some of the more useful components discussed later on. Alas, the library is not offered by most Linux distributions — but it can be downloaded straight from ARM by following this link.
The downloaded file will have a cryptic extension (.pack), but it’s simply a ZIP archive. It suffices to unpack it and then place the CMSIS/Core/Include/ subdirectory in the compiler’s search path (“-I<path>”).
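To give a feel for how these wrappers are used from C, here is a minimal sketch of an idle loop built around __WFI(). The event_pending flag, the wait_for_event() function, and the max_sleeps limit are hypothetical names invented for this example; the fallback macro definitions are likewise assumptions that let the snippet compile when the CMSIS headers are not on the include path.

```c
#include <stdint.h>

/* With CMSIS on the include path, __WFI() is already defined. The fallbacks
   below are assumptions that let this sketch build without CMSIS. */
#ifndef __WFI
  #if defined(__arm__)
    #define __WFI() __asm volatile ("wfi" ::: "memory")
  #else
    #define __WFI() ((void)0)   /* PC stand-in: do nothing */
  #endif
#endif

/* Hypothetical flag; on a real MCU, an interrupt handler would set it. */
volatile uint32_t event_pending;

/* Sleep until event_pending is set or max_sleeps is reached. The limit exists
   only so that the sketch terminates on a PC. Returns the number of sleeps. */
uint32_t wait_for_event(uint32_t max_sleeps) {
    uint32_t sleeps = 0;
    while (!event_pending && sleeps < max_sleeps) {
        __WFI();
        sleeps++;
    }
    event_pending = 0;
    return sleeps;
}
```

On real hardware, you would probably drop the limit and let the core sleep until the next interrupt arrives.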
But we’re still nowhere near having a functioning build environment!
Step 3: get the Device Family Pack (or equivalent)
ARM microcontrollers consist of an ARM-licensed core surrounded by a variety of on-die peripherals designed by the manufacturer of the MCU. These peripherals range from clock controllers, to DMA subsystems, to DACs and ADCs. In essence, they implement most of what isn’t pure calculation — and they play a vital role in all I/O.
As a general rule, all these peripherals are controlled through memory-mapped registers, but the names, locations, and semantics of the registers will vary from one chip family to another. Each vendor has their own collection of .h files defining all this for every product in their lineup. In the case of Microchip, the collections are known as Device Family Packs (DFPs); STMicroelectronics and NXP call them “Hardware Abstraction Layers” (HAL-LL and KSDK HAL, respectively). The Microchip DFP corresponding to your device can be downloaded here; as with CMSIS, the file has an oddball extension, but is a regular ZIP archive.
To use the DFP, it suffices to unpack it and then point the compiler to the appropriate include file location via the “-I<path>” option. For ATSAMS70J21B, the correct include subdirectory is sams70b/include/. The exact chip in use must also be identified through a preprocessor macro; the Microchip convention is to add “-D__ATSAMS70J21B__” to the command line.
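For readers who haven’t worked with memory-mapped registers before, the following sketch shows the general pattern that vendor headers tend to follow: a struct of volatile fields that mirrors the register layout, accessed through a pointer. The DemoPort type, its DIR and OUT registers, and the address in the comment are all made up for illustration; the real names and offsets come from the header for your exact chip.

```c
#include <stdint.h>

/* Hypothetical GPIO-style peripheral. Real register names and offsets come
   from the vendor header for your exact chip. */
typedef struct {
    volatile uint32_t DIR;   /* 1 = pin configured as an output */
    volatile uint32_t OUT;   /* output level of each pin */
} DemoPort;

/* A vendor header would bind the struct to a fixed address, for example
   #define DEMO_PORT ((DemoPort *)0x40001000u)   (made-up address).
   Taking a pointer argument lets the same code run against plain memory. */
void pin_make_output(DemoPort *port, unsigned pin) {
    port->DIR |= UINT32_C(1) << pin;
}

/* Drive a single pin high or low with a read-modify-write of OUT. */
void pin_write(DemoPort *port, unsigned pin, int level) {
    if (level)
        port->OUT |= UINT32_C(1) << pin;
    else
        port->OUT &= ~(UINT32_C(1) << pin);
}
```

The volatile qualifier is the important part: it tells the compiler that every read and write has a side effect and must not be optimized away or reordered relative to other volatile accesses.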
But wait — there’s more.
Step 4: find the bootstrap shim and the linker script
The vendor-supplied archive should also contain a device-specific bootstrap shim that supplies the default interrupt table and intercepts the reset interrupt that is generated on power-on. The reset handler does a bit of housekeeping before passing control to main(). In the case of ATSAMS70J21B, the bootstrap shim can be found in the DFP archive at sams70b/gcc/gcc/startup_sams70j21b.c.
The file needs to be incorporated into the build, but doesn’t require any special handling beyond that. It can be passed to gcc alongside any .c files that comprise the body of your project, and the ordering is irrelevant.
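The housekeeping done by a typical Cortex-M reset handler mostly boils down to two memory operations: copying the initial values of initialized globals from flash to SRAM, and zeroing the region that holds uninitialized globals. The sketch below is not the Microchip code; init_data() and zero_bss() are hypothetical helpers, and in a real startup file the start and end pointers would come from symbols defined in the linker script rather than from function arguments.

```c
#include <stdint.h>

/* Copy the initial values of initialized globals from their image in flash
   to their run-time location in RAM, one 32-bit word at a time. */
void init_data(const uint32_t *flash_image, uint32_t *ram_start,
               uint32_t *ram_end) {
    while (ram_start < ram_end)
        *ram_start++ = *flash_image++;
}

/* Clear the region holding globals that have no explicit initializer. */
void zero_bss(uint32_t *bss_start, uint32_t *bss_end) {
    while (bss_start < bss_end)
        *bss_start++ = 0;
}
```

Only after both steps are done is it safe to call main(), because C code assumes that globals already hold their initial values.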
The other important helper file in the vendor-supplied archive is the linker script; for the aforementioned SAM S70 chip, this script can be found at sams70b/gcc/gcc/sams70j21b_flash.ld. This code enforces the correct memory layout — notably for the interrupt vector table — and tells the linker where the quasi-read-only flash memory ends and where SRAM begins. The location of the script is passed via the “-T<path>” parameter.
Any more hoops to jump through? You bet!
Step 5: procure a not-so-standard C library
Bare-metal software can’t use your regular standard C library (libc). After all, without an operating system, there is no backend for library calls such as open(), fork(), or connect(). That said, there’s still a fair number of non-OS-dependent libcalls that are possible to implement on an MCU; examples include pow(), strlen(), or memcpy(). It follows that some sort of a non-standard “standard” library would be nice to have.
The de facto standard for Cortex-M development is a project called newlib, a minimalistic framework originally developed by Cygnus Solutions and now maintained by Red Hat. It is available on most Linux distros under the name of newlib-arm-none-eabi (or some variation thereof). The sources can also be downloaded directly from the project page, although building the library from scratch is outside the scope of this post.
If your distribution maintainers did a good job, there should be no extra steps required for the cross-compiler to find and use the library once you install the package; it is, however, important to append two additional command-line parameters to make sure that only the right portions of the library are pulled into your build. You do that by specifying “-ffunction-sections -Wl,--gc-sections”.
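As an illustration of the kind of libc functionality that works fine without an operating system, here is a small sketch that relies only on strlen() and memcpy(). The append_text() helper is a hypothetical example, not something provided by newlib.

```c
#include <stddef.h>
#include <string.h>

/* Append src to a fixed-size buffer that currently holds len characters.
   Returns the new length, or the old length if the text would not fit. */
size_t append_text(char *buf, size_t cap, size_t len, const char *src) {
    size_t n = strlen(src);
    if (len >= cap || n >= cap - len)
        return len;                 /* no room for the text plus terminator */
    memcpy(buf + len, src, n + 1);  /* n + 1 copies the trailing '\0' too */
    return len + n;
}
```

With “-ffunction-sections”, the compiler places each function in its own section, and “-Wl,--gc-sections” lets the linker discard the sections that nothing references, so helpers you never call shouldn’t take up flash space.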
Step 6: actually compile the thing
With the build environment cobbled together, the only essential #include directive for Microchip MCUs is <sam.h>. You should also list the usual standard library headers for the types and functions you want to use. Common necessities are <stdint.h>, <string.h>, and <math.h>.
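A bare-bones your_program.c might be laid out along the lines of the sketch below. The xor_checksum() function is a hypothetical placeholder for your own logic, and guarding the <sam.h> include and main() behind the device macro is my own assumption, made so that the portable parts of the file can also be compiled and tested on a PC.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ATSAMS70J21B__)
#include <sam.h>   /* device registers; only available in the MCU build */
#endif

/* Hypothetical application logic that needs nothing beyond <stdint.h>. */
uint8_t xor_checksum(const uint8_t *data, size_t len) {
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum ^= data[i];
    return sum;
}

#if defined(__ATSAMS70J21B__)
int main(void) {
    static const uint8_t message[] = { 0x0f, 0xf0, 0x01 };
    volatile uint8_t result = xor_checksum(message, sizeof message);
    (void)result;
    for (;;) { }   /* there is no OS to return to, so main() never exits */
}
#endif
```

The endless loop at the end of main() is the one habit that differs most from desktop programming.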
The basic compiler invocation for a simple program might look like this:
arm-none-eabi-gcc -D__ATSAMS70J21B__ -mlong-calls -mcpu=cortex-m7 -ffunction-sections -Wl,--gc-sections -T "<DFP_path>/sams70b/gcc/gcc/sams70j21b_flash.ld" -I "<CMSIS_path>/CMSIS/Core/Include/" -I "<DFP_path>/sams70b/include/" "<DFP_path>/sams70b/gcc/gcc/startup_sams70j21b.c" your_program.c -o your_program.elf
Other useful options might include “-mfloat-abi=hard -mfpu=fpv5-d16” if your code wants to make use of the on-die floating-point unit; or “-Wl,--defsym,STACK_SIZE=<num>” if you wish to override the meager default 1 kB stack size specified in the SAM S70 linker script. Most of the common gcc options are fair game too; you can add “-Wall” or “-O3” to taste. For math functions, specify “-lm”.
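If you want to confirm that the target-related flags actually took effect, one option is to inspect the macros that the compiler predefines. The sketch below is based on my understanding of gcc and of the ARM C Language Extensions: __ARM_ARCH_7EM__ should be defined for Cortex-M4 and Cortex-M7 targets, and __ARM_FP should be a bitmask in which 0x4 signals single-precision and 0x8 signals double-precision hardware support. The two function names are hypothetical.

```c
/* Describe the instruction set the compiler is targeting, based on
   predefined compiler macros. */
const char *target_description(void) {
#if defined(__ARM_ARCH_7EM__)
    return "ARMv7E-M (Cortex-M4 / Cortex-M7 class)";
#elif defined(__ARM_ARCH)
    return "some other ARM architecture";
#else
    return "not an ARM target";
#endif
}

/* 0 = no hardware floating point, 1 = single precision only,
   2 = double precision available. Bit meanings follow my reading of the
   ARM C Language Extensions and are worth double-checking. */
int hardware_fp_level(void) {
#if defined(__ARM_FP) && (__ARM_FP & 0x8)
    return 2;
#elif defined(__ARM_FP) && (__ARM_FP & 0x4)
    return 1;
#else
    return 0;
#endif
}
```

If hardware_fp_level() comes back as 0 in a build where you expected FPU support, the floating-point options probably didn’t make it onto the command line.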
If all goes well, you should end up with your_program.elf in the current directory. But we’re not quite done yet!
Step 7: program the device
The final hurdle is flashing the resulting binary onto the chip — and this task stumped me for a good while. Assuming you already have a USB-connected dongle that supports the CMSIS-DAP protocol, we still need to find a Linux application to control the dongle with.
(If you don’t have a CMSIS-DAP programmer yet, suitable options include Atmel-ICE, PICkit 5, and MPLAB SNAP, along with countless lower-cost clones.)
The best choice appears to be openocd, but the tool has fairly obscure syntax and relevant usage examples are few and far between. In the end, the command to send the firmware to any SAM S70 / E70 / V7x chip is fairly simple:
openocd -f scripts/interface/cmsis-dap.cfg -f scripts/target/atsamv.cfg -c "program your_program.elf; reset; exit"
You might change the paths depending on where the scripts/ subdirectory for openocd is installed on your system. For other chip families, the atsamv.cfg target would need to be substituted, too, keeping in mind that the openocd naming scheme is not always clear (case in point: they use “atsamv” for SAM S70).
The added complication for this series of chips is that by default, the device boots from ROM, not flash. To toggle the boot mode, it’s necessary to set the right bit in the chip’s non-volatile configuration register (GPNVM). This register is akin to fuses on 8-bit AVR MCUs, and the magic command is:
openocd -f scripts/interface/cmsis-dap.cfg -f scripts/target/atsamv.cfg -c "init; halt; atsamv gpnvm set 1; resume; exit"
Well, that’s it! The environment is painful to set up, but once you have it going, the build-and-program process is easy to automate.