Home
This course is intended to be a starting point for embedded systems. If you are just starting, the first chapter will walk you through your environment setup and provide a supply list with links to an online supplier. You are reading a book hosted on GitHub Pages; the source code and examples are hosted here.
A few common questions:
Is this for me?
- This course is for people who already have programming experience and want an overview of embedded development.
- This course is not for people trying to learn Rust specifically; it is an embedded course that just so happens to use Rust.
Are we going to be covering X feature?
- This course is supposed to be as simple as possible, so likely not. However, I am more than willing to explore anything that has sufficient interest.
How should I edit this code?
- Use whatever editor you are comfortable with. If you do not have a preference, VSCode is a great place to start.
- Ensure that whatever editor you choose can utilize the Rust Analyzer LSP.
How should I follow along with the exercises?
- I suggest that you avoid pulling the repository and just using the code as is. Walking through the chapters should give you enough information to program most of the functionality yourself.
- It is, however, totally fine to pull the repository to take advantage of the configuration that has been done for different IDEs and the probe.
- The completed code along with all the required configuration is in the repo if you get stuck on anything in the course.
Why Rust?
- The Rust compiler's strict nature makes whole classes of errors difficult or impossible to come across. This particularly benefits students and empowers professionals alike.
- The dependency ecosystem of Rust, while immature compared to C, is amazingly easy to use compared to some of the development environments that hardware manufacturers put together. I wanted to be able to step up the ladder of abstraction in a way that not many environments allow for.
- I like Rust, and I am writing this, so I will use what I want.
Environment setup
Supply List
I have split this list into two categories: the essential items are the bare minimum to get anything running using this repository and the nonessentials are nice-to-haves that give you the leeway to continue tinkering on your own after this course is complete.
Essential Supplies
- Raspberry Pi Pico development board with headers - Mouser # 358-SC0917
- The board with included headers is worth the added expense because it has a socket that directly plugs into the debug probe without any soldering required.
- Raspberry Pi Debug Probe - Mouser # 358-SC0889
- There are ways to get code to flash to a Raspberry Pi Pico without a debug probe but having one in your toolbox is great.
Nonessential Supplies
- Breadboard power supply - Mouser # 474-PRT-21297
- Barrel jack adapter for power - Mouser # 474-TOL-15313
- Breadboards - Mouser # 589-TW-E40-1020
- Note that breadboard quality varies widely. These will likely be fine, but if you want the best of the best, look here. They are the supplier for 3M breadboards, which are widely accepted as being the best around but are very expensive.
- Breadboard jumper wire kit - Mouser # 474-PRT-00124
Development Environment
OS
- Most modern Linux distros will be fine for this. I used a freshly downloaded Debian 12 VM.
- WSL has a hard time interfacing with hardware devices; a VM is better for this project.
- macOS generally mirrors Linux, using the Homebrew package manager.
- Windows is untested; attempt at your own peril.
Rust
- If you don't have Rust installed already, you will need to install rustup, the Rust toolchain manager.
- Follow the install directions here
- Once complete, you should be able to use the cargo and rustup commands
Probe-rs
- Probe-rs is a collection of tools to aid with embedded development in Rust. It includes software to interface with debug probes and integrations with Rust's build system, Cargo.
- Install probe-rs following the website's directions:
- Main site
- Installation page
- 🚨 If you are on Linux ensure you add a udev rule to allow non-root users to access debug probes, directions here
IDE
- See the official Rust tooling page
- For people new to Rust or if you have no real preference, I would recommend VSCode or Neovim with the rust-analyzer LSP.
Setup verification
Raspberry Pi Pico
Before we move on to trying to program a board, we should always check that the manufacturer examples work. This will cut down on the scope of troubleshooting needed when something breaks.
- Follow the guide here
- This guide will walk you through loading a blinking light firmware on to your board through its native USB interface.
- The debug probe is not required for this step.
Debug Probe Flashing
Now that we know the Pi Pico is running, it is time to push some example code through the debug probe interface.
- Plug your debug probe into the Pi Pico
- Note that this connection does not supply power to the Pico; either plug it in via USB or use an external power supply.
- Regardless of the method you choose to power the board, the light should still be blinking from the last section.
- Pull this repository to your development machine
- Install some build tools:
rustup target add thumbv6m-none-eabi
cargo install flip-link
sudo apt install gcc
- 🚨 Ensure you have configured the udev rules mentioned here
- Run cargo run
- You should see a successful code flash and the print statements from the debug interface printed to your console
Booting a Microcontroller
Creating the Binary
In the README, we learned about flashing code onto the microcontroller using a debug probe. That is great, but how do we know that our code is going to run on the selected processor? When a program is compiled as an application for a PC, it will most likely use a 64-bit architecture (either x64 or ARM). Normally the architecture is selected for the programmer and is not something that must be considered; tools generally assume that you are building for the system that you are currently using. However, when programming a microcontroller, the architecture will not be the same. Using a significantly simplified architecture comes with unique challenges but also benefits in power efficiency. This chapter will outline finding the specific instruction set for a microcontroller, configuring a development environment to compile to it, and the consequences of a resource-limited system.
The no_std and no_main Environment
The Rust standard library, or just std, contains many useful tools for everyday development. However, it is difficult to implement all those tools on limited systems. Generally, the starting point for embedded Rust is the no_std environment. By default, Rust links the standard library and brings its most common items into scope, meaning you can just use Vec::new() instead of having to fully specify std::vec::Vec::new() or add a use statement. This behavior is disabled by using the #![no_std] attribute in the main.rs file.
Note that you can still use much of the standard library's functionality in the no_std environment through the core and alloc crates. However, you will have to explicitly include the desired code and manually configure certain behavior. For example, using types like String or Vec requires that heap allocation is configured using a global allocator. In other words, no_std code has no heap allocation as a starting point.
In addition to having no standard library, there is also no provision for a main function. Instead, we need to properly place our entry function in memory so that the CPU begins execution at the beginning of our program. Moreover, the processor needs certain things to be done before we can start executing general-purpose code, such as setting up external flash memory and initializing registers. By default, Rust configures a main function for the programmer for whatever target is selected; however, this is not the case for embedded systems. This behavior is disabled by using the #![no_main] attribute in the main.rs file.
If you are curious about what a no_main Rust file looks like in a familiar environment, take a look at this example of a Rust program written without a main function for a UNIX system.
The first two lines of our minimal program will be just the two attributes discussed above:
src/main.rs
#![no_std]
#![no_main]
Cross Compilation
In Rust, we specify a cross-compilation target via a "target triple." The target triple is a string that defines what the output machine code should look like for whatever processor we select. The Rust platform support page has an index of the available targets with their associated level of support.
Target Selection
So, how do we select a target? The target triple is made up of a few fields defined here. In our case, we need to know what processor we are working with. This is the first of many times we are going to look in the datasheet. Open this and bookmark it because we are going to be using it constantly:
https://datasheets.raspberrypi.com/rp2040/rp2040-datasheet.pdf
On page 10, the processor is listed as Dual ARM Cortex-M0+ @ 133MHz. The ARM website defines this processor as using the Armv6-M architecture with the Thumb or Thumb-2 subset instruction set. With that information, we can look into the platform support page to find our target. The thumbv6m-none-eabi target is the correct instruction set and architecture, and it lists our processor as a supported processor.
Compiling to a Target
Now that we have our target, thumbv6m-none-eabi, let's actually use it. While you could use the command line arguments for all of this, it is much more convenient to use a config file:
.cargo/config.toml
[build]
target = "thumbv6m-none-eabi"
Just compiling the program to the correct instruction set is not enough, however. We need to properly place the program data in memory. We do this through the linker.
Understanding the Hardware
Before we move on to linking our program, we need to understand what we are working with on our dev board. Note that the RP2040 is not the board you have in front of you, but instead it is just the small chip at the center of it. There are other supporting components that facilitate it doing its job, and one of those is external flash memory. The following datasheet is for the Pi Pico itself, not the RP2040 microcontroller:
https://datasheets.raspberrypi.com/pico/pico-datasheet.pdf
This datasheet will be useful for things such as power supply specifications, seeing what external supporting components are used, and understanding how our microcontroller pins are broken out to pins that we can actually use on our dev board.
The Boot Sequence
Finding section 2.8.1 of the RP2040 datasheet gives us a solid idea of what the boot sequence is going to look like. We see that the controller is going to pull 256B out of flash memory first before it enters the "flash second stage" and executes the code that was just retrieved. The first 256B are known as the second-stage boot loader, and its job is to ensure that the processor is set up to read from the external flash memory. This is necessary because there are many different options for memory chips that a designer could choose, each with slightly different ways to access their contents.
However, you may have noticed a slight logical inconsistency: the microcontroller is reading from flash memory to get the instructions it needs to read from flash memory. Section 2.8.1.2 outlines the commands that are sent via SPI to the flash memory; it is up to the designer to select a memory chip that will respond favorably to this sequence of commands. The commands selected here are generalized and are consequently less efficient than what is possible with SPI flash memory chips, so the second-stage boot loader can reconfigure the memory chip to run in its fastest configuration.
A second-stage bootloader for the Pi Pico's memory chip, the W25Q080, is openly available. The crate rp2040-boot2 provides that bootloader in a convenient wrapping. If you'd like to read what goes into those 256B, it is all here.
To use the rp2040-boot2 crate, we need to include some code in our main.rs and Cargo.toml files:
main.rs
#![no_std]
#![no_main]

// Place the 256-byte second-stage bootloader in its own linker section.
#[link_section = ".boot_loader"]
#[used]
pub static BOOT_LOADER: [u8; 256] = rp2040_boot2::BOOT_LOADER_W25Q080;
Cargo.toml
[package]
edition = "2021"
name = "A-minimal-flash"
version = "0.1.0"
[dependencies]
rp2040-boot2 = "0.3"
Linking
By providing the cortex-m-rt crate with a memory.x file and a few attributes in our program, it will produce a linker script that will properly place our program data for us.
Memory Layout
The memory.x file needed by cortex-m-rt is a description of how to lay out the address space of our program. The rp2040-boot2 crate that is needed to supply the second-stage bootloader also provides an example memory.x file:
memory.x from rp2040-boot2 docs
MEMORY
{
/* To suit Raspberry Pi RP2040 SoC */
BOOT_LOADER : ORIGIN = 0x10000000, LENGTH = 0x100
FLASH : ORIGIN = 0x10000100, LENGTH = 2048K - 0x100
RAM : ORIGIN = 0x20000000, LENGTH = 264K
}
SECTIONS {
/* ### Boot loader */
.boot_loader ORIGIN(BOOT_LOADER) :
{
KEEP(*(.boot_loader*));
} > BOOT_LOADER
} INSERT BEFORE .text;
If you were wondering where these numbers came from, see section 2.2.1 in the datasheet. Making decisions about how to lay out memory is a complex topic that is out of the scope of this course.
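As a quick sanity check on the numbers in memory.x, the regions can be modeled on a development machine. This is only a sketch: the `Region` type and its `end` method are hypothetical names introduced for illustration, and the origins and lengths are copied from the linker script above (where `K` means 1024 bytes).

```rust
/// One entry from the MEMORY block of `memory.x` (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Region {
    pub origin: u32,
    pub length: u32,
}

impl Region {
    /// First address past the end of the region.
    pub const fn end(&self) -> u32 {
        self.origin + self.length
    }
}

// Values copied from the memory.x shown above.
pub const BOOT_LOADER: Region = Region { origin: 0x1000_0000, length: 0x100 };
pub const FLASH: Region = Region { origin: 0x1000_0100, length: 2048 * 1024 - 0x100 };
pub const RAM: Region = Region { origin: 0x2000_0000, length: 264 * 1024 };
```

The bootloader region ends exactly where FLASH begins, and together the two cover the full 2048K of external flash starting at 0x1000_0000.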
Entry Point
Setting the entry point of your program in an embedded environment tells the linker that the function's code needs to be placed at whatever address will be in the PC (program counter) register when the initialization code is done. The PC register is a register internal to the CPU that holds the address of the instruction it is currently executing. As instructions are executed, the PC register is incremented or, in the case of some control flow, directly modified.
With the cortex-m-rt crate, the entry point is easy to set. The cortex-m crate can also be used to access assembly instructions for an empty busy loop.
main.rs
...
// The section name must match the one kept by memory.x (.boot_loader).
#[link_section = ".boot_loader"]
#[used]
pub static BOOT_LOADER: [u8; 256] = rp2040_boot2::BOOT_LOADER_W25Q080;

#[entry]
fn main() -> ! {
    loop {
        nop();
    }
}
Note: the ! type in Rust represents the never type, indicating that a value of that type will never be produced because the function does not return.
Cargo.toml
...
[dependencies]
cortex-m = "0.7"
cortex-m-rt = "0.7"
rp2040-boot2 = "0.3"
Including Linker Scripts
In the Cargo config file, we need to add some configurations to ensure the linker script from cortex-m-rt is included:
.cargo/config.toml
[build]
target = "thumbv6m-none-eabi"
rustflags = [
"-C", "link-arg=-Tlink.x",
"-C", "link-arg=--nmagic",
]
The script link.x is generated at compile time and changes as we update the code. nmagic disables page alignment; see this for more detail or this for more Rust-specific details.
Panic Handling
The last thing we need to do is solve this error:
error: `#[panic_handler]` function required, but not found
Panicking is Rust's way of crashing a program in a controlled way. Generally, this involves unwinding the call stack to give the user a backtrace when debugging. But that doesn't mean anything on an embedded system where there is no standard output, much less a place to put logs. So what should be done in the case of a panic? In production, it may be best to send a signal to an external debug probe, trigger a processor reboot, or print some kind of message over a UART port.
However, for now, the crate panic-halt is a great option. It implements the bare minimum needed to compile with a valid panic handler. Its source code is essentially just an infinite loop with a little extra.
Including this panic handler looks like this in practice:
main.rs
...
use cortex_m::asm::nop;
use cortex_m_rt::entry;
use panic_halt as _; // links the panic handler; nothing is called directly

#[link_section = ".boot_loader"]
#[used]
pub static BOOT_LOADER: [u8; 256] = rp2040_boot2::BOOT_LOADER_W25Q080;
...
Cargo.toml
...
[dependencies]
...
panic-halt = "0.2"
Flashing Our Code
Most of the heavy lifting for this has already been done in the first chapter. If you haven't completed chapter one, ensure you go back and follow those instructions to make sure all of your hardware and software are communicating appropriately.
Using probe-rs
First, let's try to use probe-rs by itself by running these commands:
cargo build
probe-rs run --chip RP2040 --protocol swd target/thumbv6m-none-eabi/debug/A-minimal-flash
If no error messages popped up, it probably worked, though it is not easy to tell because this code doesn't do anything yet.
Using a Cargo Runner
Typing out that command each time is cumbersome. It would be better if we could just use cargo run like normal. This is possible by changing what cargo uses as the runner.
.cargo/config.toml
[target.'cfg(all(target_arch = "arm", target_os = "none"))']
runner = "probe-rs run --chip RP2040 --protocol swd"
[build]
target = "thumbv6m-none-eabi"
This configuration specifies that any target that uses an ARM-based architecture with no OS will use this runner command. While it is out of the scope for this overview, it is possible to define multiple targets, each with their own configuration. For example, you might want to run a QEMU instance with your code for your given architecture for emulated testing.
Now it should be possible to use this command to build and flash the code:
cargo run
Build Profiles
Profiles are how Rust allows users to save certain compiler configurations for different stages of development. For example, the two default profiles are debug and release. When running cargo run with no other options, it automatically follows the debug profile, which optimizes less, does not strip symbols, and includes all runtime safety checks. If you would like to use the release profile, you can use cargo run --release. In release mode, compilation takes longer because more optimizations are applied, and debug assertions and integer overflow checks are disabled by default. Configuration of these profiles is in the Cargo.toml file.
Final Code for a Minimal Flash
With all of that done, you can take a look at the A-minimal-flash directory in this repository. This directory contains all of the work done in this section along with some other tweaks that make life easier, such as a rust-analyzer.json file that disables some LSP errors.
Read through the new files and feel free to ask any questions you may have about them.
Interacting With Peripherals
Hardware Selection
Modern microcontrollers are selected based on a long list of factors, with each factor being weighted differently depending on the engineer and project. Some of the major considerations include:
Available Peripherals
A peripheral is a unique piece of hardware internal to the microcontroller that can affect the developer experience. Common examples include:
- UART - Universal Asynchronous Receiver / Transmitter
- SPI/QSPI - [Quad] Serial Peripheral Interface
- I2C - Inter-Integrated Circuit
- CAN Bus - Controller Area Network bus
- USB - Universal Serial Bus
- Clocks/timers
There are countless different peripherals available for microcontrollers, all with different niche uses. An engineer in the automotive space will likely prioritize the CAN Bus and USB to interface with a driver's smartphone.
Processor Capability
What the processor on the microcontroller is able to do and what operating modes are available. Certain applications may require a multi-core microcontroller; others may demand a microcontroller that has deep sleep and/or low power modes. This category can also include general performance requirements. Some applications require significant data throughput from one peripheral to another, post-processing, or low latency.
Price
Often not a huge concern to the hobbyist, but price can make or break a project's feasibility. Consider Apple's sales figures as an example. A change in price of one component by $0.01 at 90,000,000 units a year equates to $900,000 off their bottom line, and $0.05 equates to a $4,500,000 loss. This level of price optimization has to be performed for every distinct component in the product along with the manufacturing methods.
Development Ecosystem/Developer Experience
How easy is it to program the microcontroller? This includes factors such as library availability, documentation, and IDE support. Another major consideration is the developer's level of experience working with a specific platform or vendor, as learning a new architecture can be time-consuming.
Working With GPIO
For this example project, our goal is to turn a built-in LED on and off. To accomplish this, we need to properly configure the GPIO and SIO modules of the Pi Pico. Using the board datasheet:
https://datasheets.raspberrypi.com/pico/pico-datasheet.pdf
We can determine that the LED is wired into GP25. Using the controller datasheet section 1.4.3. GPIO Functions, we can see what that pin is capable of. Pins can have multiple functions that the user can select; this selection process is referred to as multiplexing or muxing, after the digital component by the same name.
GPIO | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 |
---|---|---|---|---|---|---|---|---|---|
25 | SPI1 CSn | UART1 RX | I2C0 SCL | PWM4 B | SIO | PIO0 | PIO1 | CLOCK GPOUT3 | USB VBUS DET |
Of these different functions, a few could do what we want:
- SIO
- PIO0 & PIO1
- CLOCK GPOUT3
The table below the function table lists the descriptions of each of the different peripherals available. Using the SIO module is going to be the easiest way to accomplish our LED blinking goal, even if it does not leverage the full ability of the processor.
Selecting SIO
Of the functions listed in the table above, we need to select the SIO module to drive GP25. SIO is in column F5, meaning we need to write a 5 to the function select field of the control register for GP25. To find the function select field, we need to look in section 2.19.6.1. IO - User Bank. This section specifies that the IO_BANK0 registers begin at the base address 0x4001_4000, referred to as IO_BANK0_BASE. The GPIO25_CTRL register is at offset 0x0cc from the base address. Representing that in code would be:
#[entry]
fn main() -> ! {
    // GPIO control
    const IO_BANK0_BASE: u32 = 0x4001_4000;
    const GPIO25_CTRL: *mut u32 = (0x0000_00CC + IO_BANK0_BASE) as *mut u32;
    ...
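The 0x0cc offset follows a regular pattern in the IO_BANK0 register list: each pin has a STATUS register followed by a CTRL register, 8 bytes per pin. A minimal sketch of that arithmetic is below; `gpio_ctrl_offset` and `gpio_ctrl_addr` are hypothetical helper names, not part of any crate.

```rust
/// Base address of the IO_BANK0 register block (RP2040 datasheet, section 2.19.6.1).
const IO_BANK0_BASE: u32 = 0x4001_4000;

/// Offset of GPIOn_CTRL from IO_BANK0_BASE.
/// Each pin owns 8 bytes: a 4-byte STATUS register, then a 4-byte CTRL register.
const fn gpio_ctrl_offset(pin: u32) -> u32 {
    pin * 8 + 4
}

/// Absolute address of GPIOn_CTRL (hypothetical helper for illustration).
const fn gpio_ctrl_addr(pin: u32) -> u32 {
    IO_BANK0_BASE + gpio_ctrl_offset(pin)
}
```

For pin 25 this gives the same 0x0cc offset used in the chapter's code, so the hard-coded constant can be cross-checked without opening the datasheet table.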
Configuring this register will require reading the current value, modifying it, then writing it again. The volatile read and write functions are unsafe operations in Rust as they rely on the programmer to ensure the addresses are correct, the type specified matches the data at that address, and that the value once read is handled properly and not destroyed before its final use. The code to modify the function field of the GPIO25_CTRL register is:
use core::ptr::{read_volatile, write_volatile};

// Setting the GPIO 25 control register to be driven by the SIO module
unsafe {
    let mut gpio25_ctrl: u32 = read_volatile(GPIO25_CTRL);
    gpio25_ctrl &= 0xCFFC_CCE0; // Clearing non-reserved bits
    gpio25_ctrl |= 0x0000_0005; // Setting function to F5 -> SIO
    write_volatile(GPIO25_CTRL, gpio25_ctrl);
}
Note the use of the bitwise assignment operators &= and |= to clear the non-reserved fields to their default, non-inverted states and then set the function select to 5, as described in the earlier table. The common terminology here comes from digital logic: to set means to change a value to 1, and to clear means to change a value to 0.
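The clear-then-set step can be tried on a development machine by separating the bit manipulation from the volatile access. The sketch below uses the same mask and function number as the code above; `select_sio` and the constant names are hypothetical and only illustrate the technique.

```rust
/// Reserved bits of GPIOx_CTRL are 1 in this mask, so `&` leaves them untouched
/// and clears every writable field (same constant as the chapter's code).
const CTRL_RESERVED_MASK: u32 = 0xCFFC_CCE0;

/// Function select value for SIO (column F5 in the function table).
const FUNCSEL_SIO: u32 = 5;

/// Pure version of the read-modify-write step: takes the value that was read
/// from the register and returns the value to write back.
fn select_sio(ctrl: u32) -> u32 {
    (ctrl & CTRL_RESERVED_MASK) | FUNCSEL_SIO
}
```

Whatever the register held before, the low five function-select bits end up as 5 and the override fields end up cleared, while bits marked reserved pass through unchanged.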
Configuring SIO
The base address of the SIO registers is defined in section 2.3.1.7. List of Registers as 0xd000_0000, referred to as SIO_BASE. The layout of these registers is different from the GPIOXX_CTRL register modified earlier. Each register is a 32-bit value with each bit associated with a particular pin. To enable output on the selected GP25 pin, we need the GPIO_OE register, which has an offset of 0x020. Because we are concerned with GP25, we need bit 25 in the register to be set. The following code enables output on the selected pin:
// SIO control
const SIO_BASE: u32 = 0xD000_0000;
const GPIO_OE: *mut u32 = (0x0000_0020 + SIO_BASE) as *mut u32;

// Enabling output on GPIO 25
unsafe {
    let mut gpio_oe = read_volatile(GPIO_OE);
    gpio_oe |= 0x1 << 25;
    write_volatile(GPIO_OE, gpio_oe);
}
Driving The Pin
Finally, we need to drive the output on GP25, and we do that with the OUT registers. In this case, we will use the XOR variant because we are always toggling the state. This will save us from having to manage the state ourselves in software. The GPIO_OUT_XOR register has an offset of 0x01c and we will modify it with this code:
const GPIO_OUT_XOR: *mut u32 = (0x0000_001C + SIO_BASE) as *mut u32;

loop {
    // Toggle output level of GPIO 25
    unsafe {
        write_volatile(GPIO_OUT_XOR, 0x1 << 25);
    }
}
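To see why the XOR variant removes the need to track state, here is a small software model that can be run on a development machine. It is only an illustration: `GpioOut`, `write_xor`, and `pin_mask` are hypothetical names, and the struct stands in for the hardware's output register rather than touching any real address.

```rust
/// Bit mask for a single GPIO pin, as used by the SIO registers.
const fn pin_mask(pin: u32) -> u32 {
    1 << pin
}

/// Software stand-in for the GPIO_OUT register (hypothetical model).
struct GpioOut {
    value: u32,
}

impl GpioOut {
    /// Models a write to GPIO_OUT_XOR: every set bit in `mask`
    /// flips the matching output bit, all other bits are unchanged.
    fn write_xor(&mut self, mask: u32) {
        self.value ^= mask;
    }

    /// Whether the given pin is currently driven high.
    fn is_high(&self, pin: u32) -> bool {
        self.value & pin_mask(pin) != 0
    }
}
```

Writing the same mask twice returns the pin to its original level, which is exactly what the loop above relies on.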
Delay
If we upload this code as it is now, the LED will turn on but will appear to be lit to about half brightness. That is because the microcontroller is executing the instructions in that loop as fast as possible with no delay. To us, it appears the LED is just less bright, but it is actually flickering based on the clock speed of our microcontroller.
To avoid this, let's create a delay function. We can utilize the cortex_m::asm::nop() function, which will invoke the nop assembly instruction, wasting a clock cycle. But we need a way to determine how many of those assembly instructions we need, and we also want to utilize a loop instead of baking 12 million delay instructions into the binary. Appendix B of the board datasheet shows that the external clock included on the Pi Pico is 12MHz, meaning that for a delay of roughly 1 second we need to waste about 12 million clock cycles. Here is an example of a simple delay function:
#[inline(always)]
fn delay_s(s: u32) {
    const EXTERNAL_XTAL_BASE_FREQ: u32 = 12_000_000;
    let cycles = s * EXTERNAL_XTAL_BASE_FREQ;
    for _ in 0..cycles {
        cortex_m::asm::nop();
    }
}
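One detail worth checking is the multiplication: the cycle count is a u32, so long delays overflow (a panic in a debug build, a silent wrap in release). The sketch below isolates that arithmetic so it can be tested on a development machine; `delay_cycles` is a hypothetical helper, not part of the course code.

```rust
/// Frequency of the Pi Pico's external crystal, as used in `delay_s`.
const EXTERNAL_XTAL_BASE_FREQ: u32 = 12_000_000;

/// Number of cycles to burn for `s` seconds, or `None` when the product
/// does not fit in a `u32` (any delay above 357 seconds at 12 MHz).
fn delay_cycles(s: u32) -> Option<u32> {
    s.checked_mul(EXTERNAL_XTAL_BASE_FREQ)
}
```

For the one-second blink in this chapter the plain multiplication is fine; the checked version only matters if the same function is reused for delays of several minutes.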
Conclusions
The code for the full blinking example can be found in the B-minimal-blinky directory. This chapter explored accessing memory-mapped registers to use peripherals. This direct approach is time-consuming and error-prone, both in the programming itself and in the research it requires, which means poring over datasheets in search of addresses. The programmer must also be absolutely certain that all of the invariants are met for different hardware configurations. This is trivial at first, but as your program grows in scope, it quickly gets out of hand. Because of this, developers often use BSPs or Board Support Packages. These pieces of software wrap around the direct memory access in a way that prevents improper usage (to a degree) and allows for a more readable program. BSPs also have the added benefit of detaching the functionality you programmed into your project from the hardware it is running on, allowing for more portability. Later chapters will cover the BSP available for the Pi Pico and the levels of abstraction it is built upon.
The next chapter will be a brief look into RTT and GDB as tools that we can leverage to make embedded development easier.
An Aside For RTT And GDB
RTT - Real Time Transfer
The output you have been seeing on the console as you are running the previous examples is read from the microcontroller using RTT. RTT is a protocol developed by Segger to aid in debugging microcontrollers without the need to use up a peripheral such as UART. RTT works by allocating a circular buffer in RAM that is constantly being read by the debug probe. In order to write a string into RTT, all the microcontroller has to do is copy the log string to that region of RAM, which is orders of magnitude faster than passing it through a UART peripheral.
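The circular buffer idea can be sketched with a toy model that runs on a development machine. This is not Segger's actual control block layout or the defmt-rtt implementation: `RingBuffer` and its methods are hypothetical, the host side uses `Vec` for convenience, and messages that do not fit are simply dropped (an assumption made here to keep the example short).

```rust
/// Toy model of an RTT-style buffer: the target writes, the probe reads.
pub struct RingBuffer<const N: usize> {
    data: [u8; N],
    write: usize, // next index the target writes to
    read: usize,  // next index the host reads from
}

impl<const N: usize> RingBuffer<N> {
    pub const fn new() -> Self {
        Self { data: [0; N], write: 0, read: 0 }
    }

    /// Bytes that can still be written. One slot always stays empty
    /// so that `write == read` unambiguously means "empty".
    pub fn free(&self) -> usize {
        (self.read + N - self.write - 1) % N
    }

    /// Target side: copy `msg` in, or drop it entirely if it does not fit.
    pub fn write(&mut self, msg: &[u8]) -> bool {
        if msg.len() > self.free() {
            return false;
        }
        for &b in msg {
            self.data[self.write] = b;
            self.write = (self.write + 1) % N;
        }
        true
    }

    /// Host side: drain everything written so far, wrapping around the end.
    pub fn read_all(&mut self) -> Vec<u8> {
        let mut out = Vec::new();
        while self.read != self.write {
            out.push(self.data[self.read]);
            self.read = (self.read + 1) % N;
        }
        out
    }
}
```

The write path is just a memory copy and two index updates, which is why logging this way costs the microcontroller so little compared with clocking bytes out of a UART.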
Printing to RTT
In a practical sense, printing to RTT is the same as most other debug methods. In Rust, the defmt crate is used to give users a standardized experience that mimics the standard log and tracing format.
Using RTT in our program is as simple as calling one of the log level macros, such as info!:
main.rs
...
info!("Hello from RP2040!");
...
Configuring RTT
The astute among you may have noticed that I mentioned a region of RAM in the explanation earlier. Any time we need to have a region of memory that both the microcontroller and the programming machine know about, some kind of configuration must be done. In this case, the defmt crate generates a linker script.
.cargo/config.toml
rustflags = [
...
"-C", "link-arg=-Tdefmt.x",
...
]
Another configuration that can be done is to determine which log level should be printed. This generally defaults to info for most applications.
.cargo/config.toml
...
[env]
DEFMT_LOG = "trace"
...
and
Embed.toml
[default.general]
chip = "RP2040"
log_level = "WARN"
# RP2040 does not support connect_under_reset
connect_under_reset = false
[default.rtt]
enabled = false
up_mode = "NoBlockSkip"
timeout = 3000
Note that whatever level you define in .cargo/config.toml is the lowest level that will be compiled into the program. The level defined in Embed.toml is the level that will be printed by the local RTT client. This gives the developer the ability to include verbose logging that may adversely affect runtime performance but is only enabled when it is needed. The log levels from most verbose to least are as follows:
TRACE
DEBUG
INFO
WARN
ERROR
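The two-stage filtering described above can be expressed as a small model. This is a sketch of the idea as the text describes it, not how defmt or cargo-embed implement it: `Level` and `is_printed` are hypothetical names, with `compiled_min` standing for the level in .cargo/config.toml and `host_min` for the level in Embed.toml.

```rust
/// Log levels ordered from most verbose to least, matching the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Trace,
    Debug,
    Info,
    Warn,
    Error,
}

/// A message reaches the console only if it passes both filters:
/// it must have been compiled into the firmware, and the host must print it.
fn is_printed(msg: Level, compiled_min: Level, host_min: Level) -> bool {
    msg >= compiled_min && msg >= host_min
}
```

With the example values shown earlier (trace compiled in, WARN on the host side), an info message exists in the firmware but is not shown until the host level is lowered.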
GDB - GNU Debugger
To get started with GDB you will need to download GDB for the architecture we are targeting. In the case of ARM Thumb v6 it can be done with:
sudo apt install gdb-multiarch
We also need a local proxy for GDB to bind to, which cargo-embed can do if configured properly. We also need to set the reset behavior to halt so the controller will not start executing code before the debugger is attached.
Embed.toml
...
[default.reset]
enabled = true
halt_afterwards = true
...
[default.gdb]
enabled = true
gdb_connection_string = "127.0.0.1:2345"
...
Now we can run this configuration with cargo embed. Note that cargo run is configured to simply flash the program and does not start the GDB stub. Once you have run cargo embed, you should see this line:
GDB stub listening at 127.0.0.1:2345
In another terminal, connect to that stub using gdb-multiarch, passing in our executable so that gdb has symbols from which to supply useful information.
gdb-multiarch target/thumbv6m-none-eabi/debug/B-minimal-blinky
Once gdb has started, you will be presented with a command line interface with a (gdb) prompt. To connect to the stub and list the contents of the registers, use these commands:
(gdb) target remote :2345
(gdb) info registers
Full use of gdb is out of the scope of this writing but the following are some useful commands for reference:
- target remote :2345 - binds to gdb stub on localhost port 2345
- info registers or i r - prints contents of the CPU registers to the console
- continue or c - continue program execution until a signal is sent, e.g. Ctrl+C, or a breakpoint is hit
- disassemble - show disassembly of the current block that the PC points to
- set print asm-demangle on - clean up assembly output
- break main.rs:41 - place a breakpoint on line 41 of main.rs
Levels of abstraction
Justifications For Abstraction
As you may have noticed, it is extremely cumbersome and error-prone to directly interface with the hardware registers. Programming in this way requires knowledge of the specific chipset you are working on and the board on which it is deployed. If the chipset were to change even slightly, even to a model that has the same processor cores, it is likely that the addresses associated with hardware registers would be different.
The way to get around these issues is through layers of abstraction. For the Raspberry Pi Pico, the layers are as follows:
- Peripheral Access Crate (PAC)
- RP Hardware Abstraction Layer (HAL)
- Based on this generalized interface: Embedded HAL
- Board Support Package (BSP)
The following flowchart displays how these crates interact, with the BSP being the highest level of abstraction and the chip itself represented by its individual block components.
flowchart LR
    BSP["BSP: rp-pico"]
    HAL["HAL: rp-hal"]
    cortex["Microarchitecture Crates: cortex-m cortex-m-rt"]
    PAC["PAC: rp-pac"]
    BSP --- HAL
    HAL --- cortex
    HAL --- PAC
    cortex --- processor
    PAC --- Peripherals
    subgraph "RP2040 Chip"
        subgraph processor["Dual ARM Cortex-M0+"]
            direction LR
            core0["Core 0"]
            core1["Core 1"]
        end
        subgraph "Peripherals"
            direction LR
            I2C
            UART
            ADC
            Flash[Flash Memory]
            etc.
        end
    end
Programming with a BSP
The first step of starting with this BSP is importing all of the necessary new dependencies into our project:
Cargo.toml
...
[dependencies]
# For ARM M-Series microcontrollers
cortex-m = "0.7"
cortex-m-rt = "0.7"
embedded-hal = { version = "1.0.0" }
# Debug probe printing
defmt = "0.3"
defmt-rtt = "0.4"
# Panic handler
panic-probe = { version = "0.3", features = ["print-defmt"] }
# Board support package (BSP)
rp-pico = "0.9"
...
With the addition of a BSP we are also adding support for RTT. With RTT and SWD we can switch from panic-halt, which gives the developer no information about why a panic happened, to panic-probe, which takes advantage of the debug probe to display the cause of panics and a stack trace.
main.rs
#![no_std]
#![no_main]

use bsp::entry;
use defmt::info;
use defmt_rtt as _;
use embedded_hal::digital::OutputPin;
use panic_probe as _;

use rp_pico as bsp;

use bsp::hal::{
    clocks::{init_clocks_and_plls, Clock},
    pac,
    sio::Sio,
    watchdog::Watchdog,
};

#[entry]
fn main() -> ! {
    ...
}
Note how we aliased rp_pico to bsp. This is to support portability between multiple boards. If you were switching to another board, e.g. the RP Pico W, you would only have to update that alias and your included dependencies.
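The aliasing trick is plain Rust and can be tried without any hardware. The sketch below uses two made-up modules, board_pico and board_other, in place of real BSP crates. The names, the LED_PIN constant and the pin number of the second board are assumptions for illustration only.

```rust
// Two hypothetical "board" modules that expose the same names.
mod board_pico {
    pub const NAME: &str = "pico";
    pub const LED_PIN: u8 = 25;
}

#[allow(dead_code)]
mod board_other {
    pub const NAME: &str = "other";
    pub const LED_PIN: u8 = 13;
}

// The only line that changes when moving to a different board.
use board_pico as bsp;

/// Application code only ever refers to `bsp`, never to a concrete board.
fn describe_led() -> String {
    format!("{} LED on GPIO{}", bsp::NAME, bsp::LED_PIN)
}

fn main() {
    println!("{}", describe_led());
}
```

Swapping the alias to board_other changes what describe_led reports without touching the function itself.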
The next step is to instantiate the different data structures we are going to use. Due to Rust's strict ownership requirements, care must be taken over which part of the program owns each hardware resource at any given time.
main.rs
...
#[entry]
fn main() -> ! {
    let mut pac = pac::Peripherals::take().unwrap();
    let core = pac::CorePeripherals::take().unwrap();
    let mut watchdog = Watchdog::new(pac.WATCHDOG);
    let sio = Sio::new(pac.SIO);
    ...
}
Notice how the PAC first takes ownership of all of the hardware peripherals. This crate contains all of the information such as memory addresses, but none of the information about how to utilize these resources. The PAC is, in simple terms, a crate that closely models the data sheet's representation of the hardware. This includes splitting most of the registers into named bitfields and giving convenient ways to access them. However, the PAC does not even attempt to stop you from doing something like enabling incompatible features. That is done by the HAL.
Next we will use the data structures that are now owned by the PAC to set up the clocks for use in a delay later.
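The reason take() returns an Option, and why we unwrap it, is that the peripherals may only be handed out once. Below is a minimal host-side sketch of that idea. It is not how the real PAC is written: the Peripherals, Watchdog and Sio types here are empty stand-ins, and a std atomic flag is used as the guard purely as an assumption to keep the example runnable on a desktop.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Records whether the peripherals have already been handed out.
static TAKEN: AtomicBool = AtomicBool::new(false);

// Empty stand-ins for real peripheral types.
pub struct Watchdog;
pub struct Sio;

pub struct Peripherals {
    pub watchdog: Watchdog,
    pub sio: Sio,
}

impl Peripherals {
    /// Returns the peripherals the first time it is called, `None` afterwards.
    pub fn take() -> Option<Self> {
        // `swap` stores `true` and returns the previous value.
        if TAKEN.swap(true, Ordering::SeqCst) {
            None
        } else {
            Some(Peripherals {
                watchdog: Watchdog,
                sio: Sio,
            })
        }
    }
}

/// Taking the watchdog by value means the caller gives it up.
fn consume_watchdog(_watchdog: Watchdog) {}

fn main() {
    let p = Peripherals::take().expect("first take succeeds");

    // Moving a field out hands ownership of that one peripheral away.
    consume_watchdog(p.watchdog);
    // consume_watchdog(p.watchdog); // would not compile: use of moved value

    // Other fields are still available to move independently.
    let _sio = p.sio;

    // A second attempt to take the peripherals yields nothing.
    assert!(Peripherals::take().is_none());
}
```

The commented-out line is the interesting part: once a peripheral has been moved into something else, the compiler refuses to let a second piece of code claim it.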
main.rs
#[entry]
fn main() -> ! {
    ...
    // External high-speed crystal on the Pico board is 12 MHz
    let external_xtal_freq_hz = 12_000_000u32;
    let clocks = init_clocks_and_plls(
        external_xtal_freq_hz,
        pac.XOSC,
        pac.CLOCKS,
        pac.PLL_SYS,
        pac.PLL_USB,
        &mut pac.RESETS,
        &mut watchdog,
    )
    .ok()
    .unwrap();

    let mut delay = cortex_m::delay::Delay::new(core.SYST, clocks.system_clock.freq().to_Hz());
    ...
}
This setup ensures that the clocks are properly configured and exposes a data structure, delay, that wraps that functionality in something that is simple to use.
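To get a feel for why delay needs to know the clock frequency, here is a rough host-side sketch of the arithmetic involved. It assumes a 125 MHz system clock (an assumption for illustration; on the board the real value comes from clocks.system_clock.freq()) and that the SysTick timer counts at that rate. SysTick's counter is 24 bits wide, so long waits have to be split up. The function names are made up for this example and are not part of cortex-m.

```rust
/// Largest value that fits in SysTick's 24-bit reload register.
const SYSTICK_MAX_RELOAD: u64 = 0x00FF_FFFF;

/// Number of timer ticks that make up `ms` milliseconds at `clock_hz`.
fn ticks_for_ms(clock_hz: u32, ms: u32) -> u64 {
    (clock_hz as u64 / 1_000) * ms as u64
}

/// How many separate countdowns are needed to cover `ticks`
/// (a ceiling division by the largest single countdown).
fn reloads_needed(ticks: u64) -> u64 {
    (ticks + SYSTICK_MAX_RELOAD - 1) / SYSTICK_MAX_RELOAD
}

fn main() {
    // Assumed system clock for illustration.
    let clock_hz = 125_000_000;
    let ticks = ticks_for_ms(clock_hz, 500);
    println!("ticks for 500 ms: {}", ticks);
    println!("countdowns needed: {}", reloads_needed(ticks));
}
```

With those assumed numbers a 500 ms delay is 62,500,000 ticks, which does not fit in a single 24-bit countdown, so the wait has to be split into 4 shorter ones. This is exactly the kind of bookkeeping the delay struct keeps out of our way.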
Next we do the same thing for the LED output pin. Remember the work that needed to be done to set the pin as an output before it could even be driven high or low. Notice how that setup is now handled via Rust's type system.
#[entry]
fn main() -> ! {
    ...
    let pins = bsp::Pins::new(
        pac.IO_BANK0,
        pac.PADS_BANK0,
        sio.gpio_bank0,
        &mut pac.RESETS,
    );

    let mut led_pin = pins.led.into_push_pull_output();
    ...
}
With this, the led_pin is configured in hardware to be ready for a push/pull output. If this operation were not done, the methods to operate on that pin would not exist, as it would be the incorrect type. By leveraging Rust's type system, improperly configuring a pin becomes significantly more difficult than miscalculating a bitmask.
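This technique is often called the typestate pattern. The sketch below is a minimal, host-side illustration of the idea and not the HAL's real implementation: the Pin, Disabled and PushPullOutput types are simplified stand-ins, and a bool takes the place of the hardware register.

```rust
use std::marker::PhantomData;

// Marker types that exist only at compile time to record the pin's mode.
pub struct Disabled;
pub struct PushPullOutput;

pub struct Pin<Mode> {
    id: u8,
    high: bool,
    _mode: PhantomData<Mode>,
}

impl Pin<Disabled> {
    pub fn new(id: u8) -> Self {
        Pin { id, high: false, _mode: PhantomData }
    }

    /// Consumes the unconfigured pin and returns it as an output.
    /// A real HAL would write the configuration registers here.
    pub fn into_push_pull_output(self) -> Pin<PushPullOutput> {
        Pin { id: self.id, high: false, _mode: PhantomData }
    }
}

// These methods only exist for pins that have been configured as outputs.
impl Pin<PushPullOutput> {
    pub fn set_high(&mut self) {
        self.high = true;
    }

    pub fn set_low(&mut self) {
        self.high = false;
    }

    pub fn is_set_high(&self) -> bool {
        self.high
    }
}

fn main() {
    let pin = Pin::<Disabled>::new(25);
    // pin.set_high(); // would not compile: no `set_high` on Pin<Disabled>

    let mut led = pin.into_push_pull_output();
    led.set_high();
    println!("GPIO{} high: {}", led.id, led.is_set_high());
    led.set_low();
    println!("GPIO{} high: {}", led.id, led.is_set_high());
}
```

Because into_push_pull_output takes self by value, the unconfigured pin no longer exists after the conversion, so there is no way to keep using it in the wrong mode by accident.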
Finally, we will use the newly constructed led_pin struct to control the on-board LED.
#[entry]
fn main() -> ! {
    ...
    loop {
        info!("on!");
        led_pin.set_high().unwrap();
        delay.delay_ms(500);
        info!("off!");
        led_pin.set_low().unwrap();
        delay.delay_ms(500);
    }
    ...
}
With this we can finally control the on-board LED without having to dive into pages of datasheets or hand calculate offsets and bitmasks. We can also rest easy that the configuration of our processor and output pins is correct, as proven by the type system.
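One payoff of programming against a generalized interface is that the blinking logic can be written once and exercised without a board attached. The sketch below shows the idea on a desktop. The OutputPin and DelayMs traits here are simplified local stand-ins that I wrote for this example, not the real embedded-hal traits, and MockPin, MockDelay and blink are hypothetical names.

```rust
use std::convert::Infallible;

/// Simplified local stand-in for a "pin that can be driven" interface.
pub trait OutputPin {
    type Error;
    fn set_high(&mut self) -> Result<(), Self::Error>;
    fn set_low(&mut self) -> Result<(), Self::Error>;
}

/// Simplified local stand-in for a blocking delay interface.
pub trait DelayMs {
    fn delay_ms(&mut self, ms: u32);
}

/// Blink logic that works with any pin and any delay implementing the traits.
pub fn blink<P: OutputPin, D: DelayMs>(
    pin: &mut P,
    delay: &mut D,
    cycles: u32,
) -> Result<(), P::Error> {
    for _ in 0..cycles {
        pin.set_high()?;
        delay.delay_ms(500);
        pin.set_low()?;
        delay.delay_ms(500);
    }
    Ok(())
}

/// Test double that records every level the pin was driven to.
#[derive(Default)]
pub struct MockPin {
    pub history: Vec<bool>,
}

impl OutputPin for MockPin {
    type Error = Infallible;

    fn set_high(&mut self) -> Result<(), Self::Error> {
        self.history.push(true);
        Ok(())
    }

    fn set_low(&mut self) -> Result<(), Self::Error> {
        self.history.push(false);
        Ok(())
    }
}

/// Test double that adds up the requested time instead of waiting.
#[derive(Default)]
pub struct MockDelay {
    pub total_ms: u32,
}

impl DelayMs for MockDelay {
    fn delay_ms(&mut self, ms: u32) {
        self.total_ms += ms;
    }
}

fn main() {
    let mut pin = MockPin::default();
    let mut delay = MockDelay::default();
    blink(&mut pin, &mut delay, 2).unwrap();
    println!("levels: {:?}, waited {} ms", pin.history, delay.total_ms);
}
```

On real hardware the same blink function could be handed a HAL pin and delay, provided they implement the traits it asks for.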
Conclusions
I hope you can see the pattern emerging when comparing the first "bare metal" blinky program to this one written with the BSP. It is important to remember that the hardware changes that were done by hand still need to be done whether we see them or not. In this case we are configuring those same hardware elements for an output pin when we convert between types. Depending on your application, this level of abstraction may be intrusive and too vague for what you need. Perhaps you need to know with 100% certainty the specific number of clock cycles between when a pin is configured as an input and when it is read. If that is a requirement of your work, then you have no choice but to program at a level that gives you access to that data. However, for many projects, doing our best to ignore the fact that we are running code on a real physical device is beneficial. It reduces the cognitive load on the developer and allows for more portable applications that can be easily moved to new microcontrollers.
Conclusions
I hope that this brief walkthrough has been insightful. As I was going through this process, I had multiple more experienced embedded developers ask "Will you be covering X topic?" and oftentimes my answer was no. There is much more content to cover if you want to do real work with microcontrollers. Here is a brief list of things I would like to cover if there is sufficient interest in the future:
- Using the second core of the RP2040
- Using another peripheral such as the ADC without a BSP
- Exploring abstraction with just a HAL instead of skipping directly to the BSP
- Interfacing with another device such as an I2C temperature sensor
My goal for this project was to briefly bridge the gap between application programming and embedded systems for those who already have a solid foundation in programming. If there are any concepts you feel were not explained or that left you confused, please reach out to me directly so I can provide more context.
Thanks for reading!