Debugging a Crash
Your firmware crashed. The LED stopped blinking, the UART went silent, and the target is frozen. This tutorial walks through diagnosing an ARM Cortex-M HardFault using mcjtag’s register and memory tools.
By the end, you will know how to read fault registers, decode the exception stack frame, and identify the instruction that caused the crash.
Step 1: Halt the target
target_control("halt")
If the CPU is already halted from the fault, this is a no-op. Check with:
target_state()
The response shows state="halted" and the current pc. If the target is sitting in a fault handler, the pc will point somewhere inside that handler’s code, not at the faulting instruction itself. We need the stack frame for that.
Step 2: Read the fault registers
read_registers()
Three registers matter most right now:
pc — Where execution stopped. If you are inside a HardFault handler, this is the handler’s code, not the original fault location.
lr — The link register. During an exception, ARM loads a special EXC_RETURN pattern into LR. Common values:
| LR value | Meaning |
|---|---|
| 0xFFFFFFF1 | Return to handler mode, use MSP |
| 0xFFFFFFF9 | Return to thread mode, use MSP |
| 0xFFFFFFFD | Return to thread mode, use PSP |
Bit 2 of EXC_RETURN tells you which stack pointer was in use when the fault occurred (0 for MSP, 1 for PSP), which you need for the next step.
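As an illustration of the table above, here is a minimal Python sketch that decodes an EXC_RETURN value. The function name `decode_exc_return` is hypothetical and not part of mcjtag. The `fp_frame` field assumes a core with an FPU, where a clear bit 4 signals an extended stack frame.

```python
def decode_exc_return(lr):
    """Decode the EXC_RETURN pattern found in LR inside an exception handler."""
    # EXC_RETURN values always have the top byte set to 0xFF.
    if (lr & 0xFF000000) != 0xFF000000:
        raise ValueError("0x%08X is not an EXC_RETURN value" % lr)
    return {
        # Bit 3: 1 = return to thread mode, 0 = return to handler mode.
        "return_mode": "thread" if lr & 0x8 else "handler",
        # Bit 2: 1 = frame was pushed on PSP, 0 = frame was pushed on MSP.
        "stack": "PSP" if lr & 0x4 else "MSP",
        # Bit 4 (FPU cores only, assumption): 0 = extended frame with FP registers.
        "fp_frame": not (lr & 0x10),
    }
```

Passing the lr value returned by read_registers() should tell you whether to read msp or psp in Step 3.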
xPSR — The program status register. Bits 0-8 contain the ISR number:
| ISR Number | Exception |
|---|---|
| 0 | Thread mode (no exception) |
| 2 | NMI |
| 3 | HardFault |
| 4 | MemManage |
| 5 | BusFault |
| 6 | UsageFault |
ISR=3 confirms you are in a HardFault handler.
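The ISR lookup can be sketched in a few lines of Python. The helper name `decode_xpsr` is hypothetical. It relies on two architectural facts: the exception number occupies bits 0-8, and external interrupts start at exception number 16.

```python
EXCEPTION_NAMES = {
    0: "Thread mode",
    2: "NMI",
    3: "HardFault",
    4: "MemManage",
    5: "BusFault",
    6: "UsageFault",
}

def decode_xpsr(xpsr):
    """Extract the active exception number and Thumb bit from xPSR."""
    isr = xpsr & 0x1FF  # bits 0-8 hold the exception number
    if isr in EXCEPTION_NAMES:
        name = EXCEPTION_NAMES[isr]
    elif isr >= 16:
        name = "IRQ%d" % (isr - 16)  # external interrupts start at 16
    else:
        name = "system exception %d" % isr
    return {
        "isr_number": isr,
        "exception": name,
        "thumb": bool(xpsr & (1 << 24)),  # T bit; should be set on Cortex-M
    }
```

An "exception" value of "HardFault" matches the ISR=3 case described above.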
Step 3: Decode the exception frame
When a Cortex-M takes an exception, the hardware pushes 8 registers onto the active stack before entering the handler. This is called the exception stack frame.
First, determine which stack pointer was active. If LR contains 0xFFFFFFFD, the PSP was in use, so read the psp register. Otherwise, use msp:
read_registers(names=["msp", "psp"])
Then read 8 words from that stack pointer value:
read_memory("<sp_value>", count=8)
The stack frame layout (lowest address first):
| Offset | Register | What it tells you |
|---|---|---|
| +0x00 | r0 | First argument to the faulting function |
| +0x04 | r1 | Second argument |
| +0x08 | r2 | Third argument |
| +0x0C | r3 | Fourth argument |
| +0x10 | r12 | Scratch register |
| +0x14 | lr | Return address before the exception (caller of the faulting function) |
| +0x18 | pc | Address of the faulting instruction |
| +0x1C | xPSR | Status flags at the time of the fault |
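The pc at offset +0x18 is the address you want to look up in your .map file or disassembly. For a precise fault it is the instruction that caused the fault; for an imprecise bus fault (BFSR.IMPRECISERR) it can be a few instructions past the offending access.
The lr at offset +0x14 is the return address held by the interrupted code, which usually tells you who called the function that faulted.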
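To make the layout concrete, the following Python sketch labels the 8 words of a basic (non-FPU) frame. The function name `decode_exception_frame` is hypothetical. It also assumes the words have already been pulled out of the read_memory response as a list of integers, lowest address first, since the response format is not specified here.

```python
FRAME_REGISTERS = ("r0", "r1", "r2", "r3", "r12", "lr", "pc", "xpsr")

def decode_exception_frame(words):
    """Map the 8 stacked words (lowest address first) to register names."""
    if len(words) != len(FRAME_REGISTERS):
        raise ValueError("expected 8 words, got %d" % len(words))
    return dict(zip(FRAME_REGISTERS, words))

def format_frame(frame):
    """Render the frame as one 'offset name = value' line per register."""
    lines = []
    for index, name in enumerate(FRAME_REGISTERS):
        lines.append("+0x%02X %-4s = 0x%08X" % (index * 4, name, frame[name]))
    return "\n".join(lines)
```

The "pc" and "lr" entries of the returned dictionary are the two addresses to look up in the .map file.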
Step 4: Check the Fault Status Registers
The System Control Block (SCB) contains registers that explain why the fault occurred. If you have an SVD file loaded:
svd_inspect(peripheral="SCB")
Or read them directly by address:
read_memory("0xE000ED28", count=5)
The five words decode as:
| Address | Register | Purpose |
|---|---|---|
| 0xE000ED28 | CFSR | Configurable Fault Status Register (UsageFault + BusFault + MemManage) |
| 0xE000ED2C | HFSR | HardFault Status Register |
| 0xE000ED30 | DFSR | Debug Fault Status Register (records debug events, not needed for fault analysis) |
| 0xE000ED34 | MMFAR | MemManage Fault Address (valid only if CFSR.MMARVALID is set) |
| 0xE000ED38 | BFAR | Bus Fault Address (valid only if CFSR.BFARVALID is set) |
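A block read starting at 0xE000ED28 returns consecutive words, so it also passes over the Debug Fault Status Register (DFSR) at 0xE000ED30, and BFAR is the fifth word. Labeling each word by its address avoids off-by-one mistakes. This Python sketch uses a hypothetical helper, `label_scb_words`, and assumes the words are already available as a list of integers.

```python
SCB_FAULT_REGISTERS = {
    0xE000ED28: "CFSR",
    0xE000ED2C: "HFSR",
    0xE000ED30: "DFSR",
    0xE000ED34: "MMFAR",
    0xE000ED38: "BFAR",
}

def label_scb_words(base, words):
    """Pair each 32-bit word with the SCB register at its address."""
    labeled = {}
    for index, value in enumerate(words):
        address = base + 4 * index
        # Fall back to the raw address for anything outside the known set.
        name = SCB_FAULT_REGISTERS.get(address, "0x%08X" % address)
        labeled[name] = value
    return labeled
```

With a base of 0xE000ED28 and five words, the result has one entry per register in the table.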
CFSR is the most informative. It is actually three sub-registers packed into 32 bits:
| Bits | Sub-register | Fault type |
|---|---|---|
| 0-7 | MMFSR | Memory management faults (MPU violations, stack overflow) |
| 8-15 | BFSR | Bus faults (invalid address on the bus) |
| 16-31 | UFSR | Usage faults (undefined instruction, unaligned access, divide by zero) |
HFSR tells you if the HardFault was forced (bit 30) — meaning a configurable fault escalated to HardFault because its handler was disabled or a secondary fault occurred during exception handling.
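The bit ranges above translate directly into shifts and masks. A minimal Python sketch, with hypothetical helper names:

```python
def split_cfsr(cfsr):
    """Split the 32-bit CFSR into its three sub-registers."""
    return {
        "MMFSR": cfsr & 0xFF,           # bits 0-7
        "BFSR": (cfsr >> 8) & 0xFF,     # bits 8-15
        "UFSR": (cfsr >> 16) & 0xFFFF,  # bits 16-31
    }

def is_forced_hardfault(hfsr):
    """True if HFSR bit 30 (FORCED) is set, meaning an escalated fault."""
    return bool(hfsr & (1 << 30))
```

A non-zero sub-register tells you which fault class to look at first, and a forced HardFault means the real cause is recorded in CFSR.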
Step 5: Common fault causes
Once you have the CFSR bits and the faulting pc, the cause usually falls into one of these patterns:
Bus fault at an invalid address (BFSR.PRECISERR + BFAR)
The CPU tried to access a memory address that does not exist or is not mapped. Common causes:
- Null pointer dereference (BFAR near 0x00000000)
- Dereferencing a freed or corrupted pointer
- Accessing a peripheral whose clock is not enabled
Usage fault with UNDEFINSTR (UFSR bit 0)
The CPU fetched something that is not a valid instruction. Common causes:
- Corrupted function pointer (jumped to data instead of code)
- Stack overflow overwrote the return address
- Missing Thumb bit in a branch target (address should be odd for Thumb code); the core reports this case as INVSTATE (UFSR bit 1) rather than UNDEFINSTR
MemManage fault (MMFSR)
An MPU (Memory Protection Unit) violation occurred. Common causes:
- Writing to a read-only region
- Executing code from a non-executable region
- Stack overflow past the MPU guard region
Forced HardFault (HFSR bit 30)
The original fault was a BusFault, UsageFault, or MemManage, but the corresponding handler was not enabled (the SCB.SHCSR enable bits were clear), so it escalated to HardFault. Check the CFSR to see which underlying fault triggered it.
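To match a raw CFSR value against the patterns above, it helps to turn set bits into flag names. The sketch below uses the ARMv7-M bit positions (Cortex-M3/M4/M7); other cores may define additional bits. The helper names are hypothetical.

```python
CFSR_FLAGS = {
    # MMFSR (bits 0-7)
    0: "IACCVIOL", 1: "DACCVIOL", 3: "MUNSTKERR", 4: "MSTKERR",
    5: "MLSPERR", 7: "MMARVALID",
    # BFSR (bits 8-15)
    8: "IBUSERR", 9: "PRECISERR", 10: "IMPRECISERR", 11: "UNSTKERR",
    12: "STKERR", 13: "LSPERR", 15: "BFARVALID",
    # UFSR (bits 16-31)
    16: "UNDEFINSTR", 17: "INVSTATE", 18: "INVPC", 19: "NOCP",
    24: "UNALIGNED", 25: "DIVBYZERO",
}

def cfsr_flags(cfsr):
    """Return the names of all set CFSR bits, lowest bit first."""
    return [name for bit, name in sorted(CFSR_FLAGS.items()) if cfsr & (1 << bit)]

def fault_address(cfsr, mmfar, bfar):
    """Return (register, address) if a fault address is valid, else None."""
    if cfsr & (1 << 15):  # BFARVALID
        return ("BFAR", bfar)
    if cfsr & (1 << 7):   # MMARVALID
        return ("MMFAR", mmfar)
    return None
```

For example, a CFSR of 0x00008200 decodes to PRECISERR plus BFARVALID, which is the bus fault pattern described first in this step.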
Step 6: Using the debug_crash prompt
mcjtag includes a debug_crash prompt that automates this entire workflow. When you invoke it, the LLM client will:
- Halt the target
- Read pc, lr, sp, and xPSR
- Determine the active stack pointer
- Read the exception frame
- Check the fault status registers
- Report the crash location, call chain, and likely cause
This is a good starting point. For complex faults (double faults, stack overflows that corrupted the frame, or faults during interrupt processing), you may need to walk through the steps manually and examine additional context.
Next steps
- Hardware Setup: make sure your wiring and OpenOCD config are correct before debugging
- SVD Register Decoding: decode SCB and other system peripherals with full bitfield names and descriptions
- Safety Configuration: understand the memory write protections that prevent accidental flash corruption during debugging