Skip to content
This repository has been archived by the owner on Jan 30, 2024. It is now read-only.

Commit

Permalink
Merge #302
Browse files Browse the repository at this point in the history
302: Make stack painting fast again! 🇪🇺  r=jonathanpallant a=Urhengulas

This PR implements the first one of improvements outlined in #258.

Fixes #258.

## But what is "stack painting" anyways?

The idea is to write a specific byte pattern to (part of) the stack before the program is getting executed. After the program finished, either because it is done with its task, or because there was an error, we read out the previously painted area and check how much of it is still intact. If the pattern is still the same, we can be rather certain that the program didn't write to this part of the stack. This information helps to either know if there was a stack overflow, or just to measure how much of the stack was used.

So far both reading and writing of the memory was done via the probe. While this works it is also rather slow, because the host and probe communicate via USB which takes time.

The new approach is writing a subroutine to the MCU, which will paint the memory from within.

## Mesurements

In following table you can see the measurement how much time the old and new approach take for memory from 8 to 256KiB.

![data](https://user-images.githubusercontent.com/37087391/154973187-c17e66f7-cb22-4e56-8dff-a9798ab3a39a.png)

The results are pretty impressive. The new approach is about 170 times faster!

## Further work

- A similar approach can also be applied to reading out the stack after the program finished.
- Additionally the stack canary can be simplified quite a lot. So far we are not painting the whole stack, except the user asks for it, because this _was_ slow. Because it is fast now we can always paint all of it, which simplifies the code and removes the need for the `--measure-stack` flag.

Co-authored-by: Johann Hemmann <[email protected]>
  • Loading branch information
bors[bot] and Urhengulas authored Feb 25, 2022
2 parents 2dde311 + 67c1576 commit 12228a0
Show file tree
Hide file tree
Showing 2 changed files with 94 additions and 13 deletions.
103 changes: 92 additions & 11 deletions src/canary.rs
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
use std::time::Instant;

use crate::TIMEOUT;
use crate::{Elf, TargetInfo};
use probe_rs::{MemoryInterface, Session};
use probe_rs::{Core, MemoryInterface, Session};

use crate::{registers::PC, Elf, TargetInfo, TIMEOUT};

const CANARY_VALUE: u8 = 0xAA;

Expand Down Expand Up @@ -45,6 +45,7 @@ pub(crate) struct Canary {
}

impl Canary {
/// Decide if and where to place the stack canary.
pub(crate) fn install(
sess: &mut Session,
target_info: &TargetInfo,
Expand All @@ -54,10 +55,8 @@ impl Canary {
let mut core = sess.core(0)?;
core.reset_and_halt(TIMEOUT)?;

// Decide if and where to place the stack canary.

let stack_info = match &target_info.stack_info {
Some(range) => range,
Some(stack_info) => stack_info,
None => {
log::debug!("couldn't find valid stack range, not placing stack canary");
return Ok(None);
Expand All @@ -69,7 +68,8 @@ impl Canary {
return Ok(None);
}

let stack_available = stack_info.range.end() - stack_info.range.start() - 1;
let stack_start = *stack_info.range.start();
let stack_available = *stack_info.range.end() - stack_start;

let size = if measure_stack {
// When measuring stack consumption, we have to color the whole stack.
Expand Down Expand Up @@ -97,10 +97,8 @@ impl Canary {
size_kb
);
}
let address = *stack_info.range.start();
let canary = vec![CANARY_VALUE; size];
let start = Instant::now();
core.write_8(address, &canary)?;
paint_stack(&mut core, stack_start, stack_start + size as u32)?;
let seconds = start.elapsed().as_secs_f64();
log::trace!(
"setting up canary took {:.3}s ({:.2} KiB/s)",
Expand All @@ -109,7 +107,7 @@ impl Canary {
);

Ok(Some(Canary {
address,
address: stack_start,
size,
stack_available,
data_below_stack: stack_info.data_below_stack,
Expand Down Expand Up @@ -186,3 +184,86 @@ impl Canary {
}
}
}

/// Write [`CANARY_VALUE`] to the stack.
///
/// Both `start` and `end` need to be 4-byte-aligned.
fn paint_stack(core: &mut Core, start: u32, end: u32) -> Result<(), probe_rs::Error> {
assert!(start < end, "start needs to be smaller than end address");
assert_eq!(start % 4, 0, "`start` needs to be 4-byte-aligned");
assert_eq!(end % 4, 0, "`end` needs to be 4-byte-aligned");

// does the subroutine fit inside the stack?
let stack_size = (end - start) as usize;
assert!(
SUBROUTINE_LENGTH < stack_size,
"subroutine doesn't fit inside stack"
);

// write subroutine to RAM
// NOTE: add `SUBROUTINE_LENGTH` to `start`, to avoid the subroutine overwriting itself
core.write_8(start, &subroutine(start + SUBROUTINE_LENGTH as u32, end))?;

// store current PC and set PC to beginning of subroutine
let previous_pc = core.read_core_reg(PC)?;
core.write_core_reg(PC, start)?;

// execute the subroutine and wait for it to finish
core.run()?;
core.wait_for_core_halted(TIMEOUT)?;

// overwrite subroutine
core.write_8(start, &[CANARY_VALUE; SUBROUTINE_LENGTH])?;

// reset PC to where it was before
core.write_core_reg(PC, previous_pc)?;

Ok(())
}

/// The length of the subroutine.
const SUBROUTINE_LENGTH: usize = 28;

/// Create a subroutine to paint [`CANARY_VALUE`] from `start` till `end`.
//
// Roughly corresponds to following assembly:
//
// 00000108 <start>:
// 108: 4803 ldr r0, [pc, #12] ; (118 <end+0x2>)
// 10a: 4904 ldr r1, [pc, #16] ; (11c <end+0x6>)
// 10c: 4a04 ldr r2, [pc, #16] ; (120 <end+0xa>)
//
// 0000010e <loop>:
// 10e: 4281 cmp r1, r0
// 110: d001 beq.n 116 <end>
// 112: c004 stmia r0!, {r2}
// 114: e7fb b.n 10e <loop>
//
// 00000116 <end>:
// 116: be00 bkpt 0x0000
// 118: 20000100 .word 0x20000100 ; start
// 11c: 20000200 .word 0x20000200 ; end
// 120: aaaaaaaa .word 0xaaaaaaaa ; pattern
fn subroutine(start: u32, end: u32) -> [u8; SUBROUTINE_LENGTH] {
// convert start and end address to bytes
let [s1, s2, s3, s4] = start.to_le_bytes();
let [e1, e2, e3, e4] = end.to_le_bytes();

const CV: u8 = CANARY_VALUE;
[
0x03, 0x48, // ldr r0, [pc, #12]
0x04, 0x49, // ldr r1, [pc, #16]
0x04, 0x4a, // ldr r2, [pc, #16]
// <loop>
0x81, 0x42, // cmp r1, r0
0x01, 0xD0, // beq.n 116 <end>
0x04, 0xC0, // stmia r0!, {r2}
0xFB, 0xE7, // b.n 10e <loop>
// <end>
0x00, 0xBE, // bkpt 0x0000
//
s1, s2, s3, s4, // .word ; start address
e1, e2, e3, e4, // .word ; end address
CV, CV, CV, CV, // .word ; canary value
]
}
4 changes: 2 additions & 2 deletions src/target_info.rs
Original file line number Diff line number Diff line change
Expand Up @@ -73,8 +73,8 @@ fn extract_stack_info(elf: &Elf, ram_range: &Range<u32>) -> Option<StackInfo> {

let initial_stack_pointer = elf.vector_table.initial_stack_pointer;

// SP points one past the end of the stack.
let mut stack_range = ram_range.start..=initial_stack_pointer - 1;
// SP points one word (4-byte) past the end of the stack.
let mut stack_range = ram_range.start..=initial_stack_pointer - 4;

for section in elf.sections() {
let size: u32 = section.size().try_into().expect("expected 32-bit ELF");
Expand Down

0 comments on commit 12228a0

Please sign in to comment.