Skip to content
This repository has been archived by the owner on Jan 24, 2022. It is now read-only.

[WIP] place static vars at the end of RAM #41

Closed
wants to merge 1 commit into from
Closed

[WIP] place static vars at the end of RAM #41

wants to merge 1 commit into from

Conversation

japaric
Copy link
Member

@japaric japaric commented Oct 20, 2017

so that the stack can never collide into them

closes #34


So, this works but there's a downside: the debug information about static
variables is wrong; it says that they are located at the start of the RAM region
when they are actually located at the end of RAM.

I tried @whitequark's suggested approach about placing the static section twice
but ran into two problems:

  • ld doesn't like placing the same input section into two different output
    sections. Basically if you have something like this
._bss : {
  *(.bss);
}

.bss : {
  *(.bss);
}

._bss will have the input sections but .bss will be empty. I think this
makes sense because if duplication was allowed you will end with duplicated
symbols (i.e. the names would crash: there would be a FOO in ._bss and
another FOO in ._bss)

  • you can't discard an output section. AFAICT, /DISCARD/ only works on input
    sections.

Assuming that we are OK with losing the debug information about static variables
and landing this we still have to decide how to deal with a heap region. I think
we would have to ask the user for a _heap_size to place a fixed size heap
region at the end of RAM, before the static variables. So starting from the end
of RAM the regions would be: heap, static, stack.

cc @pftbest could you or would you be willing to adopt these changes for
msp430-rt?

so that the stack can never collide into them

closes #34
@pftbest
Copy link
Contributor

pftbest commented Oct 20, 2017

Hello, @japaric . I don't think I will port this on MSP430, because chips that I work with have peripheral registers right below the RAM region. So in case of stack overflow program will start writing garbage into peripherals and I think this is even more terrible than overwriting static variables.

@japaric
Copy link
Member Author

japaric commented Oct 20, 2017

So in case of stack overflow program will start writing garbage into peripherals and I think this is even more terrible than overwriting static variables.

That sound horrible.

the debug information about static variables is wrong

Ugh, I was so focused on getting the start of the stack and the RAM init correct that I failed to notice that not only debug info is wrong but also the operations on the static variables are wrong as well ... the operations are done assuming the static vars are allocated at the start of the RAM region.

So, this actually doesn't work. I'll have to think of some other way.

@japaric
Copy link
Member Author

japaric commented Nov 8, 2017

For anyone watching, the next thing that I'm going to try here is the double linking approach. I intend to create a linker wrapper -- let's call it swap-ld -- that will invoke arm-none-eabi-ld twice. The first time it will invoke it normally. Then swap-ld will look at the produced object file to extract the size of the .bss and .data sections. On the second arm-none-eabi-ld invocation swap-ld will pass some extra arguments to arm-none-eabi-ld to define a few symbols (_sbss, _sdata and _sstack probably). The cortex-m-rt linker script will be tweaked to use these symbols, when defined, to swap the stack and the .bss+.data section. The linker script will be modified in such a way that when using vanilla arm-none-eabi-ld you get the same memory layout as today but when you use swap-ld (e.g. you set it in .cargo/config) as a linker you get the swapped version that provides zero cost stack overflow protection.

I think that approach should work.

@whitequark
Copy link
Contributor

This seems excessively complex and very unfriendly to anyone trying to debug the build.

@japaric
Copy link
Member Author

japaric commented Nov 8, 2017

@whitequark do you have a less complex suggestion?

Note that (a) this is going to be opt-in (you have to specify the swap-ld linker to opt in); normal builds will continue to operate the same (single link step and same memory layout as today) and (b) we would control swap-ld so we can add all the debugging information we want -- we can even have it enabled by default if that's less surprising (having two link steps is unusual so it might be better to be explicit about it).

I personally think that the extra complexity is worth closing this memory safety hole but I would gladly implement this in a simpler way if I knew how. Regardless of how this is ultimately implemented, I think it's useful to experiment with the alternative memory layout early on to see if there are any other problems with it and to start thinking about how to deal with the heap.

@whitequark
Copy link
Contributor

As an optional solution it sounds good, at least in the interim while there is no better option.

japaric added a commit that referenced this pull request Nov 9, 2017
This is one possible solution to the stack overflow problem described in #34. This approach uses a
linker wrapper, called [swap-ld], to generate the desired memory layout. See #34 for a description
of the desired memory layout and #41 for a description of how `swap-ld` works.

The observable effects of this change in cortex-m programs are:

- the `_sbss` symbol is now override-able.
- there is now a `.stack` linker section that denotes the span of the call stack. `.stack` won't be
  loaded into the program; it just exists for informative purposes (`swap-ld` uses this
  information).

Given the following program:

``` rust
fn main() {
    static mut X: u32 = 0;
    static mut Y: u32 = 1;

    loop {
        unsafe {
            ptr::write_volatile(&mut X, X + 1);
            ptr::write_volatile(&mut Y, Y + 1);
        }
    }
}
```

If you link this program using the `arm-none-eabi-ld` linker, which is the cortex-m-quickstart
default, you'll get the following memory layout:

``` console
$ console
section                                                                   size         addr
.vector_table                                                            0x130    0x8000000
.text                                                                     0x94    0x8000130
.rodata                                                                    0x0    0x80001c4
.stack                                                                  0x5000   0x20000000
.bss                                                                       0x4   0x20000000
.data                                                                      0x4   0x20000004
```

Note how the space reserved for the stack (depicted by the `.stack` linker section) overlaps with
the space where .bss and .data reside.

If you, instead, link this program using `swap-ld` you'll get the following memory layout:

``` console
$ arm-none-eabi-size -Ax app
section                                                                   size         addr
.vector_table                                                            0x130    0x8000000
.text                                                                     0x94    0x8000130
.rodata                                                                    0x0    0x80001c4
.stack                                                                  0x4ff8   0x20000000
.bss                                                                       0x4   0x20004ff8
.data                                                                      0x4   0x20004ffc
```

Note that no overlap exists in this case and that the call stack size has reduced to accommodate the
.bss and .data sections.

Unlike #41 the addresses of static variables is now correct:

``` console
$ arm-none-eabi-objdump -CD app
Disassembly of section .vector_table:

08000000 <_svector_table>:
 8000000:       20004ff8        strdcs  r4, [r0], -r8 ; initial Stack Pointer

08000004 <cortex_m_rt::RESET_VECTOR>:
 8000004:       08000131        stmdaeq r0, {r0, r4, r5, r8}

08000008 <EXCEPTIONS>:
 8000008:       080001bd        stmdaeq r0, {r0, r2, r3, r4, r5, r7, r8}
 (..)

Disassembly of section .stack:

20000000 <.stack>:
        ...

Disassembly of section .bss:

20004ff8 <cortex_m_quickstart::main::X>:
20004ff8:       00000000        andeq   r0, r0, r0

Disassembly of section .data:

20004ffc <_sdata>:
20004ffc:       00000001        andeq   r0, r0, r1
```

closes #34

[swap-ld]: https://github.com/japaric/swap-ld
@japaric
Copy link
Member Author

japaric commented Nov 9, 2017

Closing in favor of #43, which actually works.

@japaric japaric closed this Nov 9, 2017
japaric added a commit that referenced this pull request Nov 25, 2017
This is one possible solution to the stack overflow problem described in #34. This approach uses a
linker wrapper, called [swap-ld], to generate the desired memory layout. See #34 for a description
of the desired memory layout and #41 for a description of how `swap-ld` works.

The observable effects of this change in cortex-m programs are:

- the `_sbss` symbol is now override-able.
- there is now a `.stack` linker section that denotes the span of the call stack. `.stack` won't be
  loaded into the program; it just exists for informative purposes (`swap-ld` uses this
  information).

Given the following program:

``` rust
fn main() {
    static mut X: u32 = 0;
    static mut Y: u32 = 1;

    loop {
        unsafe {
            ptr::write_volatile(&mut X, X + 1);
            ptr::write_volatile(&mut Y, Y + 1);
        }
    }
}
```

If you link this program using the `arm-none-eabi-ld` linker, which is the cortex-m-quickstart
default, you'll get the following memory layout:

``` console
$ console
section                                                                   size         addr
.vector_table                                                            0x130    0x8000000
.text                                                                     0x94    0x8000130
.rodata                                                                    0x0    0x80001c4
.stack                                                                  0x5000   0x20000000
.bss                                                                       0x4   0x20000000
.data                                                                      0x4   0x20000004
```

Note how the space reserved for the stack (depicted by the `.stack` linker section) overlaps with
the space where .bss and .data reside.

If you, instead, link this program using `swap-ld` you'll get the following memory layout:

``` console
$ arm-none-eabi-size -Ax app
section                                                                   size         addr
.vector_table                                                            0x130    0x8000000
.text                                                                     0x94    0x8000130
.rodata                                                                    0x0    0x80001c4
.stack                                                                  0x4ff8   0x20000000
.bss                                                                       0x4   0x20004ff8
.data                                                                      0x4   0x20004ffc
```

Note that no overlap exists in this case and that the call stack size has reduced to accommodate the
.bss and .data sections.

Unlike #41 the addresses of static variables is now correct:

``` console
$ arm-none-eabi-objdump -CD app
Disassembly of section .vector_table:

08000000 <_svector_table>:
 8000000:       20004ff8        strdcs  r4, [r0], -r8 ; initial Stack Pointer

08000004 <cortex_m_rt::RESET_VECTOR>:
 8000004:       08000131        stmdaeq r0, {r0, r4, r5, r8}

08000008 <EXCEPTIONS>:
 8000008:       080001bd        stmdaeq r0, {r0, r2, r3, r4, r5, r7, r8}
 (..)

Disassembly of section .stack:

20000000 <.stack>:
        ...

Disassembly of section .bss:

20004ff8 <cortex_m_quickstart::main::X>:
20004ff8:       00000000        andeq   r0, r0, r0

Disassembly of section .data:

20004ffc <_sdata>:
20004ffc:       00000001        andeq   r0, r0, r1
```

closes #34

[swap-ld]: https://github.com/japaric/swap-ld
japaric added a commit that referenced this pull request Feb 17, 2018
This is one possible solution to the stack overflow problem described in #34. This approach uses a
linker wrapper, called [swap-ld], to generate the desired memory layout. See #34 for a description
of the desired memory layout and #41 for a description of how `swap-ld` works.

The observable effects of this change in cortex-m programs are:

- the `_sbss` symbol is now override-able.
- there is now a `.stack` linker section that denotes the span of the call stack. `.stack` won't be
  loaded into the program; it just exists for informative purposes (`swap-ld` uses this
  information).

Given the following program:

``` rust
fn main() {
    static mut X: u32 = 0;
    static mut Y: u32 = 1;

    loop {
        unsafe {
            ptr::write_volatile(&mut X, X + 1);
            ptr::write_volatile(&mut Y, Y + 1);
        }
    }
}
```

If you link this program using the `arm-none-eabi-ld` linker, which is the cortex-m-quickstart
default, you'll get the following memory layout:

``` console
$ console
section                                                                   size         addr
.vector_table                                                            0x130    0x8000000
.text                                                                     0x94    0x8000130
.rodata                                                                    0x0    0x80001c4
.stack                                                                  0x5000   0x20000000
.bss                                                                       0x4   0x20000000
.data                                                                      0x4   0x20000004
```

Note how the space reserved for the stack (depicted by the `.stack` linker section) overlaps with
the space where .bss and .data reside.

If you, instead, link this program using `swap-ld` you'll get the following memory layout:

``` console
$ arm-none-eabi-size -Ax app
section                                                                   size         addr
.vector_table                                                            0x130    0x8000000
.text                                                                     0x94    0x8000130
.rodata                                                                    0x0    0x80001c4
.stack                                                                  0x4ff8   0x20000000
.bss                                                                       0x4   0x20004ff8
.data                                                                      0x4   0x20004ffc
```

Note that no overlap exists in this case and that the call stack size has reduced to accommodate the
.bss and .data sections.

Unlike #41 the addresses of static variables is now correct:

``` console
$ arm-none-eabi-objdump -CD app
Disassembly of section .vector_table:

08000000 <_svector_table>:
 8000000:       20004ff8        strdcs  r4, [r0], -r8 ; initial Stack Pointer

08000004 <cortex_m_rt::RESET_VECTOR>:
 8000004:       08000131        stmdaeq r0, {r0, r4, r5, r8}

08000008 <EXCEPTIONS>:
 8000008:       080001bd        stmdaeq r0, {r0, r2, r3, r4, r5, r7, r8}
 (..)

Disassembly of section .stack:

20000000 <.stack>:
        ...

Disassembly of section .bss:

20004ff8 <cortex_m_quickstart::main::X>:
20004ff8:       00000000        andeq   r0, r0, r0

Disassembly of section .data:

20004ffc <_sdata>:
20004ffc:       00000001        andeq   r0, r0, r1
```

closes #34

[swap-ld]: https://github.com/japaric/swap-ld
rukai pushed a commit to rukai/cortex-m-rt that referenced this pull request May 1, 2021
41: Bumped linked_list_allocator and added `const_mut_refs` feature r=jonas-schievink a=Spadi0

Fixes rust-embedded#40

Co-authored-by: Sam Jones <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

(Zero cost) stack overflow protection
3 participants