Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tls export symbols #546

Merged
merged 14 commits into from
Mar 15, 2023
Merged

Tls export symbols #546

merged 14 commits into from
Mar 15, 2023

Conversation

palainp
Copy link
Contributor

@palainp palainp commented Feb 7, 2023

This PR is an follow-up of the previous TLS PR #542 . I'm currently not sure if the exported API is clear enough or makes sense at all, I'd feel better if someone could comment that.

Now we can export TDATA as the address of the the thread variables' initialization values, LTDATA the length of the tdata section, and LTBSS the length of the tbss section.
This way, we can allocate, for each thread, an area of memory (the size is calculated with solo5_tls_size()), set the tp pointer to its value with solo5_tls_tp_offset(.), and copy the initial value of tdata into our new memory solo5_tls_data_offset(.). As an example, we can do:

void* tcb1;                                      // defines a TLS block
tcb1 = calloc(solo5_tls_size(), sizeof(char));   // gets memory for it, set tbss to 0
memcpy(solo5_tls_data_offset(tcb1), TDATA, LTDATA);       // copy tdata
solo5_set_tls_base(solo5_tls_tp_offset((uintptr_t)tcb1);  // sets tp ptr

I've come across 2 bugs, the first is that if tdata is not a PT_LOAD section, it is not loaded by the tender, so the initial values of tdata are not copied into memory, the second is that memcpy can be inlined and cause errors in our TLS use case (I haven't investigated this much so far, I'm going to try it out with ocaml-solo5 to see if this could be a bigger issue).

@palainp palainp marked this pull request as draft February 7, 2023 14:06
@palainp
Copy link
Contributor Author

palainp commented Feb 7, 2023

Oups it seems that something is wrong between my laptop and CI, I'll check that and update. Sorry :/

@palainp
Copy link
Contributor Author

palainp commented Feb 7, 2023

The virtio failure (probably the same for xen) will be a bit harder to fix.

The multiboot headers, in bindings/{virtio,xen}/boot.S, defined as https://www.gnu.org/software/grub/manual/multiboot/multiboot.html#Header-address-fields, should explicit the end of the tdata segment (tdata have to be copied too).
I tried to fix that but I don't understand yet how is calculated the offset from %fs for both virtio and xen (it's very much not the same as for the other targets, this may be an error in the linker script again).

@palainp
Copy link
Contributor Author

palainp commented Feb 8, 2023

Update: I'm starting to understand how ld and lld work with TLS. So far it seems that ld uses the address in the elf binary, and lld uses the memsize addresses, so with a linker script like :

...
    .tdata { *(.tdata) } :tdata
    . = ALIGN(CONSTANT(MAXPAGESIZE));
    _edata = .;
    .tbss { *(.tbss) } :tbss
...

We have with readelf -lW:

Elf file type is EXEC (Executable file)
Entry point 0x100020
There are 7 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
...
  TLS            0x005018 0x0000000000104018 0x0000000000104018 0x000008 0x011068 R   0x20
...

FileSiz is the actual size of the tdata section in the elf (1 8B value in tdata as tbss should be set to 0 at runtime), MemSiz is the size of the sections in memory (not sure yet how this is calculated).
We have an offset :

  • about 1 page from %fs for tdata (because we have an ALIGN directive in the linker script, it was mandatory to set the _edata symbol), and enough bytes for the %fs tbss thread variables for the tbss section with ld
  • a MemSiz offset for tdata, and a (MemSiz - pagesize, because ALIGN) for the tbss variables with lld

I'll try to figure out if I can give an option to ld or lld to force similar behaviour regarding the offsets.

section which is not a PT_LOAD but a PT_TLS section (and pass it in
the case where there is no __thread variables (p_filesz==0))
- align correctly the tdata section (and set it read-only) to match
the expectation of the tenders elf reader
- add a bss section in the elf header as lld can merge that section
with the previous and calculate TLS offsets badly
- now tbss and bss have no RWX permissions (this is important for
tbss to be distinct to tdata and not loaded by the elf reader)
@palainp
Copy link
Contributor Author

palainp commented Feb 8, 2023

To my understanding, the big memsize in the readelf output was due to bss being merged by lld with the previous sections. So I added a bss section in the elf header which solve that part.
Now I have something more comprehensible:

...
  LOAD           0x001000 0x0000000000100000 0x0000000000100000 0x003f44 0x003f44 R E 0x1000 # <- this is text
  LOAD           0x005000 0x0000000000104000 0x0000000000104000 0x000018 0x000018 RW  0x1000 # <- this is data
  TLS            0x006000 0x0000000000105000 0x0000000000105000 0x001000 0x001000 R   0x8 # <- this is tdata
  TLS            0x005018 0x0000000000106000 0x0000000000106000 0x000000 0x000008     0x8 # <- this is tbss
  NULL           0x000000 0x0000000000106000 0x0000000000106000 0x000000 0x010080     0x20 # <- this is bss
...

It works on my linux laptop with ld.lld but still fails with {Free,Open}BSD CI. I'll try to re-install a BSD on a spare laptop to understand what is the failure with CI.

…instead of macros for TLS

- update stubs with the new API
- update test_tls.c with the new API
@palainp
Copy link
Contributor Author

palainp commented Feb 11, 2023

Before the last commits, I didn't set any value for what was (before this PR) the tp field in struct tcb (e.g. nothing like tcb1.tp = &tcb1.tp). I just set the TCB address to a correct value with solo5_set_tls_base, and it seems that linux is ok with that (having an address for its %fs register), but BSD does an indirection on that address and so has %fs set to 0 (which explains the page fault).

That different behavior seems peculiar to me, I'll search the internet for something about that.

Now I also use some function to have compiler type checking and hiding the TCB/TLS details outside solo5, but it force me to have some casts in tls.c between void* and uintptr_t.

bindings/tls.c Outdated Show resolved Hide resolved
@palainp
Copy link
Contributor Author

palainp commented Feb 22, 2023

The last thing I'm worried about is the need for . = ALIGN(CONSTANT(MAXPAGESIZE)); in the .tdata section. I haven't found how to push the alignment out of the section yet, this is due to some tests in

and later (I haven't kept notes on which tests failed exactly :/ ).

However the consequences are limited to a slightly oversized binary (1 page max) as well for the memcpy call when copying .tdata into the TLS thread.

@palainp palainp marked this pull request as ready for review February 23, 2023 15:43
bindings/tls.c Outdated Show resolved Hide resolved
include/solo5.h Outdated Show resolved Hide resolved
@hannesm
Copy link
Contributor

hannesm commented Mar 14, 2023

Thank you for this line of work. I reviewed the code, and think it is in good shape -- i.e. the minimal effort to get this up and running :)

To me, it looks like while this extends the solo5 API and codebase slightly, it adds support for thread local storage (which is required for OCaml 5). The changes to the linker scripts and C codebase look fine to me, I really appreciate that the test case (test_tls) has been reformulated to use the freshly developed API.

From an attacker point of view, I was curious about tls_init -- so it receives a pointer and will memcpy some memory starting from that pointer address. But I couldn't figure what a nicer API would be -- the caller can't provide a proof that starting at the pointer there will be sufficient memory (of tls_size) available.

So, thanks again for this work. I'd be happy if this would be merged and released as part of the next solo5 release :)

@palainp
Copy link
Contributor Author

palainp commented Mar 14, 2023

Thank you very much for your comment. I've updated the indentation and the comment in solo5.h.

Right now, the API update shouldn't break anything, but you're right, there is now a piece of code that tries to write somewhere without checking that the destination size is sufficient. To me the only way around to solve this potential issue seems to bring back memory allocation into solo5 to keep control over the malloc(solo5_tls_size()) call.

And if it is merged and released, it will be possible to update ocaml-solo5 to use the new API as in test_tls.c

@palainp
Copy link
Contributor Author

palainp commented Mar 14, 2023

I was also thinking of another possibility which might be to export the address and length of .tdata plus an offset into the thread memory and delegate the memcpy to the caller. As solo5 does not use the __thread variable, this should be safe.

@dinosaure
Copy link
Collaborator

I am indeed inclined to merge this PR and do a release. The comment from @hannesm confirms a double code review and I'm a bit more confident about the next step. Regarding the security aspect, the idea of wrapping the area with READ_ONLY segments is not possible?

@dinosaure
Copy link
Collaborator

From a discussion on the MirageOS meeting (which will be public on the MirageOS mailing list), we agreed that the PR is ready to be merge from different point of views. Thanks for your work @palainp, it paves the way for the support of OCaml 5.

@dinosaure dinosaure merged commit 929fd31 into Solo5:master Mar 15, 2023
@palainp palainp deleted the tls-export-symbols branch March 15, 2023 13:58
@dinosaure dinosaure mentioned this pull request Mar 22, 2023
@Kensan Kensan mentioned this pull request Apr 16, 2023
dinosaure added a commit that referenced this pull request Apr 25, 2023
According to #542 & #546, we change the memory layout of unikernels (and added
sections needed for the TLS support). We must assert an incompatibility between
solo5.0.7.* and solo5.0.8.0 and be sure that old tenders will not be used with
new built unikernels.

Co-authored-by: Hannes Mehnert <[email protected]>
dinosaure added a commit to dinosaure/opam-repository that referenced this pull request Apr 25, 2023
* Be able to build `spt`, `virtio`, `muen` and `xen` targets on OpenBSD
  (@adamsteen, Solo5/solo5#544). This change does not allow us to "run" these
  targets on OpenBSD
* Fix linker scripts with TLS (Thread Local Storage) sections
  (@palainp, @hannesm, @dinosaure, Solo5/solo5#542)
* Export TLS symbols (@palainp, @hannesm, @dinosaure, Solo5/solo5#546)
  **breaking change** due to Solo5/solo5#542 & Solo5/solo5#546, tenders must be
  **upgraded**. Indeed, solo5.0.7.* tenders will not be able to load correctly
  unikernels compiled with solo5.0.8.0. The internal ABI version for
  `solo5-hvt`/`solo5-spt` was upgraded accordingly.

  This version implements Thread Local Storage. The user can initialise a TLS
  block with `solo5_tls_init` on a pointer to `solo5_tls_size()` free bytes.
  Then, the user is able to set the `tp` (Thread Pointer) pointer via
  `solo5_set_tls_base(solo5_tls_tp_offset(tls_block))`. More details are
  available into `solo5.h`.

  Note: this change does not allow a Solo5 unikernel to use multiple cores! It
  only provides an API required by OCaml 5 / pthread to launch, at most, one
  thread.
* Fix tests reported by NixOS (@greydot, @ehmry, Solo5/solo5#547)
* Split out the `time.c` implementation between Muen and HVT
  (@dinosaure, @Kensan, Solo5/solo5#552)
* User hypercall instead of TSC-based clock when the user asks for the
  wall-clock (@dinosaure, @reynir, Solo5/solo5#549, Solo5/solo5#550)

  Note: only hvt & virtio are updated to avoid a clock drift on the wall-clock.
  Indeed, when the unikernel is suspended, the wall-clock is not updated. Muen
  & Xen still use a TSC-based wall-clock. The spt target was already in sync
  with the host's wall-clock.
* Improve the muen clock (@Kensan, Solo5/solo5#553)
* Fix the `.bss` section according to Solo5/solo5#542 & Solo5/solo5#546. The
  `.bss` section is tagged with `PT_LOAD`. Tenders are available to load this
  section properly. (@Kensan, @dinosaure, Solo5/solo5#551, Solo5/solo5#554)
* Fix the cross-compilation of Solo5 for `aarch64`
  (@dinosaure, @palainp, @hannes, Solo5/solo5#555)
dinosaure added a commit to dinosaure/opam-repository that referenced this pull request Apr 25, 2023
  (@adamsteen, Solo5/solo5#544). This change does not allow us to "run" these
  targets on OpenBSD
* Fix linker scripts with TLS (Thread Local Storage) sections
  (@palainp, @hannesm, @dinosaure, Solo5/solo5#542)
* Export TLS symbols (@palainp, @hannesm, @dinosaure, Solo5/solo5#546)
  **breaking change** due to Solo5/solo5#542 & Solo5/solo5#546, tenders must be
  **upgraded**. Indeed, solo5.0.7.* tenders will not be able to load correctly
  unikernels compiled with solo5.0.8.0. The internal ABI version for
  `solo5-hvt`/`solo5-spt` was upgraded accordingly.

  This version implements Thread Local Storage. The user can initialise a TLS
  block with `solo5_tls_init` on a pointer to `solo5_tls_size()` free bytes.
  Then, the user is able to set the `tp` (Thread Pointer) pointer via
  `solo5_set_tls_base(solo5_tls_tp_offset(tls_block))`. More details are
  available into `solo5.h`.

  Note: this change does not allow a Solo5 unikernel to use multiple cores! It
  only provides an API required by OCaml 5 / pthread to launch, at most, one
  thread.
* Fix tests reported by NixOS (@greydot, @ehmry, Solo5/solo5#547)
* Split out the `time.c` implementation between Muen and HVT
  (@dinosaure, @Kensan, Solo5/solo5#552)
* User hypercall instead of TSC-based clock when the user asks for the
  wall-clock (@dinosaure, @reynir, Solo5/solo5#549, Solo5/solo5#550)

  Note: only hvt & virtio are updated to avoid a clock drift on the wall-clock.
  Indeed, when the unikernel is suspended, the wall-clock is not updated. Muen
  & Xen still use a TSC-based wall-clock. The spt target was already in sync
  with the host's wall-clock.
* Improve the muen clock (@Kensan, Solo5/solo5#553)
* Fix the `.bss` section according to Solo5/solo5#542 & Solo5/solo5#546. The
  `.bss` section is tagged with `PT_LOAD`. Tenders are available to load this
  section properly. (@Kensan, @dinosaure, Solo5/solo5#551, Solo5/solo5#554)
* Fix the cross-compilation of Solo5 for `aarch64`
  (@dinosaure, @palainp, @hannes, Solo5/solo5#555)
dinosaure added a commit to dinosaure/opam-repository that referenced this pull request Apr 28, 2023
* Be able to build `spt`, `virtio`, `muen` and `xen` targets on OpenBSD
  (@adamsteen, Solo5/solo5#544). This change does not allow us to "run" these
  targets on OpenBSD
* Fix linker scripts with TLS (Thread Local Storage) sections
  (@palainp, @hannesm, @dinosaure, Solo5/solo5#542)
* Export TLS symbols (@palainp, @hannesm, @dinosaure, Solo5/solo5#546)
  **breaking change** due to Solo5/solo5#542 & Solo5/solo5#546, tenders must be
  **upgraded**. Indeed, solo5.0.7.* tenders will not be able to load correctly
  unikernels compiled with solo5.0.8.0. The internal ABI version for
  `solo5-hvt`/`solo5-spt` was upgraded accordingly.

  This version implements Thread Local Storage. The user can initialise a TLS
  block with `solo5_tls_init` on a pointer to `solo5_tls_size()` free bytes.
  Then, the user is able to set the `tp` (Thread Pointer) pointer via
  `solo5_set_tls_base(solo5_tls_tp_offset(tls_block))`. More details are
  available into `solo5.h`.

  Note: this change does not allow a Solo5 unikernel to use multiple cores! It
  only provides an API required by OCaml 5 / pthread to launch, at most, one
  thread.
* Fix tests reported by NixOS (@greydot, @ehmry, Solo5/solo5#547)
* Split out the `time.c` implementation between Muen and HVT
  (@dinosaure, @Kensan, Solo5/solo5#552)
* User hypercall instead of TSC-based clock when the user asks for the
  wall-clock (@dinosaure, @reynir, Solo5/solo5#549, Solo5/solo5#550)

  Note: only hvt & virtio are updated to avoid a clock drift on the wall-clock.
  Indeed, when the unikernel is suspended, the wall-clock is not updated. Muen
  & Xen still use a TSC-based wall-clock. The spt target was already in sync
  with the host's wall-clock.
* Improve the muen clock (@Kensan, Solo5/solo5#553)
* Fix the `.bss` section according to Solo5/solo5#542 & Solo5/solo5#546. The
  `.bss` section is tagged with `PT_LOAD`. Tenders are available to load this
  section properly. (@Kensan, @dinosaure, Solo5/solo5#551, Solo5/solo5#554)
* Fix the cross-compilation of Solo5 for `aarch64`
  (@dinosaure, @palainp, @hannesm, Solo5/solo5#555)
* Increase the Muen ABI (2 to 3) due to TLS changes (@Kensan, ocaml#557)
* Support lifecycle management for Muen (@Kensan, ocaml#557)
  The user is able to configure automatic restarting of unikernels that invokes
  `solo5_ext()`
* Fix the `test_fpu` test & ensure the alignment of variables (@Kensan, ocaml#557)
dinosaure added a commit that referenced this pull request Apr 28, 2023
According to #542 & #546, we change the memory layout of unikernels (and added
sections needed for the TLS support). We must assert an incompatibility between
solo5.0.7.* and solo5.0.8.0 and be sure that old tenders will not be used with
new built unikernels.

Co-authored-by: Hannes Mehnert <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants