Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UnwindContext allocation cannot be reused across multiple unwinds with changing data lifetimes #701

Closed
mstange opened this issue Mar 11, 2024 · 1 comment · Fixed by #703

Comments

@mstange
Copy link
Contributor

mstange commented Mar 11, 2024

The docs for UnwindContext say:

"To avoid re-allocating the context multiple times when evaluating multiple CFI programs, it can be reused."

However, reusing the context can be tricky: If your unwind data is stored in a slice with a limited lifetime, and you use EndianSlice<'mylifetime> as the Reader, then the unwind context cannot outlive that lifetime.

Example program (does not compile):

use gimli::{BaseAddresses, EhFrame, EndianSlice, NativeEndian, UnwindContext, UnwindSection};
use std::collections::HashMap;

#[derive(Default)]
struct Unwinder {
    eh_frame_data_for_module: HashMap<String, Vec<u8>>,
    unwind_context: UnwindContext<EndianSlice<'static, NativeEndian>>,
}

impl Unwinder {
    fn add_module(&mut self, module_name: impl Into<String>, eh_frame_data: Vec<u8>) {
        self.eh_frame_data_for_module
            .insert(module_name.into(), eh_frame_data);
    }
    fn remove_module(&mut self, module_name: &str) {
        self.eh_frame_data_for_module.remove(module_name);
    }
    fn unwind_in_module(&mut self, module_name: &str, address: u64) {
        let eh_frame_data = self.eh_frame_data_for_module.get(module_name).unwrap();
        let eh_frame = EhFrame::new(&eh_frame_data, NativeEndian);
        let bases = BaseAddresses::default();
        let _unwind_info = eh_frame
            .unwind_info_for_address(
                &bases,
                &mut self.unwind_context, // ERROR: can't be used due to eh_frame_data's lifetime
                address,
                EhFrame::cie_from_offset,
            )
            .unwrap();
    }
}

fn main() {
    let mut unwinder = Unwinder::default();
    unwinder.add_module("lib1.so", vec![0, 1, 2, 3]);
    unwinder.unwind_in_module("lib1.so", 0x1234);
    unwinder.add_module("lib2.so", vec![0, 1, 2, 3]);
    unwinder.unwind_in_module("lib2.so", 0x5678);
}

The two remedies I know of are:

  1. If you do not own the data, then you have to create a new UnwindContext every time. Ideally you have some scope inside which you can do as many batched unwinds as possible, so that you can at least reuse the context to some degree.
  2. If you have (shared or full) ownership over the data, then you can create an EndianRcSlice or your own Rc-based Reader implementation which makes the data live as long as necessary.

With the Rc solution, the UnwindContext may keep your data alive for longer than you anticipated because the context is not reset at the end of each use - it's only reset at the start of the next use.

Furthermore, if you're writing an unwinding library which wraps gimli and you allow the user to provide custom OwnedData types, and your library uses EndianReader to create a custom Reader implementation which wraps the user-provided OwnedData type, then you have to take extra care to make sure that the type you give to EndianReader implements StableDeref correctly (your type must guarantee that it derefs to the same address every time even if the OwnedData type does not, or alternatively you have to put a StableDeref trait bound on your library's OwnedData trait).


It would be great if we could find a way to reuse the allocations inside UnwindContext in such a way that the same allocation can be used with EndianSlice readers of different lifetimes.

Could we add a reset_and_recycle method on UnwindContext?

    /// Tries to reuse the allocation if the layouts of `R2` and `R` match
    fn reset_and_recycle<R2: Reader>(self) -> UnwindContext<R2, A> {
        let mut ctx = UnwindContext {
            stack: self.stack.clear_and_recycle(),
            initial_rule: None,
            is_initialized: false,
        };
        ctx.reset();
        ctx
    }

Then the example above would become:

struct Unwinder {
    eh_frame_data_for_module: HashMap<String, Vec<u8>>,
    unwind_context: Option<UnwindContext<EndianSlice<'static, NativeEndian>>>, // initialized to Some()
}

impl Unwinder {
    // ...
    fn unwind_in_module(&mut self, module_name: &str, address: u64) {
        let static_context = self.unwind_context.take().unwrap();
        let eh_frame_data: &[u8] = self.eh_frame_data_for_module.get(module_name).unwrap();
        let mut lifetime_limited_context = static_context.reset_and_recycle::<EndianSlice<_, NativeEndian>>();
        let eh_frame = EhFrame::new(&eh_frame_data, NativeEndian);
        let bases = BaseAddresses::default();
        let _unwind_info = eh_frame
            .unwind_info_for_address(
                &bases,
                &mut lifetime_limited_context,
                address,
                EhFrame::cie_from_offset,
            )
            .unwrap();
        let static_context = static_context.reset_and_recycle::<EndianSlice<'static, NativeEndian>>();
        self.unwind_context = Some(static_context);
    }
}
@philipc
Copy link
Collaborator

philipc commented Mar 12, 2024

Another idea would be to stop storing R in CallFrameInstruction and RegisterRule, and instead store the offset and length of the expression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants