-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Elf: Ignore bad PT_INTERP rather than completely failing #192
Conversation
bad way when manipulating the ELF file (for example, https://bugzilla.redhat.com/show_bug.cgi?id=1770513 where splitting debug info into a separate file leaves a bogus PT_INTERP header). When using `gimli` with `object` to parse DWARF info, the PT_INTERP is irrelevent so failing to load the file because of it causes loss of functionality. This PR just ignores the PT_INTERP if it can't be parsed/loaded/used. Another option would be to keep it as an `Option<Result<_, _>>` so that any errors can be preserved but deferred until someone tries to use the interp information.
Isn’t the executable fundamentally malformed then if the dynamic linker field is malformed? |
I guess we could change this but it’s a pretty big change semantically, also technically backwards incompatible but not too bad. What do other people think ? |
Yes, but it isn't actually an executable - it's split debuginfo made with |
Isn't that clearly just a bug in objcopy then? We can of course support things like that, I'm just not sure it's the right thing to do, supporting bugs of other tools; e.g., where does it it stop? To be clear, in this case I'm inclined to merge this because failing the entire parse over a bad interpreter isn't necessarily reasonable, and it's a pretty minor change, but on the other hand, it is also validating a pretty serious bug in whatever generates the file in the first place. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
argghh sorry, I thought this change was making the Elf
interpreter field String
-> Option<String>
; then I was like wait I thought we did that like years ago.
Sorry about this; yes I still think this is a bug in objcopy, and we should file over there, but your change is good and appropriate, sorry for the confusion!
ok published new |
Thanks for the quick turnaround! I have filed a bug against Fedora's objcopy (since that's where I saw it), but I haven't worked out if its a generic objcopy bug (or maybe its already been fixed). |
For what it's worth, I think often you really want to be able to work with (mildly) malformed executables. It doesn't matter who's to blame for the file not being up to spec when you just want to see its contents. |
A large part of the problem is that |
Yes I think lazy parsing is desirable. I was experimenting and even wrote a crate for a totally generic, parallel lazy data transducer/iterator, though since most of the goblin iterators are hand written and since impl trait it’s specific usecases for goblin are diminished (though I have found some applications very handy elsewhere): https://crates.io/crates/lazy_transducer As for other lazy aspects I think remaining are just some strings and vecs... vecs could easily be replaced with lazy transducer in an hour (probably less). Or custom iterator, or a thunk, or remove the field and just add an accessor method. I think someone just needs to do the work. I am somewhat skeptical of usecases for this though; this PR was definitely reasonable, but in general a malformed binary is going to end very badly and people should know sooner rather than later. In my experience almost all aspects of binaries are interconnected and flaws in one area will cause another to not work. The only time I was kind of annoyed by goblin parsing everything was when writing faerie and outputting very malformed binaries; but running with RUST_LOG=debug helped me through, and it’s also a niche, finite, small limited work period (outputting malformed to non malformed) that I could see arguments being made to not be super lazy. I’m hoping there’s a middle space, not completely lazy, but not totally eager either. That being said, performance implications are definitely something worth thinking about, but we have no numbers (even on general performance!) so it’s hard to say. I did some “at desk” perf stuff and I only saw very visible differences with lazy transducer (paralllel iterator) for symbols and relocs, versus not, on an absolutely monster binary, I think it had 10million symbols. But that is again very niche. My guess is all parsing time will be always dominated by debug symbol parsing (when they’re present). Anyway, to summarize:
Does that sound good ? :) |
Goblin is useful and works well as-is. I'd be inclined to do small doc/API fixups (if any) and then call it 1.0. A full lazy/streaming/incremental API change can be 2.0. (FWIW: https://tonyarcieri.com/rust-2020-towards-a-1-0-crate-ecosystem) |
Haha I knew someone was going to post that blog post sometime ;) Yea so I agree; probably punting 1.0 for lazy goblin will just mean it doesn’t hit 1.0 for another 6months 1 year. I think that I want scroll to be 1.0, which probably the last remaining piece is willglyns PR which makes the writers take reference instead of owned version. This makes writing of large objects only inside of a box possible, among other things. There were some potential ergonomic issues when using it I wanted to explore but I haven’t had time and the PR has bit rotted. Anyway if I (or another Hero) can’t find the time we can just talk about releasing 1.0 as is, as you suggest |
It seems some tools leave a PT_INTERP Phdr in a
bad way when manipulating the ELF file (for example,
https://bugzilla.redhat.com/show_bug.cgi?id=1770513 where splitting debug
info into a separate file leaves a bogus PT_INTERP header). When using
gimli
withobject
to parse DWARF info, the PT_INTERP is irreleventso failing to load the file because of it causes loss of functionality.
This PR just ignores the PT_INTERP if it can't be
parsed/loaded/used. Another option would be to keep it as an
Option<Result<_, _>>
so that any errors can be preserved but deferreduntil someone tries to use the interp information.