Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Elf: Ignore bad PT_INTERP rather than completely failing #192

Merged
merged 1 commit into from
Nov 10, 2019

Conversation

jsgf
Copy link
Contributor

@jsgf jsgf commented Nov 10, 2019

It seems some tools leave a PT_INTERP Phdr in a
bad way when manipulating the ELF file (for example,
https://bugzilla.redhat.com/show_bug.cgi?id=1770513 where splitting debug
info into a separate file leaves a bogus PT_INTERP header). When using
gimli with object to parse DWARF info, the PT_INTERP is irrelevent
so failing to load the file because of it causes loss of functionality.

This PR just ignores the PT_INTERP if it can't be
parsed/loaded/used. Another option would be to keep it as an
Option<Result<_, _>> so that any errors can be preserved but deferred
until someone tries to use the interp information.

bad way when manipulating the ELF file (for example,
https://bugzilla.redhat.com/show_bug.cgi?id=1770513 where splitting debug
info into a separate file leaves a bogus PT_INTERP header). When using
`gimli` with `object` to parse DWARF info, the PT_INTERP is irrelevent
so failing to load the file because of it causes loss of functionality.

This PR just ignores the PT_INTERP if it can't be
parsed/loaded/used. Another option would be to keep it as an
`Option<Result<_, _>>` so that any errors can be preserved but deferred
until someone tries to use the interp information.
@jsgf jsgf changed the title It seems some tools leave a PT_INTERP Phdr in a Ignore bad PT_INTERP rather than completely failing Nov 10, 2019
@jsgf jsgf changed the title Ignore bad PT_INTERP rather than completely failing Elf: Ignore bad PT_INTERP rather than completely failing Nov 10, 2019
@m4b
Copy link
Owner

m4b commented Nov 10, 2019

Isn’t the executable fundamentally malformed then if the dynamic linker field is malformed?

@m4b
Copy link
Owner

m4b commented Nov 10, 2019

I guess we could change this but it’s a pretty big change semantically, also technically backwards incompatible but not too bad. What do other people think ?

@jsgf
Copy link
Contributor Author

jsgf commented Nov 10, 2019

Isn’t the executable fundamentally malformed then if the dynamic linker field is malformed?

Yes, but it isn't actually an executable - it's split debuginfo made with objcopy --only-keep-debug foo foo.debug so the phdrs are actually irrelevant. Only the debug sections are meaningful.

@m4b
Copy link
Owner

m4b commented Nov 10, 2019

Isn't that clearly just a bug in objcopy then? We can of course support things like that, I'm just not sure it's the right thing to do, supporting bugs of other tools; e.g., where does it it stop?

To be clear, in this case I'm inclined to merge this because failing the entire parse over a bad interpreter isn't necessarily reasonable, and it's a pretty minor change, but on the other hand, it is also validating a pretty serious bug in whatever generates the file in the first place.

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argghh sorry, I thought this change was making the Elf interpreter field String -> Option<String>; then I was like wait I thought we did that like years ago.

Sorry about this; yes I still think this is a bug in objcopy, and we should file over there, but your change is good and appropriate, sorry for the confusion!

@m4b m4b merged commit fbcd7c2 into m4b:master Nov 10, 2019
@m4b
Copy link
Owner

m4b commented Nov 10, 2019

ok published new 0.1.1, sorry for the confusion, and thanks for the PR! :D

@jsgf
Copy link
Contributor Author

jsgf commented Nov 10, 2019

Thanks for the quick turnaround! I have filed a bug against Fedora's objcopy (since that's where I saw it), but I haven't worked out if its a generic objcopy bug (or maybe its already been fixed).

@jsgf jsgf deleted the fix-bad-interp branch November 10, 2019 20:24
@le-jzr
Copy link
Contributor

le-jzr commented Nov 11, 2019

For what it's worth, I think often you really want to be able to work with (mildly) malformed executables. It doesn't matter who's to blame for the file not being up to spec when you just want to see its contents.

@philipc
Copy link
Collaborator

philipc commented Nov 11, 2019

A large part of the problem is that goblin tries to parse many things up front, even if the user doesn't need them. It would be better if there was a finer grained and lazy API, so that parsing errors are deferred until the user requests parsing of a specific field or structure, and the user can choose how to handle the error then. That would be a large change to goblin though. I've said this before in #120 (comment). For the purposes of the object crate's use of goblin, this is definitely something I want to fix, but I don't currently have the time for it.

@m4b
Copy link
Owner

m4b commented Nov 11, 2019

Yes I think lazy parsing is desirable. I was experimenting and even wrote a crate for a totally generic, parallel lazy data transducer/iterator, though since most of the goblin iterators are hand written and since impl trait it’s specific usecases for goblin are diminished (though I have found some applications very handy elsewhere):

https://crates.io/crates/lazy_transducer
Example PR: #69

As for other lazy aspects I think remaining are just some strings and vecs... vecs could easily be replaced with lazy transducer in an hour (probably less). Or custom iterator, or a thunk, or remove the field and just add an accessor method.

I think someone just needs to do the work.

I am somewhat skeptical of usecases for this though; this PR was definitely reasonable, but in general a malformed binary is going to end very badly and people should know sooner rather than later. In my experience almost all aspects of binaries are interconnected and flaws in one area will cause another to not work.

The only time I was kind of annoyed by goblin parsing everything was when writing faerie and outputting very malformed binaries; but running with RUST_LOG=debug helped me through, and it’s also a niche, finite, small limited work period (outputting malformed to non malformed) that I could see arguments being made to not be super lazy.

I’m hoping there’s a middle space, not completely lazy, but not totally eager either.

That being said, performance implications are definitely something worth thinking about, but we have no numbers (even on general performance!) so it’s hard to say.

I did some “at desk” perf stuff and I only saw very visible differences with lazy transducer (paralllel iterator) for symbols and relocs, versus not, on an absolutely monster binary, I think it had 10million symbols.

But that is again very niche. My guess is all parsing time will be always dominated by debug symbol parsing (when they’re present).

Anyway, to summarize:

  1. Yes I’m interested in more lazy goblin
  2. We should probably figure this out before 1.0
  3. Someone should present a survey of what they expect this to look like
  4. Id like some hard performance numbers before making any large sweeping changes.

Does that sound good ? :)

@jsgf
Copy link
Contributor Author

jsgf commented Nov 11, 2019

We should probably figure this out before 1.0

Goblin is useful and works well as-is. I'd be inclined to do small doc/API fixups (if any) and then call it 1.0. A full lazy/streaming/incremental API change can be 2.0. (FWIW: https://tonyarcieri.com/rust-2020-towards-a-1-0-crate-ecosystem)

@m4b
Copy link
Owner

m4b commented Nov 12, 2019

Haha I knew someone was going to post that blog post sometime ;)

Yea so I agree; probably punting 1.0 for lazy goblin will just mean it doesn’t hit 1.0 for another 6months 1 year.

I think that I want scroll to be 1.0, which probably the last remaining piece is willglyns PR which makes the writers take reference instead of owned version. This makes writing of large objects only inside of a box possible, among other things.

There were some potential ergonomic issues when using it I wanted to explore but I haven’t had time and the PR has bit rotted.

Anyway if I (or another Hero) can’t find the time we can just talk about releasing 1.0 as is, as you suggest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants