Editing a binary #639
Comments
It's complicated :) fq at the moment has very limited editing support for "decode values". The support is limited partly because I haven't needed it myself and partly because it's complex for some of the formats supported by fq. For a lot of formats it's not really clear what should happen on an update: encode with the same encoding, but what about encodings that are ambiguous, like varints, which can encode a value in many ways with different sizes? Should the size be preserved or truncated? Should checksums be updated? What about fields that control the number of entries in an array? Also, fq has support for sub-buffers for demuxing, TCP reassembly etc... yeah, you see :) But maybe the "clearest" would be to just support updating a specific bit/byte range using some bit-size/endian helpers etc. And you can kind of do this already using the slicing support, for example, to do an update:
# this assumes the entry is 64 bits
$ fq '(.header.entry | tobytesrange) as $e | tobytes | [.[:$e.start],0,1,2,3,4,5,6,7,.[$e.stop:]] | tobytes | elf | .header' some_elf
# or to write it out to a file
$ fq '(.header.entry | tobytesrange) as $e | tobytes | [.[:$e.start],0,1,2,3,4,5,6,7,.[$e.stop:]] | tobytes' some_elf > changed
This uses slicing, just normal
Also the difference between
That said, all of this can probably be improved in many ways; let me know your ideas.
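For readers who want to see the cut-and-stitch idea outside fq, here is a minimal Python sketch of the same operation: keep everything before the range, drop in the new bytes, keep everything after. The `splice` helper and the 16-byte buffer are hypothetical and only illustrate the technique; they are not part of fq.

```python
def splice(data: bytes, start: int, stop: int, replacement: bytes) -> bytes:
    """Return a copy of data with bytes [start, stop) replaced by replacement."""
    if not 0 <= start <= stop <= len(data):
        raise ValueError("byte range is outside the buffer")
    return data[:start] + replacement + data[stop:]


# Hypothetical buffer with a 64-bit field occupying bytes 4..11.
original = bytes(range(16))

# Big-endian encoding of the new value gives the bytes 0,1,2,3,4,5,6,7,
# the same literal bytes used in the fq example.
new_field = (0x0001020304050607).to_bytes(8, "big")

patched = splice(original, 4, 12, new_field)
```

Because the replacement has the same length as the range it replaces, nothing after the field moves. That is what makes this kind of edit safe without re-encoding the rest of the file.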
Yeah, that's really nice that you can use tobytesrange in that way -- definitely a missing recipe in the docs in the interim! A next small step would be to provide an ergonomic way to inject bytes. The ultimate vision would be to be able to update any value in any format and then propagate that change to anything else in the binary that needs to be updated to make it semantically correct. I'm guessing though that this is difficult to impossible, in the most extreme case requiring essentially a recompilation of the binary. Imagine, for my ELF patching use case(s), changing the length of a string: that changes the offsets of everything else in the section, so all absolute addresses referring to points after that string may need changing, and those new addresses might no longer be representable with the same sequences of instructions in the binary, which would need propagating, and so on.
Yeap, you describe my current plan quite well: some kind of helpers for cut/stitch and encodings. I had some momentum and motivation for a while to work on it, but I think I got stuck on how to make it feel jq-like and how not to "pollute" the namespace too much with lots of small functions etc. One thing I've thought about is that fq already has some machinery to do query rewrites, so it's possible to "extend" the jq language a bit if that would help. About ultimate visions, I think you summarise the problem quite well too. You more or less end up having to write a transmuxer, linker etc., and one that should handle and preserve lots of strange things, or should it "normalize"? :) I've thought some about different ways, I'll list them:
Also, whichever approach is chosen needs to fit well with how jq works.
How about adding a "Big thing" TODO about binary modifications that may as well modify length?
Yeap, that is a good idea, and it could link to this issue as well. Could you clarify what you mean by "may as well modify length"? Is it about whether the modification changes the length of the thing being modified?
I don't think I have a use case right now, but the idea is as follows. Let's take the fMP4 container. The "ftyp" box contains a list of "brands". Assume adding a "brand" to this list of 4-char identifiers. This operation would change the length of the binary representation of the list. The box containing the list would also grow in size. The boxes that follow the "ftyp" box would change their position (start+=4). Basically, the idea is to allow this sort of manipulation: not just replacing a few bytes but also doing inserts and deletes.
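The ftyp scenario can be sketched in a few lines of Python. This is a simplified illustration, not fq functionality: it assumes the common ftyp layout (32-bit big-endian size, the type "ftyp", a major brand, a minor version, then 4-byte compatible brands) and ignores 64-bit box sizes. The helper names and the brand values are made up for the example.

```python
import struct


def build_ftyp(major: bytes, minor_version: int, compatible: list) -> bytes:
    """Build an ftyp box: size(4) + b'ftyp' + major(4) + minor(4) + brands(4 each)."""
    payload = major + struct.pack(">I", minor_version) + b"".join(compatible)
    return struct.pack(">I", 8 + len(payload)) + b"ftyp" + payload


def add_brand(ftyp: bytes, brand: bytes) -> bytes:
    """Append one compatible brand and bump the box size field by 4."""
    size, box_type = struct.unpack(">I4s", ftyp[:8])
    if box_type != b"ftyp" or size != len(ftyp) or len(brand) != 4:
        raise ValueError("expected a complete ftyp box and a 4-byte brand")
    return struct.pack(">I", size + 4) + ftyp[4:] + brand


before = build_ftyp(b"iso5", 0, [b"iso5", b"dash"])
after = add_brand(before, b"cmfc")

# Any box that followed ftyp now starts 4 bytes later.
shift = len(after) - len(before)  # 4
```

The insert itself is easy. The hard part is everything that is not local to the box, for example absolute offsets stored elsewhere in the file that still point at the old positions.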
Ok, I see. Yeah, that would be nice, but I'm not sure how one would do it, and I have thought about it quite a lot. For example, in the mp4 case, if the brands list change affects the size of the ftyp box, then all boxes after it will move, which in turn will most likely affect offsets in stco boxes and so on if the file should still be playable. So to support that kind of thing, my guess is that one would have to write an encoder for each format that wants to support it (nearly an mp4 muxer in this case). But encoding also creates other issues and ambiguities: should an encoder try to "preserve" number, string etc. encodings that can encode the same value in multiple ways, or normalize (varints for example)? Would assigning to a field that has a symbolic mapping do the reverse mapping back? Lots of questions :)
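To make the "preserve or normalize" question concrete, here is a small Python sketch using unsigned LEB128 as one example of a varint. That choice is an assumption made for illustration; the formats fq decodes use several varint flavours. The `min_len` parameter is a hypothetical knob showing how an encoder could keep the original byte length instead of emitting the shortest form.

```python
def decode_uleb128(data: bytes) -> tuple:
    """Decode one unsigned LEB128 value; return (value, bytes_consumed)."""
    value = 0
    shift = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")


def encode_uleb128(value: int, min_len: int = 1) -> bytes:
    """Encode value, padding with continuation bytes up to min_len bytes."""
    if value < 0:
        raise ValueError("unsigned encoding only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value or len(out) + 1 < min_len:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)  # final byte, continuation bit clear
            return bytes(out)


shortest = encode_uleb128(1)           # b"\x01"
padded = encode_uleb128(1, min_len=2)  # b"\x81\x00"

# Both byte strings decode to the same value but have different lengths.
same_value = decode_uleb128(shortest)[0] == decode_uleb128(padded)[0]
```

An editor that always writes the shortest form would silently change the size of a padded field, while one that preserves the length needs to remember how many bytes the original used.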
Agree. Likely the whole thing would look like a "manual muxer" specific to every format. Specificity is not an issue per se, as every format is already custom. I was thinking about e.g. the sidx box. In the above case it would become invalid, so one would need either to manually patch its values, or
Yeap, collecting use cases sounds good. I've mostly used the technique I mentioned in a comment above to stitch things together. I wonder if one could come up with some helper function(s) to make that easier.
Hi,
With jq I'm used to being able to edit a json document in-situ:
I figure you should be able to do the same sort of thing with fq with binaries, but if so, it's not documented how to do it exactly.
For example:
... at the moment this prints out a json representation of the elf. The closest thing I saw in the documentation was the "Add query parameter to URL" in the screenshot at the beginning of the readme, but of course there is no toelf.
tobytes reports: fq: value can't be a binary.
So, is there a way to edit fields and then dump it out in the original format, or is this not possible currently? Thanks!