-
Notifications
You must be signed in to change notification settings - Fork 962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade to Capstone release 4.0.2 with patch #1086
Conversation
@PeterMatula, please review this change. This change causes around 50 test failures in the regression tests. I will be examining the changes to see if they are really regressions, or just differences, or possibly improvements. RetDec does build successfully with this Capstone. I had also experimented with some of the Capstone release candidates (5.0-rc1 and 5.0-rc2, as well as the head of the "next" branch, where current development is being done) but retdec had compilation errors using those versions. |
@PeterMatula, as an alternative to doing a full upgrade to the 4.0.2 release, I experimented with forking capstone and just cherry-picking the endbr* support onto the version of capstone that RetDec is currently using (See richardlford/capstone@a6ee24f). I then modified RetDec to use that capstone and also added code to treat the endbr* instructions as NOPs (i.e. producing no LLVM). The result passes all of the RetDec regression tests. If you don't want to upgrade to the full 4.0.2 release of capstone you could consider having your own fork of capstone and apply the endbr* support change as I did. Of course, it seems preferable to try to keep up with capstone otherwise RetDec is likely to have continuing problems with newer binaries. Thanks for your help. |
Capstone version 4.0.2 has a bug when disassembling a powerpc instruction with a signed 16-bit immediate. See capstone-engine/capstone#1746 and capstone-engine/capstone#1746 (comment). This change adds to the capstone patch to fix this problem.
Capstone version 4.0.2 has a bug when disassembling a powerpc instruction with a signed 16-bit immediate. See capstone-engine/capstone#1746 and capstone-engine/capstone#1746 (comment). RetDec PR avast/retdec#1086 moved RetDec up to version 4.0.2 of Capstone, but adds a patch to fix this bug. The fix includes not printing the immediates in hex (a later version of Capstone that fixes this problem no long prints the immediates in hex). This change fixes a test to agree with the new behavior.
@PeterMatula, I have pushed another commit that patches a bug in capstone 4.0.2. In addition to fixing the bug, the patch also now always prints the 16-bit immediates in decimal (this is the behavior in a later version of capstone that has this bug fixed). With this patch, all of the RetDec regression tests passed except one that was expecting the immediates to be printed in hex. I have submitted a separate PR (avast/retdec-regression-tests#118) to fix that test, so you will want to merge that PR after this one. Could you please review it? Thanks |
These test binaries contain endbr32 and endbr64 instructions, so this PR should be committed after RetDec has moved to Capstone 4.0.2 (avast/retdec#1086).
Thanks for this, I will definitely look at it, as I planned to update Capstone for a long time but didn't get around to it myself. I'm not against updating Capstone even if it causes some tests to fail - we will either fix them if possible or disable them. It is not good to remain in a state of the old Capstone version just for the sake of tests. Some time ago I also experimented with Capstone v5 and there is also #1059 PR, but so far I didn't manage to get the tests to some reasonable state with the new Capstone. I would prefer to use the latest Capstone possible, but I'm not sure what is the state of Capstone project, if the |
@PeterMatula, the 4.0.2 version is the latest And the 4.0.2 version supports the endbr32 and endbr64 instructions. This is essential to handle some newer binaries. So I'd appreciate it if you would go ahead and merge this. In studying Capstone's use of the llvm-mc infrastructure, it occurred to me that RetDec, with some effort, might be able to eliminate the use of Capstone and just use LLVM's decoder directly, i.e. it McInsts. But that would require further study. But LLVM is very active and more likely to have up-to-date instructions. |
…ncorrect range exception message
Calls to dynamically-linked functions go through the procedure linkage table (PLT). RetDec turns a PLT entry into a function, say malloc@plt, that appears to do nothing but call the external function, say malloc (though the assembly code will do a jump rather than a call). User code that logically wants to call malloc instead calls malloc@plt (and sets up arguments as if calling malloc). The malloc@plt code first jumps to the dynamic linker which modifies it so that subsequent calls to malloc@plt will jump directly to malloc. We say that malloc@plt wraps malloc. The call to malloc in malloc@plt will not have any arguments setup, so malloc will appear to have no parameters or returns (unless that information is provided by link-time-information, debug information, or name demangling), but it needs to have the same parameter types and return type as malloc@plt. The propagateWrapped methods copy the argument information from the DataFlowEntry of the wrapping function to the wrapped function. Then, when the calls to the wrapping function are inlined (in connectWrappers), effectively the call to the wrapping function is changed into a call to the wrapped function. The motivation for this change is the programs that analyze the output of RetDec (either the C code, or the LLVM code) want to recognize library functions and treat them specially. This change makes it so that the library function names are used directly (rather than the plt version) and they are passed their parameters correctly.
As Capstone was updated, the fix in capstone-engine/capstone#968 took effect and the original RetDec fix is not needed - in fact, it caused problems.
@richardlford thanks a lot for this, you did what I was putting off for a long time. Especially that Capstone patch is not an easy thing to figure out. I did one more change in 2771f85 as this code was causing problems once the issue in Capstone was fixed. |
Capstone version 4.0.2 has a bug when disassembling a powerpc instruction with a signed 16-bit immediate. See capstone-engine/capstone#1746 and capstone-engine/capstone#1746 (comment). RetDec PR avast/retdec#1086 moved RetDec up to version 4.0.2 of Capstone, but adds a patch to fix this bug. The fix includes not printing the immediates in hex (a later version of Capstone that fixes this problem no long prints the immediates in hex). This change fixes a test to agree with the new behavior.
Damn, by accident I did Rebase and merge which was not the best option for this PR. Hope you don't mind. I can try to revert it and do it again, but I'm afraid that would only further complicate things. |
These test binaries contain endbr32 and endbr64 instructions, so this PR should be committed after RetDec has moved to Capstone 4.0.2 (avast/retdec#1086).
@PeterMatula, No problem. You can leave it as it is. |
This is the latest released version of Capstone and the first Capstone release to support the endbr32 and endbr64 instructions. I was trying to add some new tests for RetDec, but although I can use an option to avoid producing the endbr instructions, the library has them so they end up in the produced binary. Thus if we want to add any new tests (which I do), we need to update to a newer Capstone.