polkavm-linker: Swap origin/target in ADD32/SUB32 #260
The relocation for the ADD32/SUB32 pair has origin and target in the wrong order. Swap origin and target. Signed-off-by: Jarkko Sakkinen <[email protected]>
Is this due to the fix (candidate) or something else:
Closes: #247
[(_, Kind::Mut(MutOp::Add, RelocationSize::U32, target_1)), (_, Kind::Mut(MutOp::Sub, RelocationSize::U32, target_2))] => {
    relocations.insert(
        current_location,
        RelocationKind::Offset {
            origin: *target_1,
            target: *target_2,
            size: SizeRelocationSize::Generic(RelocationSize::U32),
        },
    );
    continue;
}
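For readers unfamiliar with the match arm above, here is a minimal, self-contained sketch of the pairing step. The types are simplified stand-ins reconstructed from the names in the diff (`Kind`, `MutOp`, `RelocationSize`); `Symbol` and `find_add_sub_pair` are hypothetical, and the real polkavm-linker definitions most likely differ. The sketch only extracts the two symbols and deliberately does not decide which one becomes `origin` and which becomes `target`, since that is the question under review.

```rust
// Simplified stand-ins for the linker's types; assumptions, not the real definitions.
#[derive(Clone, Copy, Debug, PartialEq)]
enum MutOp {
    Add,
    Sub,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum RelocationSize {
    U32,
}

// Hypothetical symbol handle used only for this illustration.
type Symbol = u32;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Mut(MutOp, RelocationSize, Symbol),
}

/// Returns `(added, subtracted)` when the relocations attached to a single
/// location are exactly an ADD32 followed by a SUB32, and `None` otherwise.
fn find_add_sub_pair(relocs: &[(u64, Kind)]) -> Option<(Symbol, Symbol)> {
    match relocs {
        [(_, Kind::Mut(MutOp::Add, RelocationSize::U32, added)), (_, Kind::Mut(MutOp::Sub, RelocationSize::U32, subtracted))] => {
            Some((*added, *subtracted))
        }
        _ => None,
    }
}
```

The slice pattern only matches a two-element list in ADD-then-SUB order, so a lone ADD32 or a reversed pair falls through to `None`.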
Logically this isn't correct, so it's most likely not a proper fix (as evidenced by the failing tests).
We have RelocationKind::Offset { origin, target, ... }, so in origin we should have the base from which the offset is calculated, and in target we should have the destination, so logically this should match a relocation that looks like: offset = target - origin. However what you're doing here is the other way around: offset = origin - target.
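The convention described in this comment can be sketched as follows. `Offset`, `value`, and `swapped` are hypothetical names for illustration and are not the linker's actual types; the addresses in the usage note are made up. The point is only that exchanging the two fields negates the computed offset.

```rust
/// Hypothetical, simplified model of an offset relocation: the stored value
/// is the distance from `origin` to `target`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Offset {
    origin: u32,
    target: u32,
}

impl Offset {
    /// offset = target - origin, using wrapping 32-bit arithmetic.
    fn value(self) -> u32 {
        self.target.wrapping_sub(self.origin)
    }

    /// The same relocation with `origin` and `target` exchanged.
    fn swapped(self) -> Offset {
        Offset {
            origin: self.target,
            target: self.origin,
        }
    }
}
```

With `origin = 0x1000` and `target = 0x1010`, `value()` is 16, while `swapped().value()` reinterpreted as `i32` is -16: same magnitude, opposite sign.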
Logically this isn't correct, so it's most likely not a proper fix (as evidenced by the failing tests).
@koute, a snippet from your comment in the original issue:
0: R_RISCV_ADD32 .Lanon.cb76ee1d57f0804fce1a80e99f7c73f1.1
0: R_RISCV_SUB32 .Lswitch.table._ZN57_$LT$program_for_bug..Foo$u20$as$u20$core..fmt..Debug$GT$3fmt17h48080cab75666063E.1.rel
This is what the spec says:
It can be seen that the correct computation is (+*target_1) + (-*target_2) = *target_1 - *target_2.
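As a worked illustration of that computation, the sketch below applies the two relocations in sequence to one 32-bit word. It assumes the psABI formulas word32 = V + S + A for R_RISCV_ADD32 and word32 = V - S - A for R_RISCV_SUB32, where V is the value already stored at the location, S is the symbol address, and A is the addend. The function names are hypothetical and the addresses in the usage note are made up.

```rust
/// R_RISCV_ADD32 as assumed here: word32 = V + S + A.
fn apply_add32(v: u32, s: u32, a: i32) -> u32 {
    v.wrapping_add(s).wrapping_add(a as u32)
}

/// R_RISCV_SUB32 as assumed here: word32 = V - S - A.
fn apply_sub32(v: u32, s: u32, a: i32) -> u32 {
    v.wrapping_sub(s).wrapping_sub(a as u32)
}

/// Applies an ADD32/SUB32 pair (both with a zero addend) to the same word.
/// The result is `initial + add_symbol - sub_symbol`, modulo 2^32.
fn apply_pair(initial: u32, add_symbol: u32, sub_symbol: u32) -> u32 {
    let after_add = apply_add32(initial, add_symbol, 0);
    apply_sub32(after_add, sub_symbol, 0)
}
```

For a zero-initialized word, `apply_pair(0, 0x2000, 0x1ff0)` yields 0x10, and exchanging the two symbols yields the same magnitude with the opposite sign.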
Please review and point out if there is a step that went wrong.
The psABI specification that I used as my main reference is available here:
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/download/draft-20240829-13bfa9f54634cb60d86b9b333e109f077805b4b3/riscv-abi.pdf
EDIT: I made this as clean and transparent as possible for a more convenient review experience.
I can comment on my logic, but I cannot comment on the evidence, as you did not explicitly point it out. That said, I assume that we are talking about the doom test.
I'll look up the doom example next for comparison.
"Okay, so the offset is correct-ish, but it has the wrong sign."
A comment by @koute in: #247 (comment)
I'd also need a comment on whether there is "a test" or "tests" (plural) failing. The CI run shows a single failing test, but your comment says otherwise.
@koute thanks for the feedback :-) I now have new ideas on how to approach this! All good.
#![no_std]
#![no_main]

extern crate core;

use core::fmt::Write;
use polkavm_derive::polkavm_export;

pub enum Foo {
    Success,
    CalleeTrapped,
    Unknown,
}

impl ::core::fmt::Debug for Foo {
    #[inline]
    fn fmt(&self, f: &mut ::core::fmt::Formatter) -> ::core::fmt::Result {
        ::core::fmt::Formatter::write_str(
            f,
            match self {
                Foo::Success => "Success",
                Foo::CalleeTrapped => "CalleeTrapped",
                Foo::Unknown => "Unknown",
            },
        )
    }
}

struct Writer;
impl core::fmt::Write for Writer {
    fn write_str(&mut self, s: &str) -> core::fmt::Result {
        unsafe {
            crate::debug_message(s.as_ptr(), s.len() as u32);
        }
        Ok(())
    }
}

#[polkavm_derive::polkavm_import]
extern "C" {
    pub fn debug_message(str_ptr: *const u8, str_len: u32);
}

#[polkavm_export(abi = polkavm_derive::default_abi)]
pub fn deploy() {
    let mut m = Writer {};
    let _ = write!(&mut m, "{:?}", Foo::Success);
}

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    unsafe {
        core::arch::asm!("unimp");
        core::hint::unreachable_unchecked();
    }
}
This isn't the proper way to regression test this.
There's no guarantee that the compiler will actually emit the code which triggers the issue (and the issue very much depends on how exactly the compiler will compile this), unless we freeze the compiler version and the flags we use, which we don't want to do.
The best way to do this would probably be something like this:
- Compile the program.
- Disassemble it.
- Strip down the assembly to the bare minimum.
- Add the test as an assembly source code and reassemble it to produce a binary. (It's fine to commit the binary to not require everyone to install RISC-V assembler, but the blob should be reproducible for those who want to rebuild it.)
This would also allow the removal of the parts of the code which are unrelated to the core issue (e.g. the debug_message call is unnecessary and should be removed, etc.)
What about using global_asm!() for wrapping it up? It's fairly robust
What about using global_asm!() for wrapping it up? It's fairly robust
That would also be acceptable, if it can be made to work.
(The major point here is that you need very specific relocations to be emitted to reproduce this bug, so it's not just an issue of emitting the right assembly but also getting the relocations right; you need to force the relocations to be emitted and not be mangled by the linker, etc.; no idea if that's doable with global_asm!)
I think it should work even in that case. It supports all of the shenanigans that the assembler provides, i.e. global_asm!(include_str!("test.S")) can be done and it should compile.
So I think this involves:
- Emit assembly: RUSTFLAGS="--emit asm" cargo build
- Post-edit assembly to something reasonable.
- A minimal wrapper and global_asm!(include_str!("test.S")).
The second alternative:
- "
- "
- Use https://docs.rs/cc/latest/cc/ to compile assembly in build.rs of crates/polkavm.
This is almost what you suggested but has the benefit that we could edit the assembly code if we ever want to. I'll just try out which option is the leanest for us (I honestly don't know before trying them out).
Converted to draft until the first flush of issues has been fixed. Thank you for the reviews. I'll adopt this habit given the feedback from @athei for polkavm-test-data.
Signed-off-by: Jarkko Sakkinen <[email protected]>
I wrote a Python script that discovers and computes table relocations: https://gist.github.com/jarkkojs/1f64ab5b1c92deec7d75b23504f7d890 For the binary stored in test-data:
I'll check if these match the Rust code next. It's a good reference model (also to update).