SIGABRT: abort crashes starting with Alpine 3.19 #4488

Closed
webmaster128 opened this issue Mar 13, 2024 · 3 comments · Fixed by #4836
Labels: bug, build-system, 📦 lib-compiler, 📦 lib-vm, priority-high, project-confio

webmaster128 commented Mar 13, 2024

Hey there!

We received multiple user reports of "SIGABRT: abort" crashes on systems using Alpine 3.19. With Alpine 3.17 or 3.18, everything is fine. I cannot easily reproduce it yet, but I would like to provide as much info as I have at this point to avoid extra work.

Describe the bug

We are seeing crashes with this stack trace when Alpine 3.19 is used:

(gdb) bt
#0  a_crash () at ./arch/x86_64/atomic_arch.h:108
#1  abort () at src/exit/abort.c:27
#2  0x00007fbe28db00e1 in ?? () from /usr/lib/libgcc_s.so.1
#3  0x00007fbe28dc867a in __deregister_frame () from /usr/lib/libgcc_s.so.1
#4  0x0000000002e95f78 in <wasmer_compiler::engine::unwind::systemv::UnwindRegistry as core::ops::drop::Drop>::drop ()
#5  0x0000000002c4d515 in core::ptr::drop_in_place<wasmer_compiler::engine::code_memory::CodeMemory> ()
#6  0x0000000002c4e4d2 in core::ptr::drop_in_place<std::sync::mutex::Mutex<wasmer_compiler::engine::inner::EngineInner>> ()
#7  0x0000000002c43e0d in alloc::sync::Arc<T,A>::drop_slow ()
#8  0x0000000002c4b6f8 in core::ptr::drop_in_place<wasmer::engine::Engine> ()
#9  0x000000000320ce98 in cosmwasm_vm::cache::Cache<A,S,Q>::save_wasm_unchecked ()
#10 0x0000000002c45da8 in wasmvm::cache::do_save_wasm ()

where save_wasm_unchecked just takes Wasm code, compiles it, and stores it to disk (no instance or execution).

The affected systems show that the build system Alpine 3.19 makes the difference here:
[Screenshot, 2024-03-12 18:22:02]

Looking at the stack trace, you can see that libgcc is used on the Alpine system, which is not what unsafe fn register_frames( seems to expect.

It turns out that the exact same problem is discussed and fixed in Wasmtime: bytecodealliance/wasmtime#7997

Previously this decision was static. FreeBSD and Linux glibc would assume libgcc and everything else was assumed to be libunwind. It's possible to use libgcc on other platforms, however, such as with musl.
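
To make the mismatch concrete, here is a minimal sketch of the two registration conventions. It assumes (as is commonly documented for these unwinders) that libgcc's `__register_frame` takes a pointer to the whole table and walks it up to a zero-length terminator, while libunwind's takes a pointer to one FDE at a time. The `Unwinder` enum, `registration_offsets`, and `synthetic_table` are hypothetical names for illustration only; they are not part of Wasmer or Wasmtime, and no real `__register_frame` call is made.

```rust
/// Which unwinder the process ended up with (hypothetical enum for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Unwinder {
    Libgcc,
    Libunwind,
}

/// Returns the byte offsets into `eh_frame` that would be handed to
/// `__register_frame` under each convention. Nothing is actually registered.
fn registration_offsets(eh_frame: &[u8], unwinder: Unwinder) -> Vec<usize> {
    match unwinder {
        // Assumption: libgcc receives the start of the table once and walks it itself.
        Unwinder::Libgcc => vec![0],
        // Assumption: libunwind receives each FDE individually.
        Unwinder::Libunwind => {
            let mut offsets = Vec::new();
            let mut current = 0usize;
            while current + 4 <= eh_frame.len() {
                // Each entry starts with a 4-byte length that excludes the length field.
                let mut len_bytes = [0u8; 4];
                len_bytes.copy_from_slice(&eh_frame[current..current + 4]);
                let len = u32::from_ne_bytes(len_bytes) as usize;
                // Skip the leading CIE and any zero-length terminator entry.
                if current != 0 && len != 0 {
                    offsets.push(current);
                }
                current += len + 4;
            }
            offsets
        }
    }
}

/// Builds a fake table: one 8-byte CIE, two 12-byte FDEs, and a zero-length terminator.
fn synthetic_table() -> Vec<u8> {
    let mut table = Vec::new();
    for body_len in [8u32, 12, 12] {
        table.extend_from_slice(&body_len.to_ne_bytes());
        table.extend(std::iter::repeat(0u8).take(body_len as usize));
    }
    table.extend_from_slice(&0u32.to_ne_bytes());
    table
}

fn main() {
    let table = synthetic_table();
    println!("{:?}", registration_offsets(&table, Unwinder::Libgcc)); // prints [0]
    println!("{:?}", registration_offsets(&table, Unwinder::Libunwind)); // prints [12, 28]
}
```

If code built for the per-FDE convention runs against libgcc, the pointers registered and later passed to `__deregister_frame` do not match what libgcc expects, which is consistent with the abort seen in frame #3 of the backtrace.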

Steps to reproduce

Unfortunately, I don't have a minimal reproducer yet.

Expected behavior

No crashes

Actual behavior

Crashes as above

Additional context

Wasmer 4.2.2 and 4.2.6 behave the same way


linear bot commented Mar 13, 2024

theduke added the bug, priority-medium, build-system, 📦 lib-compiler, and 📦 lib-vm labels on Mar 19, 2024
dadamu added a commit to desmos-labs/desmos that referenced this issue Apr 2, 2024
## Description

Closes: #XXXX

This PR downgrades the Alpine build environment to 3.18 to avoid the
issues with Wasmer.

References:
CosmWasm/wasmvm#523
wasmerio/wasmer#4488

## Summary by CodeRabbit


- **Chores**
- Updated the base image for the Desmos Builder to
`golang:1.20-alpine3.18` for improved stability and performance.

syrusakbary added this to the v4.3.1 milestone on May 2, 2024
syrusakbary added the priority-high label and removed the priority-medium label on Jun 6, 2024
xdoardo linked a pull request on Jun 12, 2024 that will close this issue

xdoardo commented Jun 12, 2024

Hello! I was able to reproduce the issue pointed out in bytecodealliance/wasmtime#7997, but unfortunately I was not able to reproduce this one. I tried to recreate the state outlined in save_wasm_unchecked, that is

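// Repeatedly compile a module and drop the engine, mirroring save_wasm_unchecked.
fn test() -> anyhow::Result<()> {
    // Loops until a crash (or an error) occurs.
    for i in 0.. {
        println!("{i}");
        let mut config = wasmer::Cranelift::new();
        config.canonicalize_nans(true);
        wasmer::CompilerConfig::push_middleware(
            &mut config,
            std::sync::Arc::new(wasmer_middlewares::Metering::new(0, |_| 0)),
        );
        let engine = wasmer::Engine::from(config);
        // Compile only: no instance is created and nothing is executed.
        let module = wasmer::Module::new(&engine, include_bytes!("test.wasm"))?;
        // Dropping the engine and module is what reaches the unwind-registry
        // Drop impl seen in the backtrace.
        drop(engine);
        drop(module);
    }

    Ok(())
}

fn main() -> anyhow::Result<()> {
    test()?;
    Ok(())
}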

The test function combines what's outlined in the above-mentioned issue, the save_wasm_unchecked function, and #1568. I tested with this Docker image:

FROM rust:1.75-alpine3.19
RUN apk add musl-dev gcc
COPY drop /root/drop
ENV RUSTFLAGS='-Ctarget-feature=-crt-static'
WORKDIR /root/drop
RUN cargo build --verbose

FROM rust:1.75-alpine3.19
COPY --from=0 /root/drop/target/debug/drop /bin/drop
CMD ["/bin/drop"]

After various unsuccessful iterations of the test on both aarch64 and x86_64, I simply ported the patch from wasmtime to our SysV unwind mod. If there's any more information that you can share to reproduce the crash it'd be greatly appreciated - alternatively, if you can confirm that this patch solves the issue on your side, we can proceed with our tests and get this merged.

Edit: selecting Singlepass as engine allowed me to reproduce the error and check that the patch mentioned above solves it. Nonetheless, before merging the patch and closing this issue, it'd be great if you can confirm that it solves the issue on your side as well!
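
Since the Wasmtime description quoted earlier says the libgcc/libunwind decision was previously static, a runtime check is presumably made once and then reused. The following is a minimal sketch of that caching pattern only. The `UnwinderCache` type and the injected `probe` closure are hypothetical; the actual patch may detect the unwinder differently (for example by inspecting loaded symbols), and this sketch does not reproduce it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

const UNKNOWN: usize = 0;
const YES: usize = 1;
const NO: usize = 2;

/// Caches a yes/no answer so the probe normally runs only on the first query.
/// (Hypothetical type for illustration; not Wasmer's or Wasmtime's API.)
struct UnwinderCache {
    state: AtomicUsize,
}

impl UnwinderCache {
    const fn new() -> Self {
        Self { state: AtomicUsize::new(UNKNOWN) }
    }

    /// `probe` stands in for whatever runtime check decides "is this libunwind?".
    fn using_libunwind(&self, probe: impl FnOnce() -> bool) -> bool {
        match self.state.load(Ordering::Relaxed) {
            YES => true,
            NO => false,
            _ => {
                // First query: run the probe and remember the result.
                // Concurrent first queries may probe twice, which is harmless
                // as long as the probe always returns the same answer.
                let answer = probe();
                self.state.store(if answer { YES } else { NO }, Ordering::Relaxed);
                answer
            }
        }
    }
}

fn main() {
    let cache = UnwinderCache::new();
    let mut calls = 0;
    let first = cache.using_libunwind(|| { calls += 1; false });
    // The second probe is never invoked because the answer is already cached.
    let second = cache.using_libunwind(|| { calls += 1; true });
    println!("first={first} second={second} probe_calls={calls}");
    // prints "first=false second=false probe_calls=1"
}
```

Caching matters here because registration and deregistration must agree on the convention for the lifetime of the process; deciding once avoids registering frames one way and deregistering them another.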

@syrusakbary
Member

In the end, we were able to reproduce the issue (with a loop that compiles modules many times) and then verify that the issue no longer occurs with the fix.

So this should be resolved.
