Possible race condition when trap in spawn thread (using WASI threads) #1909

Closed
Tracked by #1790
eloparco opened this issue Jan 23, 2023 · 4 comments

Comments

@eloparco
Contributor

To reproduce the problem, follow the description here #1869 (comment).

@loganek loganek mentioned this issue Jan 23, 2023
19 tasks
@loganek
Collaborator

loganek commented Jan 23, 2023

As this issue is difficult to reproduce, I'd suggest compiling the runtime with sanitizers to see if we can get some info from that.

@hritikgupta
Contributor

hritikgupta commented Jan 24, 2023

I wasn't able to reproduce the issue, but I tried compiling the runtime with two sanitizer modes, address and thread, as suggested.

With address, it gives the error below (the summary apparently points to a line in another file):

==47563==ERROR: AddressSanitizer: SEGV on unknown address 0x68a20000689f (pc 0x000104142cf0 bp 0x00016bccabf0 sp 0x00016bccabb0 T0)
==47563==The signal is caused by a UNKNOWN memory access.
    #0 0x104142cf0 in bh_list_elem_next bh_list.c:93
    #1 0x1041a5cec in wasm_cluster_wait_for_all_except_self thread_manager.c:965
    #2 0x104156580 in wasm_application_execute_main wasm_application.c:218
    #3 0x10413a4d4 in app_instance_main main.c:87
    #4 0x1041394f4 in main main.c:660
    #5 0x104555088 in start+0x204 (dyld:arm64e+0x5088)

==47563==Register values:
 x[0] = 0x000068a20000689f   x[1] = 0x0000000000000000   x[2] = 0x0000000000008000   x[3] = 0x0000000000000000  
 x[4] = 0x00000001041a90d4   x[5] = 0x000000016bcca900   x[6] = 0x000000016bcca900   x[7] = 0x0000000000000000  
 x[8] = 0x000068a20000689f   x[9] = 0x0000000000000002  x[10] = 0x0000000000000000  x[11] = 0x0000000000000002  
x[12] = 0x0000000000000002  x[13] = 0x0000000000000000  x[14] = 0x0000000000000000  x[15] = 0x0000000000000000  
x[16] = 0x0000000180dc2d4c  x[17] = 0x00000001049c0740  x[18] = 0x0000000000000000  x[19] = 0x000000016bccb100  
x[20] = 0x0000000104137a84  x[21] = 0x00000001045b0070  x[22] = 0x0000000000000000  x[23] = 0x0000000000000000  
x[24] = 0x0000000000000000  x[25] = 0x0000000000000000  x[26] = 0x0000000000000000  x[27] = 0x0000000000000000  
x[28] = 0x0000000000000000     fp = 0x000000016bccabf0     lr = 0x00000001041a4c54     sp = 0x000000016bccabb0  
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV bh_list.c:93 in bh_list_elem_next
==47563==ABORTING
zsh: abort      ./iwasm -v=5 wasm-apps/thread_termination.wasm

The above error was also intermittent.

With thread, on the other hand, the sanitizer didn't give any warning or error.
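The trace above faults in `bh_list_elem_next` while `wasm_cluster_wait_for_all_except_self` walks a list, which is the kind of crash you would expect if a node were unlinked or freed by another thread during the walk. That is only a guess from the trace, not a confirmed diagnosis. As a minimal sketch of one way to avoid that class of bug, the waiter below detaches each node under a lock before joining, so it never dereferences a `next` pointer outside the lock. All names here (`worker_list`, `worker_node`, `worker_list_wait_all`, and so on) are hypothetical stand-ins and are not WAMR's `bh_list` or thread manager types.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; NOT WAMR's bh_list or cluster types. */
typedef struct worker_node {
    struct worker_node *next;
    pthread_t tid;
} worker_node;

typedef struct worker_list {
    pthread_mutex_t lock;
    worker_node *head;
    int finished; /* incremented by each worker while holding the lock */
} worker_list;

static int worker_list_init(worker_list *list)
{
    list->head = NULL;
    list->finished = 0;
    return pthread_mutex_init(&list->lock, NULL);
}

static void *worker_main(void *arg)
{
    worker_list *list = arg;
    pthread_mutex_lock(&list->lock);
    list->finished++;
    pthread_mutex_unlock(&list->lock);
    return NULL;
}

/* Spawn a worker and link its node into the list. Returns 0 on success. */
static int worker_list_spawn(worker_list *list)
{
    worker_node *node = malloc(sizeof(*node));
    if (!node)
        return -1;
    /* Hold the lock across create + link so the node is never half-visible. */
    pthread_mutex_lock(&list->lock);
    if (pthread_create(&node->tid, NULL, worker_main, list) != 0) {
        pthread_mutex_unlock(&list->lock);
        free(node);
        return -1;
    }
    node->next = list->head;
    list->head = node;
    pthread_mutex_unlock(&list->lock);
    return 0;
}

/* Detach the head node under the lock; the caller then owns it exclusively. */
static worker_node *worker_list_pop(worker_list *list)
{
    pthread_mutex_lock(&list->lock);
    worker_node *node = list->head;
    if (node)
        list->head = node->next;
    pthread_mutex_unlock(&list->lock);
    return node;
}

/* Join every worker and return how many were joined. The shared list is only
 * read inside worker_list_pop, so no stale next pointer is ever followed. */
static int worker_list_wait_all(worker_list *list)
{
    int joined = 0;
    worker_node *node;
    while ((node = worker_list_pop(list)) != NULL) {
        pthread_join(node->tid, NULL); /* join outside the lock */
        free(node);
        joined++;
    }
    return joined;
}
```

The design choice is that only the waiter unlinks and frees nodes, so there is a single owner for each node at any time. A runtime where exiting threads remove themselves from the list would need a different scheme, for example reference counting or re-reading the head under the lock after every join.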

@loganek
Collaborator

loganek commented Jan 24, 2023

I wasn't able to reproduce the issue reported by @eloparco, but I'm regularly able to reproduce a different one:

iwasm: ../nptl/pthread_mutex_lock.c:81: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

It might be related, as it crashes in atomic.notify. I'm attaching a coredump and the executable for analysis.
assertion.zip
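An assertion like this inside `pthread_mutex_lock` generally means the mutex's internal state is inconsistent by the time the lock is taken, for example because it was destroyed, re-initialized, or overwritten while in use. Which of these applies here is not established in this thread. One generic way to surface mutex misuse earlier is to create the mutex with the POSIX `PTHREAD_MUTEX_ERRORCHECK` type, so that misuse returns an error code instead of silently corrupting state. The sketch below assumes a hypothetical helper `errorcheck_mutex_init`; it does not reflect how WAMR initializes its locks.

```c
#define _XOPEN_SOURCE 700 /* exposes PTHREAD_MUTEX_ERRORCHECK under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialize a mutex that reports misuse via return codes.
 * Returns 0 on success or a pthread error number on failure. */
static int errorcheck_mutex_init(pthread_mutex_t *mutex)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Demonstrates the two misuse cases an error-checking mutex reports.
 * Returns 0 if every call behaved as POSIX specifies, -1 otherwise. */
static int errorcheck_mutex_demo(void)
{
    pthread_mutex_t mutex;
    int ok = 1;

    if (errorcheck_mutex_init(&mutex) != 0)
        return -1;

    ok &= (pthread_mutex_lock(&mutex) == 0);
    /* Relocking from the owning thread is reported instead of deadlocking. */
    ok &= (pthread_mutex_lock(&mutex) == EDEADLK);
    ok &= (pthread_mutex_unlock(&mutex) == 0);
    /* Unlocking a mutex the caller does not hold is reported as EPERM. */
    ok &= (pthread_mutex_unlock(&mutex) == EPERM);

    pthread_mutex_destroy(&mutex);
    return ok ? 0 : -1;
}
```

Checking the return value of every lock and unlock call in a debug build then turns a late assertion in libc into an error at the call site that misused the mutex. It does not detect a mutex whose memory was overwritten by unrelated code; AddressSanitizer is the better tool for that case.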

@eloparco
Contributor Author

Fixed by multiple thread-safety commits on the main branch.
