#1480: Treat pending asynchronous operations like message in flight to prevent TD hang detection false positives #1517

Merged: PhilMiller merged 5 commits into develop from 1480-asyncop-td-hang on Aug 11, 2021

PhilMiller (Member) commented Aug 10, 2021

Fixes #1480

PhilMiller requested a review from lifflander on August 10, 2021 19:02

github-actions bot commented Aug 10, 2021

PR tests (clang-3.9, ubuntu, mpich)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (gcc-5, ubuntu, mpich)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (gcc-6, ubuntu, mpich)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (gcc-8, ubuntu, mpich, address sanitizer)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (gcc-9, ubuntu, mpich, zoltan)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (gcc-10, ubuntu, openmpi, no LB)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (clang-5.0, ubuntu, mpich)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (intel 18.03, ubuntu, mpich)

Build for 16a5279

Build failed for unknown reason. Check build logs


github-actions bot commented Aug 10, 2021

PR tests (intel 19, ubuntu, mpich)

Build for 16a5279

Build failed for unknown reason. Check build logs


github-actions bot commented Aug 10, 2021

PR tests (clang-10, ubuntu, mpich)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (clang-9, ubuntu, mpich)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (nvidia cuda 11.0, ubuntu, mpich)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (nvidia cuda 10.1, ubuntu, mpich)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (gcc-7, ubuntu, mpich, trace runtime, LB)

Build for 16a5279

Compilation - successful

Testing - passed

github-actions bot commented Aug 10, 2021

PR tests (clang-10, alpine, mpich)

Build for 16a5279



The following tests FAILED:
  434 - vt:TestMpiAccessGuardDeathTest.test_mpi_access_prevented_proc_2 (Failed)

PhilMiller force-pushed the 1480-asyncop-td-hang branch from a7144a5 to d2789c8 on August 10, 2021 22:14
PhilMiller force-pushed the 1480-asyncop-td-hang branch from d2789c8 to 37f9b67 on August 10, 2021 22:17

codecov bot commented Aug 10, 2021

Codecov Report

Merging #1517 (16a5279) into develop (5318749) will decrease coverage by 0.11%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #1517      +/-   ##
===========================================
- Coverage    82.88%   82.77%   -0.12%     
===========================================
  Files          780      780              
  Lines        29380    29394      +14     
===========================================
- Hits         24353    24331      -22     
- Misses        5027     5063      +36     
| Impacted Files | Coverage Δ |
|---|---|
| src/vt/termination/termination.h | 100.00% <ø> (ø) |
| src/vt/messaging/async_op.cc | 66.66% <100.00%> (ø) |
| src/vt/termination/termination.cc | 71.58% <100.00%> (+0.76%) ⬆️ |
| src/vt/vrt/collection/send/sendable.impl.h | 63.15% <0.00%> (-36.85%) ⬇️ |
| src/vt/vrt/collection/manager.h | 88.00% <0.00%> (-12.00%) ⬇️ |
| src/vt/topos/location/location.impl.h | 90.49% <0.00%> (-3.69%) ⬇️ |
| src/vt/vrt/collection/manager.impl.h | 94.14% <0.00%> (-1.27%) ⬇️ |

PhilMiller (Member, Author) commented

The test failure on clang-10, alpine, mpich was #1521, which is unrelated to this change.

PhilMiller merged commit 93e95a6 into develop on Aug 11, 2021

Successfully merging this pull request may close these issues.

AsyncOpCuda can produce TD hang false positives