Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

510 - Assert-fail if MPI functions are accessed inside scheduled callbacks from within VT #792

Merged
merged 23 commits into from
Jun 2, 2020

Conversation

pnstickne
Copy link
Contributor

@pnstickne pnstickne commented Apr 28, 2020

When running in a non-release build, MPI access guard will be enabled by default.

These access guards utilize PMPI (ie. exposing MPI symbols first from VT) to intercept MPI calls.

If enabled, a call to [almost] any MPI_* function from a scheduler-executed handler will trigger a vtAssert. (Various MPI calls, such as WTime, Test, Probe, Get_count are excluded from checks for simplification of usages and benign or non-solo nature.)

Code inside VT explicitly grants itself scoped access to MPI functions via RAII hidden behind a conditional macro.

As an alternative/extension to the scope access, could also expose internal vt_MPI_xyz wrappers that would bypass (or disable the guard internally) and use those throughout VT. Would be trivial macro to MPI_xyz when disabled. It also might be possible to do similar with const auto vt_MPI_xyz = MPI_xyz; and rely on an optimizing compiler to elide the proxy call in release build (not guarded) cases.

Fixes #510

@pnstickne pnstickne force-pushed the 510-mpi-intercepts branch 3 times, most recently from 518333a to 85c15aa Compare April 28, 2020 08:40
@pnstickne pnstickne changed the title 510 - Assert-fail is MPI functions are accessed inside "user code" from within VT 510 Assert-fail if MPI functions are accessed inside "user code" from within VT Apr 28, 2020
@pnstickne pnstickne changed the title 510 Assert-fail if MPI functions are accessed inside "user code" from within VT 510 - Assert-fail if MPI functions are accessed inside scheduled callbacks from within VT Apr 28, 2020
@codecov
Copy link

codecov bot commented Apr 28, 2020

Codecov Report

Merging #792 into develop will increase coverage by 0.04%.
The diff coverage is 96.42%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #792      +/-   ##
===========================================
+ Coverage    80.02%   80.07%   +0.04%     
===========================================
  Files          342      343       +1     
  Lines        10696    10724      +28     
===========================================
+ Hits          8560     8587      +27     
- Misses        2136     2137       +1     
Impacted Files Coverage Δ
tests/unit/runtime/test_mpi_access_guards.cc 96.42% <96.42%> (ø)

@pnstickne pnstickne force-pushed the 510-mpi-intercepts branch from 85c15aa to f0dd2e0 Compare April 28, 2020 17:42
@pnstickne pnstickne marked this pull request as ready for review April 28, 2020 17:45
@PhilMiller
Copy link
Member

Looks like this is headed in the right direction. Could we get a negative test added to ensure that the guards actually catch an undesired call?

@pnstickne

This comment has been minimized.

@pnstickne
Copy link
Contributor Author

Looks like this is headed in the right direction. Could we get a negative test added to ensure that the guards actually catch an undesired call?

I've been thinking about how to handle such - there are several things that should test 'a vtAssert' occurred, although this is uhh, hard to catch in a test. It would also have to handle the process dying in gtest as state is all dead.

Guess that's 'ASSERT_DEATH' :|

@pnstickne pnstickne force-pushed the 510-mpi-intercepts branch from f0dd2e0 to 82a5a39 Compare May 12, 2020 05:20
@PhilMiller
Copy link
Member

I've rebased this and integrated/merged it with the PR that uses the safe MPI collectives.

It looks like you lost some of Paul's changes - there are things he's commented as done, that are back to what I'd remarked on as needing to change

- The casing of the value can differ between environments.
  (Despite documentation claims of allowed values.)
@pnstickne pnstickne force-pushed the 510-mpi-intercepts branch from 72a269e to 10bfe63 Compare May 30, 2020 23:13
pnstickne added 2 commits May 30, 2020 17:16
- All symbols now included to prevent breaking CMake/gtest regex scanning.

- Using MPI_Test for more reliable failure detection.

- Includes when access is explicitly granted in positive case.
@pnstickne pnstickne force-pushed the 510-mpi-intercepts branch from 457fca5 to d818865 Compare May 31, 2020 00:22
- Excluded from generation by request.
  Also added MPI_Wticks for parity with MPI_Wtime.

- Guard exclusions can be specified via regular expression patterns.
@pnstickne pnstickne force-pushed the 510-mpi-intercepts branch from 83e728f to 11c2fe3 Compare May 31, 2020 00:38
- Not needed, got re-included.

  The code in these handlers is expected to run in a context where
  MPI is used.
@lifflander
Copy link
Collaborator

I've generated and pushed a full list of 3.1 that includes the missing calls from before.

@lifflander lifflander force-pushed the 510-mpi-intercepts branch from bc7ea01 to 64faf90 Compare June 2, 2020 02:50
Copy link
Collaborator

Codacy Here is an overview of what got changed by this pull request:

Clones added
============
- src/vt/messaging/active.cc  2
         

See the complete overview on Codacy

@lifflander lifflander merged commit 070e761 into develop Jun 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use PMPI to detect application code calling MPI under VT scheduler
3 participants