Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1490 Write new collection restore from file in place #1491

Merged
merged 9 commits into from
Aug 2, 2021

Conversation

lifflander
Copy link
Collaborator

Fixes #1490

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (intel 18.03, ubuntu, mpich)

Build for 5510c0a

Build failed for unknown reason. Check build logs


Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (intel 19, ubuntu, mpich)

Build for 5510c0a

Build failed for unknown reason. Check build logs


Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (clang-5.0, ubuntu, mpich)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (gcc-8, ubuntu, mpich, address sanitizer)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (gcc-5, ubuntu, mpich)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (gcc-6, ubuntu, mpich)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (clang-3.9, ubuntu, mpich)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (gcc-9, ubuntu, mpich, zoltan)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (gcc-10, ubuntu, openmpi, no LB)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (nvidia cuda 10.1, ubuntu, mpich)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (clang-10, ubuntu, mpich)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (nvidia cuda 11.0, ubuntu, mpich)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@codecov
Copy link

codecov bot commented Jun 30, 2021

Codecov Report

Merging #1491 (9f45a9e) into develop (18e615e) will increase coverage by 0.08%.
The diff coverage is 99.28%.

❗ Current head 9f45a9e differs from pull request most recent head 5510c0a. Consider uploading reports for the commit 5510c0a to get more accurate results
Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1491      +/-   ##
===========================================
+ Coverage    82.76%   82.84%   +0.08%     
===========================================
  Files          780      780              
  Lines        29219    29377     +158     
===========================================
+ Hits         24183    24338     +155     
- Misses        5036     5039       +3     
Impacted Files Coverage Δ
src/vt/vrt/collection/manager.impl.h 95.40% <97.77%> (+0.65%) ⬆️
src/vt/vrt/collection/manager.h 100.00% <100.00%> (ø)
tests/unit/collection/test_checkpoint.extended.cc 100.00% <100.00%> (ø)
src/vt/topos/location/location.impl.h 90.49% <0.00%> (-3.69%) ⬇️
src/vt/vrt/collection/send/sendable.impl.h 100.00% <0.00%> (+29.41%) ⬆️

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (gcc-7, ubuntu, mpich, trace runtime, LB)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (clang-9, ubuntu, mpich)

Build for 5510c0a

Compilation - successful

Testing - passed

Build log

@github-actions
Copy link

github-actions bot commented Jun 30, 2021

PR tests (clang-10, alpine, mpich)

Build for 5510c0a



The following tests FAILED:
  434 - vt:TestMpiAccessGuardDeathTest.test_mpi_access_prevented_proc_2 (Failed)

Build log

@@ -223,7 +223,67 @@ TEST_F(TestCheckpoint, test_checkpoint_1) {
// Ensure that all elements were properly destroyed
EXPECT_EQ(counter, 0);
}
}

TEST_F(TestCheckpoint, test_checkpoint_in_place_2) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like this test exercises the migration feature. Can you add another test?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Copy link
Collaborator

@nlslatt nlslatt Jul 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for exercising migration. Does a test with the following sequence of events work? It's a critical application use case.

  1. Load balance
  2. Checkpoint
  3. Re-create the collection elements at home (or migrate them back home)
  4. Restore from the checkpoint

@PhilMiller PhilMiller changed the title 1490 Write new collection restore to file in place 1490 Write new collection restore from file in place Jul 1, 2021
@PhilMiller
Copy link
Member

Like the description of this pr, some of the commit messages say "restoreToFile" instead of From. The code is right, though.

Besides the comments I made, this looks good

@lifflander lifflander force-pushed the 1490-add-restore-to-file-in-place branch from 24e891f to 6afc35e Compare July 20, 2021 22:21
@lifflander lifflander force-pushed the 1490-add-restore-to-file-in-place branch from 6afc35e to 9f45a9e Compare July 28, 2021 17:07
Copy link
Collaborator

@nlslatt nlslatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@bathmatt
Copy link

@lifflander the auto merge of checkpoint and the rest happens tomorrow so if this is ready can we merge away?

@lifflander
Copy link
Collaborator Author

@lifflander the auto merge of checkpoint and the rest happens tomorrow so if this is ready can we merge away?

This is still failing tests. I'm working on it more to figure out what went wrong.

@lifflander lifflander merged commit 204a8dc into develop Aug 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement a restoreFromFileInPlace for application use case
4 participants