Consistent Molecule Trajectories across Checkpoints, and Checkpointing via the Boost Serialization Library #351
There are 2 changes in this PR:
We now record each molecule's index from the original trajectory frame, so that ordering stays consistent even when the underlying molecules are permuted (for example, by writing the box 0 and box 1 PDB/PSF files and then reloading them). This will be important for diffusion calculations and single-molecule tracking.
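As a minimal sketch of the idea (not GOMC's actual data structures), each molecule can carry its original index, and a frame can be put back into original order just before it is written. The names `MoleculeRecord` and `restoreOriginalOrder` are hypothetical and used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record, for illustration only; not a GOMC type.
struct MoleculeRecord {
  std::size_t originalIndex;  // position in the original trajectory frame
  double x, y, z;             // one representative coordinate per molecule
};

// Returns the molecules in original trajectory order, however the current
// frame happens to be permuted. Assumes the original indices form a complete
// permutation of 0..N-1.
std::vector<MoleculeRecord> restoreOriginalOrder(
    const std::vector<MoleculeRecord>& current) {
  std::vector<MoleculeRecord> ordered(current.size());
  for (const MoleculeRecord& rec : current) {
    assert(rec.originalIndex < ordered.size());
    ordered[rec.originalIndex] = rec;  // scatter back to the original slot
  }
  return ordered;
}
```

Scattering by stored index is linear in the number of molecules and does not depend on how many times the boxes were written out and reloaded.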
Checkpointing originally dumped raw struct values in binary format. Although we checked for endianness, this is unnecessarily low-level. Boost is the most downloaded C++ library (~7k downloads a week), and we believe including it in GOMC is an appropriate way to abstract away serialization. This will make cross-platform checkpointing effortless and make the checkpointing code more readable for future developers.
A Google Test was written to compare the coordinates of each atom in the last frame written to the PDB trajectory against the restart PDB file, verifying that we are writing our trajectories correctly. Each atom was given a unique chain ID and then matched across the two files (PDB trajectory and PDB restart). Because I am having trouble writing a DCD Recalculate Trajectory module, I couldn't write a GTest for the DCD trajectory; however, since DCD trajectories are written by the same process as PDB trajectories, I am extremely confident the DCD trajectories are also correct.
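The comparison at the heart of such a test can be sketched as follows. This is not the GTest from the PR: `Position`, `AtomTable`, and `framesMatch` are hypothetical names, parsing of the PDB files is left out, and the default tolerance of 1e-3 is an assumption based on PDB coordinates being written with three decimal places.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Hypothetical types for illustration; PDB parsing is out of scope here.
struct Position {
  double x, y, z;
};

// Keyed by the unique per-atom label (the PR uses the chain ID field).
using AtomTable = std::map<std::string, Position>;

// True when both tables contain the same labels and every coordinate
// agrees within the tolerance.
bool framesMatch(const AtomTable& trajectoryLastFrame,
                 const AtomTable& restart,
                 double tolerance = 1e-3) {
  assert(tolerance >= 0.0);
  if (trajectoryLastFrame.size() != restart.size()) return false;
  for (const auto& entry : trajectoryLastFrame) {
    const auto match = restart.find(entry.first);
    if (match == restart.end()) return false;  // atom missing from restart
    const Position& a = entry.second;
    const Position& b = match->second;
    if (std::fabs(a.x - b.x) > tolerance ||
        std::fabs(a.y - b.y) > tolerance ||
        std::fabs(a.z - b.z) > tolerance) {
      return false;
    }
  }
  return true;
}
```

Matching by label rather than by file position means the check still holds when the two files list atoms in different orders.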
The Boost library is downloaded automatically if it is not found during the build. I didn't have any problems simply calling "./metamake.sh", but metamake may need to be called twice: once to download Boost, and then, after removing the bin directory to eliminate the CMakeCache.txt file, again to build with Boost found.
Currently Boost is using a text archive format, since it is machine-independent. Switching to the binary format is trivial, but we would trade machine independence for a 2x speedup. Checkpointing is often called only a few times during a simulation, so I don't think the speedup is worth the machine dependence. We could also make it a config file option. This is a judgement call that @jpotoff could perhaps weigh in on.
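To illustrate why a text format is machine-independent while a raw binary dump is not, here is a small standard-library-only sketch. It does not use Boost; the function names are hypothetical, and the "opposite-endian" reader is simulated by reversing the bytes.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Raw dump: copies the in-memory bytes, so the layout depends on the
// byte order of the machine that wrote it.
std::array<unsigned char, 4> dumpRaw(std::uint32_t value) {
  std::array<unsigned char, 4> bytes{};
  std::memcpy(bytes.data(), &value, sizeof value);
  return bytes;
}

// Simulates loading those bytes on a machine with the opposite byte order,
// without any endianness correction.
std::uint32_t readRawOnOppositeEndian(std::array<unsigned char, 4> bytes) {
  std::reverse(bytes.begin(), bytes.end());
  std::uint32_t value = 0;
  std::memcpy(&value, bytes.data(), sizeof value);
  return value;
}

// Text dump: decimal digits mean the same thing on every machine.
std::string dumpText(std::uint32_t value) { return std::to_string(value); }

std::uint32_t readText(const std::string& text) {
  assert(!text.empty());
  return static_cast<std::uint32_t>(std::stoul(text));
}
```

With these helpers, the value 1 survives the text round trip unchanged, whereas the uncorrected raw round trip across byte orders yields 16777216 (0x01000000). A binary archive avoids the cost of formatting and parsing digits, which is where the speedup comes from.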