Consistent Molecule Trajectories across Checkpoints and Checkpoint is performed using Boost Serialization Library #351

Merged
merged 18 commits into from
May 22, 2021


GregorySchwing (Collaborator) commented May 19, 2021

There are 2 changes in this PR:

  1. We added the original indices of each molecule into the original trajectory frame, so that as we permute the order of the underlying molecules (through outputting box 0 and box 1 pdb/psf files and then reloading them) we maintain consistent ordering. This will be important for diffusion calculations and single molecule tracking.

  2. Checkpoint was originally dumping raw struct values in binary format. While we did check for endianness, this is unnecessarily low level. Boost is the most downloaded C++ library (~7k downloads a week), and we believe including it in GOMC is an appropriate choice for abstracting away serialization. This will make cross-platform checkpointing effortless and make the checkpointing process more readable to future developers.
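To illustrate the first change, here is a minimal sketch of how a stored original index lets a permuted set of molecules be put back into a consistent order. It is not GOMC's actual code: the `Molecule` struct, its `originalIndex` field, and `restoreOriginalOrder` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record (not a GOMC type): one molecule's position plus the
// index it had in the original trajectory frame.
struct Molecule {
    std::size_t originalIndex;  // assumed field: slot in the original frame
    double x, y, z;             // a single representative coordinate
};

// Rebuild the original ordering: slot i of the result holds the molecule
// whose originalIndex is i, however the input was permuted.
std::vector<Molecule> restoreOriginalOrder(const std::vector<Molecule>& permuted) {
    std::vector<Molecule> ordered(permuted.size());
    std::vector<bool> seen(permuted.size(), false);
    for (const Molecule& m : permuted) {
        // Each original index must be in range and appear exactly once.
        assert(m.originalIndex < permuted.size() && !seen[m.originalIndex]);
        seen[m.originalIndex] = true;
        ordered[m.originalIndex] = m;
    }
    return ordered;
}
```

With the index carried alongside each molecule, a frame written after a box 0 / box 1 reload can be mapped back to the same per-molecule slots, which is what per-molecule analyses such as diffusion calculations rely on.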

  • A Google Test was written to compare the coordinates of each atom in the last trajectory frame written against the restart PDB files, to verify that we are writing our trajectories correctly. Each atom was given a unique chain ID and then compared across the two files (PDB Traj and PDB Restart). Since I am having trouble writing a DCD Recalculate Trajectory module, I couldn't write a GTest for the DCD Trajectory, but since I follow the same process for writing DCD trajectories as PDB trajectories, I am extremely confident the DCD Trajectories are also correct.

  • The Boost library is automatically downloaded if it is not found during the build process. I didn't have any problems simply calling "./metamake.sh", but there is a possibility that metamake may need to be called twice: once to download Boost, and then, after removing the bin directory to eliminate the CMakeCache.txt file, again to build with Boost found.

  • Currently Boost is using a text-archive-based format, since it is machine-independent. It is trivial to change it to a binary format, but we would lose machine independence in exchange for a 2x speedup. Checkpoint is often called only a few times during a simulation, and I don't think the speedup is worth the machine dependence. We could also make it a config file option. This is a judgement call that @jpotoff could perhaps weigh in on.
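The trade-off in the last point can be sketched with the standard library alone. This is not the Boost-based code in this PR: `CheckpointState`, `saveAsText`, and `loadFromText` are hypothetical names, and the sketch assumes all values are finite. It only shows why a text format is machine-independent: numbers are written as decimal text rather than as raw bytes, so endianness and struct layout never enter the file.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical checkpoint payload (not GOMC's actual checkpoint layout).
struct CheckpointState {
    unsigned long step;
    std::vector<double> coords;
};

// Write the state as whitespace-separated decimal text. max_digits10
// significant digits are enough for a double to survive a text round trip.
std::string saveAsText(const CheckpointState& s) {
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10);
    out << s.step << ' ' << s.coords.size();
    for (double c : s.coords) out << ' ' << c;
    return out.str();
}

// Read the state back from the text produced by saveAsText.
CheckpointState loadFromText(const std::string& text) {
    std::istringstream in(text);
    CheckpointState s{};
    std::size_t n = 0;
    in >> s.step >> n;
    s.coords.resize(n);
    for (double& c : s.coords) in >> c;
    assert(!in.fail());  // every field must have parsed
    return s;
}
```

A raw binary dump of the same struct would skip the formatting and parsing work, which is where the speedup comes from, but the resulting bytes would depend on the byte order and type sizes of the machine that wrote them.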
