-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Forcefield molecular dynamics and forcefield refactor #722
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #722 +/- ##
==========================================
+ Coverage 76.64% 76.81% +0.17%
==========================================
Files 114 115 +1
Lines 8506 8723 +217
Branches 1275 1344 +69
==========================================
+ Hits 6519 6701 +182
- Misses 1600 1620 +20
- Partials 387 402 +15
|
@esoteric-ephemera great that you started this. One thing that will likely be really brittle are the trajectories. We need to find a solution for this in the taskdocs before merging this PR. This is already a problem for normal optimizations (okay, our structures are large but still). |
Firstly, thanks very much for this @esoteric-ephemera. It looks fantastic. As I just commented in #515, we have the ability to store trajectories in the datastore for VASP workflows (see materialsproject/emmet#886 for implementation details). I think we should add a similar option for the force field workflow and probably use the datastore by default since ML MD runs are likely to be much longer than AIMD. |
Thanks @JaGeo and @utf! I tried to mimic the On this point: for FF MD, one only needs the structure to reconstruct those other four quantities, so maybe storing just the structure for FF MD makes more sense (probably less data-efficient) |
One thing worth mentioning is when I trained MACE MP on MPtrj, the original file compiled by Bowen in JSON format is around 9 times bigger than extxyz. After compression the size difference is even three-fold. pymatgen trajectory has too many repetitive keys and brackets…. ASE trajectory is written in binary so it should be even more efficient than extxyz. We may want to have a separate MDTaskDoc supporting ASE Trajectory binary |
@esoteric-ephemera temperature and pressure schedule tests are working now. Not sure if we need to test all the calculators. It seems redundant to me if we only want to make sure schedule feature is working |
tests/forcefields/test_md.py
Outdated
assert all( | ||
output["from_str"].output.__getattribute__(attr) | ||
== output["from_dyn"].output.__getattribute__(attr) | ||
for attr in ("energy", "forces", "stress", "structure") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like a float comparison (except for structure
)? if so, should this use pytest.approx
(and getattr
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed this to use approx when appropriate. The intent was that two runs should be identical when run either explicitly specifying the dynamics object, or implicitly via str, but that is maybe too optimistic
I tried using getattr
, but the forcefield task doc has no defined getattr
method, hence the dunder
I can write these out explictly if that's preferable to the dunder. Or try to add a getattr
I see what you mean, replaced the __getattribute__
calls with getattr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code-wise this looks great! thanks again @esoteric-ephemera 👍
only remaining concern is test time. tests on the mlff_md
branch take 17 min vs 12 on the main branch. can we reduce the number of steps or system size in tests?
Tried to reduce the time for this by reducing the number of steps in the temperature / pressure MD tests, as well as only testing for CHGNet (MACE is 3x slow for these tests) @chiang-yuan: the temperature and pressure scheduling didn't seem right to me. When I ran the tests, the temperature schedule gave me I've modified the logic for the scheduling in a new function |
fix T/P schedule
Thanks @esoteric-ephemera !! Sorry I confuse the behavior of numpy interp and scipy interp1d. Thanks for the nice catch! I slightly changed a bit the syntax for schedule interpolation and make it seems (at least to me) explainable |
Perfect thanks! This looks much cleaner than my edit |
Hi @esoteric-ephemera. Thanks again for all the edits. Is this good to go now?
Did you have a chance to do this? |
Confirmed that the following works without issue: from emmet.core.tasks import TaskDoc
forcefield_taskdoc: ForceFieldTaskDocument = <output of some forcefield job>
vasp_style_taskdoc = TaskDoc(**forcefield_taskdoc.model_dump()) There are some optional fields that aren't populated ( For relax / static jobs, is there interest in having a |
Thanks for checking - |
Then yes, I think we're good to go! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll go ahead and merge since everyone seems happy with this.
thanks so much @esoteric-ephemera, very excited to have MLFF MD makers in atomate2
! 🎉
Fantastic, thanks everyone who contributed to this PR and especially @esoteric-ephemera and @chiang-yuan! |
Great! Short question to the ML potential crowd: wouldn't a LAMMPS interface be even more beneficial? Especially for MD? I am aware of the atomate2-LAMMPS addon but it currently does not look ready to use (and is even archived: https://github.com/Matgenix/atomate2-lammps). |
never used LAMMPS myself, I think OpenMM might be the better fit for GPU-accelerated ML potentials based on what I heard from @orionarcher |
I've found OpenMM to be MUCH easier to use than LAMMPS and seen it run much faster on GPUs. Though I don't know how many popular ML potentials have been implemented, OpenMM has a plugin for integration with Pytorch. Caveat: I have only used OpenMM for classical MD and not ML potentials |
@orionarcher thanks! From the ML community side I always hear the LAMMPS wish. Just wanted to see if there are plans or opinions. (Cc @ml-evs?) |
…ct#722) * FF MD fully implemented * change MD defaults to be NVT + Langevin dynamics * add md defaults that are as consistent with vasp implementation as possible * correct typing of langevin friction coeff * Add option to ForceFieldTaskDocument to store trajectory information as pmg object * Add tests, remove dependence on pickle for trajectory i/o * fix tests, update matgl dependence * rename MD forcefield class vars to be consistent with VASP MDSetGenerator * Add option to explicitly specify thermostat as ASE MolecularDynamics object * refactoring and add T/P schedule * implement T/P schedule and attach callback func * np.nan * np.nan * small fixes * pass ASE dynamics object instead of class, update docstring * add temp ramp test and make sure upper tria matrix cell * add stresses to trajeoctroy observer, add pressure schedule test * avoid `stresses` key not in taskdoc bug * Refactory TrajectoryObserver, add option to save trajectory as either pmg or ase trajectory objects, add test for trajectory parsing * Change decomposition to upper triangular cell only when Nose-Hoover requested * Fix failing tests * Change MD default TaskDoc ionic step to only store mandatory info * update pymatgen requirement because of CifParser deprecation * Significant refactor to all forcefield jobs. Use commmon ASE calc structure, allow loading via MontyDecoder. Add check in relaxation for force convergence, attr in Forcefield taskdoc * Fix forcefield utils test and lint * Add revert_dtype env for running forcefield relax and md jobs / undo removal * refactor nequip jobs, add md default for nequip * Add GAP and Nequip MD tests, fix arg / kwarg passing in MD * remove calculator_args / ase_calculator args, revert phonon job change * remove todo about adding magmoms to forcefield traj observer as that's now implemented * Remove comments, add option to seed rng for MB velocities, turn on ideal gas stress contribution only when MD trajectory outputs (temp / velocity) are stored * Ensure CHGNet and M3GNet relax / static makers convert stress to eV/A0**3; add tests for MD NVE ensemble and specifying MolecularDynamics object as input * fix M3GNet test related to outdated cached model * Fix dependencies, amend forcefield static option parsing * linting * try to fix ci tests caused by torch / dgl incompatibility * try to fix ci tests caused by torch / dgl incompatibility * Change: default time_step, units for time_step, initialization of temperature and velocity in TrajectoryObserver, snake case change * linting / pre-commit * revert default time_step units to fs * add missing forcefields to version check in forcefield task document * allow complex schur decomposition * more snake case * Decrease number of steps in tests for speed. Fix (?) temperature / pressure scheduling * linting * fix T/P schedule * linting * replace dunder __getattribute__ --> getattr in forcefield test_md --------- Co-authored-by: Yuan Chiang <[email protected]> Co-authored-by: Yuan (Cyrus) Chiang <[email protected]> Co-authored-by: Janosh Riebesell <[email protected]>
Completed:
atomate2.forcefields.md.ForceFieldMDMaker
atomate2.forcefields.utils.Relaxer
, but I don't see a better way to merge those approaches for trajectory observance without a messier implementationForceFieldMDMaker
for CHGNet, M3GNet, and MACEmatgl
dependence to 1.0.0 (needed to use ASE calculator wrapper for M3GNet)pymatgen.core.trajectory.Trajectory
orase.io.Trajectory
objectsatomate2.forcefields.utils.ase_calculator
object. One can either:atomate2.forcefield.MLFF
members,MontyDecoder
atomate2.forcefields.utils.Relaxer
which is later passed toForceFieldTaskDocument
to resolve Convergence check for relaxation with forcefields #753.To Do's:
MDMaker
vs. itsmake
function. For example, it may be easier to set the temperature and pressure viaMDMaker().make
than viaMDMaker(temperature=...)