-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change how electrode flow retrieves charge density data #1055
base: main
Are you sure you want to change the base?
Conversation
@esoteric-ephemera does it make sense to add the data to tge datastore for other parts of the workflow? It will surely fail at some point |
It should work in this setup because the charge density is only used in one part of the flow ( Maybe @jmmshn has thoughts on this - don't want to break the flow logic |
@@ -204,7 +202,8 @@ def get_insertion_electrode_doc( | |||
|
|||
@job | |||
def get_inserted_structures( | |||
chg: VolumetricData, | |||
prev_dir: Path | str, | |||
get_charge_density: Callable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_charge_density: Callable, | |
get_charge_density: Callable[[str | Path, VolumetricData] |
@@ -213,7 +212,8 @@ def get_inserted_structures( | |||
|
|||
Parameters | |||
---------- | |||
chg: The charge density. | |||
prev_dir: The previous directory where the static calculation was performed. | |||
get_charge_density: A function to get the charge density from a task document. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_charge_density: A function to get the charge density from a task document. | |
get_charge_density: A function to get the charge density from a run directory. |
@esoteric-ephemera, copying the file over was kind of a fireworks-related hack. |
20e0c81
to
d532d7f
Compare
d532d7f
to
59b7f26
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1055 +/- ##
==========================================
- Coverage 72.82% 4.13% -68.69%
==========================================
Files 187 187
Lines 13637 13634 -3
Branches 1370 1370
==========================================
- Hits 9931 564 -9367
- Misses 3161 13039 +9878
+ Partials 545 31 -514
|
59b7f26
to
44e6c33
Compare
@esoteric-ephemera , it looks like current version of
breaks for large objects because it need to send the entire charge density to the jobstore. So I don't see any way of not causing the problem with fireworks where the object is deserialized during viewing of the job in the webgui. Maybe it's best to move on completely from fireworks anyways. So I think the choices are:
I think we are already down the path of 2 mostly anyways so I can just complete refactor out the |
Hey @jmmshn, the way I've rewritten it in the PR shouldn't hit either issue you've identified - it just slightly tweaks the electrode flow to take the path of the charge density file and a function as args. Should not have issues with storing large objects in Mongo, nor with rendering large objects in the fireworks webgui We're still using fireworks, and I don't see a strong case to store the charge density in a database. Even in that case, the user can write a custom |
b01b885
to
5196e93
Compare
So in my testing the following job fails with a "Document too large" error. atomate2/src/atomate2/common/jobs/electrode.py Lines 303 to 318 in 95ea060
My current understanding of what is happening is that:
atomate2/src/atomate2/vasp/flows/electrode.py Lines 58 to 73 in 95ea060
So it will produced a document that contains a It looks like the fact that this result does not have a PS: Thanks for the fast reply! |
Yeah exactly what I've observed - adding |
That is essentially the tradeoff, this is a substantial amount of new data though so it might make sense to make
|
OK - so in the case of heterogeneous compute for a VASP job, we'd have to use something like Either way, I think this is worth looking at in a separate PR - the flow as it currently exists in atomate2 doesn't support heterogeneous compute, this PR is really just to ensure that the flow doesn't fail from mongodb object size limitations |
OK everything makes sense now. |
The old way was pretty messy. Refactored to be cleaner.
@esoteric-ephemera, I added some minor fixes to the job naming/numbering and I try help get this PR passed. Thanks for fixing this and shelping me clear up my confusion. I think I made a mistaking while writing the original wf and missed a place where the data is serialized to the jobstore. I'll fix those among other things once this PR is merged! |
No worries @jmmshn, that all sounds good to me |
Small change in the electrode insertion workflow to avoid needing large object storage: Currently there's a
get_charge_density_job
function which does not point its output to e.g. GridFS. That causes problems when trying to store a CHGCAR in normal mongo storesI've moved the
get_charge_density_job
withinget_inserted_structures
so that the charge density is not stored when generating inserted structures (it still may be stored in the user's GridFS from the static calc)Another solution would be just adding a
data=[Chgcar]
kwarg to theget_charge_density_job
function, but I think it makes more sense to removeget_charge_density_job