-
Notifications
You must be signed in to change notification settings - Fork 874
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add capability for Vasprun to read KPOINTS_OPT data #3509
Conversation
Allow reading in KPOINTS_OPT data from vasprun.xml. Also get band structure from KPOINTS_OPT data, if requested. (May change that to the default behaviour.) I've tested it separately, but still need to test integration into the pymatgen code, once I get a working development build of pymatgen.
projected_magnetism_kpoints_opt -> projected_magnetisation_kpoints_opt
From issue materialsproject#3455 , dataset supplied by Jeff-oakly materialsproject#3455 (comment) Although, for some reason, even though LORBIT = 11, vasprun.xml doesn't have projected eigenvalues for kpoints_opt. Wierd.
It was with VASP 6.4 rather than 6.3, if that makes a difference. And more CPUs so it snuck in more bands, which slightly changed some energy.
if KPOINTS_OPT data is present.
Prior name was not descriptive of the tests being done.
…tgen into vasprun-kpoints-opt
in both Vasprun and BSVasprun
3c23114
to
36e289c
Compare
improve test err message on bad vbm/cbm kpoint labels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! All good except for the large new file tests/files/kpoints_opt/vasprun.xml.gz
. Can you check if we can use it in place of one of the existing vasprun.xml
files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@esoteric-ephemera If you're not too busy, could you take a look and let me know if the changes in this file all make sense to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes sorry I will soon!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These overall look pretty reasonable to me. @bfield1, are the changes to the structure of the XML parsing needed? It seems like you've added explicit triggers for when XML blocks are read.
Also, if a user specifies a KPOINTS_OPT file, it might make more sense to put the various *_opt
quantities without the suffix?
For many use cases, where KPOINTS_OPT isn't used, there would be 5 new top-level attrs that default to None
, which is OK. Maybe it makes more sense to have underscored attrs that default to None and store the quantities computed with KPOINTS/KSPACING when KPOINTS_OPT
is used?
For example, when no KPOINTS_OPT file is used, or LKPOINTS_OPT = False
:
self.eigenvalues_kpoints = < eigenvalues using KPOINTS / KSPACING >
self._eigenvalues_kpoints = None
and when KPOINTS_OPT is used:
self.eigenvalues_kpoints = < eigenvalues using KPOINTS_OPT >
self._eigenvalues_kpoints = < eigenvalues using KPOINTS / KSPACING >
Just to keep things more organized and hopefully reflect what a user expects to get from a Vasprun
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bfield1, are the changes to the structure of the XML parsing needed? It seems like you've added explicit triggers for when XML blocks are read.
Yes, we do need those changes. The prior 'end' condition for reading XML elements doesn't tell you which element it is nested within until after you have left that element. The 'start' condition is used to identify when we are entering certain blocks (but we have to wait until 'end' to read the data).
When using KPOINTS_OPT, some of the blocks can only be distinguished from their regular KPOINTS counterparts by knowing which block they are nested within. As such, I had to change the parser to read both start and end tags, and add a flag to track when we have entered (or left) certain blocks.
Regarding how to structure the data:
My choice of convention was mainly based on how the data is structured within the vasprun.xml file itself, as well as the calculation outputs. KPOINTS and KPOINTS_OPT are separate files, as are PROCAR and PROCAR_OPT. The vasprun.xml files stores the conventional KPOINTS data in the standard location and the KPOINTS_OPT data under special flags. The KPOINTS_OPT calculation is an extra non-self-consistent calculation run on top of the regular self-consistent calculation.
I also didn't want to change what the current variables in Vasprun
were pointing to without proper consultation.
That said, I am not opposed to making the KPOINTS_OPT data the default when it is present. You probably have a better idea of what the average user "expects" from a Vasprun
than I do. I only recently started using KPOINTS_OPT too, so different people may have different conventions.
One use-case of KPOINTS_OPT I was considering was doing a regular grid with normal KPOINTS (which you have to do) and a band structure with KPOINTS_OPT. In principle, one could get DOS data from the regular grid and band structure data from the KPOINTS_OPT data. But maybe this is a niche use case?
I would guess that typically someone cares most about the KPOINTS_OPT data when they specify it, so making that the prime data then using extra flags to get the underlying data probably makes sense. But also, I would expect the Vasprun
class to reflect the data structure of the vasprun.xml file, so listing KPOINTS_OPT separately also makes sense. I suppose it depends which we value more.
For what it's worth, I counted 10 new top-level attributes rather than 5. tdos_opt
, idos_opt
, pdos_opt
, efermi_opt
, eigenvalues_kpoints_opt
, projected_eigenvalues_kpoints_opt
, projected_magnetisation_kpoints_opt
, kpoints_opt
, actual_kpoints_opt
, and actual_kpoints_weights_opt
, which are all the KPOINTS_OPT equivalents of their non-opt counterparts and are listed in separate fields in vasprun.xml. I can appreciate that having lots of usually-empty attributes floating around might be a bit messy, though. If we wanted to tidy up, maybe we could store these attributes inside a dictionary so we only have one new top-level attribute?
Let me know your thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using KPOINTS_OPT, some of the blocks can only be distinguished from their regular KPOINTS counterparts by knowing which block they are nested within. As such, I had to change the parser to read both start and end tags, and add a flag to track when we have entered (or left) certain blocks.
this is very helpful information to understand why the code is as it is. could you insert this as a comment along with a date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth, I counted 10 new top-level attributes rather than 5. [...] I can appreciate that having lots of usually-empty attributes floating around might be a bit messy, though. If we wanted to tidy up, maybe we could store these attributes inside a dictionary so we only have one new top-level attribute?
would be interested to hear @utf opinion on this and also if it affects atomate2
in some way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bfield1! This is super helpful context and I appreciate the work you've put into this.
Re:
In principle, one could get DOS data from the regular grid and band structure data from the KPOINTS_OPT data. But maybe this is a niche use case?
This is actually the default behavior of the Materials Project's bandstructure workflows, which means we should probably save some compute effort and revise these after this PR is merged. As they stand currently, two separate self-consistent calculations are performed to get the DOS from a uniform k-mesh, and the bandstructure from a line-mode k-mesh
If we wanted to tidy up, maybe we could store these attributes inside a dictionary so we only have one new top-level attribute?
I might lean towards this approach, or perhaps defining a small container class (e.g., electronic_properties
) that contains these 10 attrs (electronic_properties.tdos
, electronic_properties.idos
,...). The container class would be a top-level attr of Vasprun
. Hope that makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is very helpful information to understand why the code is as it is. could you insert this as a comment along with a date?
Done.
This is actually the default behavior of the Materials Project's bandstructure workflows, which means we should probably save some compute effort and revise these after this PR is merged.
You'd probably want to double-check that VASP prints its output to vasprun.xml as expected when using KPOINTS_OPT, as it's a new feature so may have some bugs (the compilation of VASP I'm using doesn't write the projected eigenvalues to vasprun.xml properly, for instance). But if VASP behaves you might be able to save some effort, yes.
I might lean towards this approach, or perhaps defining a small container class (e.g., electronic_properties) that contains these 10 attrs (electronic_properties.tdos, electronic_properties.idos,...). The container class would be a top-level attr of Vasprun. Hope that makes sense
I have now implemented such a solution. A simple container class (_ElectronicPropertiesContainer
) (which I made MSONable because it seemed straightforward) which is used in the attribute Vasprun.kpoints_opt_properties
. It holds the electronic properties with their normal names (without the suffixes). The names might be a little wordy, but it's probably better to be more verbose than less.
In principle, one could use such a container for the regular electronic properties too, but that's probably an unnecessary extra layer with no obvious benefits. Unless, maybe, we have setters/getters to the top-level attributes which intelligently decide whether to access the kpoints_opt or regular data, but that's getting ahead of ourselves. People can request such a feature if demand arises.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for these changes @bfield1! I'm happy with the current state of this PR. Re the "non-opt" electronic properties: just to avoid breaking changes, I prefer the way you've done things by not storing them in an _ElectronicPropertiesContainer
.
The option to transition to a container class for these is there, but not necessary now.
tweak var names
Signed-off-by: Bernard Field <[email protected]>
Signed-off-by: Bernard Field <[email protected]>
Now they're referenced as attributes of that container (kpoints_opt_properties) rather than being top-level quantities in Vasprun.
Signed-off-by: Bernard Field <[email protected]>
pymatgen/io/vasp/outputs.py
Outdated
@@ -140,6 +140,50 @@ def _vasprun_float(flt: float | str) -> float: | |||
raise exc | |||
|
|||
|
|||
class _ElectronicPropertiesContainer(MSONable): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm confused on what the consensus re _ElectronicPropertiesContainer
is in this PR. to me it looks like boilerplate. not sure what the benefits are. if i understand @esoteric-ephemera's last comment, he's also (no longer?) in favor of keeping it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I think this is good to keep things organized. The alternative is to have 10 new attrs on Vasprun.*_opt
that are None
whenever a KPOINTS_OPT
file isn't used. That's fine but (in my view) messy for most use cases.
Either solution (removing the container class or having the *_opt
attrs directly on Vasprun
) works tho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@janosh As @esoteric-ephemera said, this container is useful for storing these extra variables.
The earlier remark about not storing things in the container was regarding the pre-existing electronic properties. I had briefly floated the idea of putting all the electronic properties into these containers (and using getters to choose them intelligently), but decided against it to avoid breaking changes and unnecessary complexity. But the container is still good for the KPOINTS_OPT attributes.
d725325
to
dca98be
Compare
… dataclass which have 1st class MSONable support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm a bit concerned here that the tests pass even if you remove all attributes from
@dataclass
class KpointOptProps:
"""Simple container class to store KPOINTS_OPT data in a separate namespace. Used by Vasprun."""
tdos: Dos | None = None
idos: Dos | None = None
pdos: list | None = None
efermi: float | None = None
eigenvalues: dict | None = None
projected_eigenvalues: dict | None = None
projected_magnetisation: np.ndarray | None = None
kpoints: Kpoints | None = None
actual_kpoints: list | None = None
actual_kpoints_weights: list | None = None
dos_has_errors: bool | None = None
i.e.
@dataclass
class KpointOptProps:
"""Simple container class to store KPOINTS_OPT data in a separate namespace. Used by Vasprun."""
works fine too. the attributes help with IDE auto-complete and type checking of course but maybe @bfield1 can think about more stringent tests for this?
either way, merging because of the very long wait time. sorry about that.
My rationale was that some vasprun.xml files would not have all that data in them (e.g. if your calculation doesn't have magnetism, it wouldn't have a But I can indeed consider some way to test this behaviour and ensure it acts as expected. I'll open another pull request once I get around to it. Anyway, thanks for working with me on this. |
Summary
This pull request resolves Issue #3455 , implementing parsing functionality for band structures generated using KPOINTS_OPT.
KPOINTS_OPT is a new VASP feature which provides an alternative method for specifying k-points for hybrid band structures.
The KPOINTS_OPT-related data is now read in by the Vasprun class, and
Vasprun.get_band_structure
defaults to reading the KPOINTS_OPT-based eigenvalues if present.Unfinished tasks
vasprun.xml
, but it doesn't have a convenient getter. Regardless, the other features of this pull request stand without this feature.Checklist
ruff
.mypy
.duecredit
@due.dcite
decorators to reference relevant papers by DOI (example)