Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add capability for Vasprun to read KPOINTS_OPT data #3509

Merged
merged 37 commits into from
Feb 23, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
37 commits
Select commit Hold shift + click to select a range
ce30012
Add KPOINTS_OPT reading functionality to Vasprun
bfield1 Dec 8, 2023
909b98b
Update docstrings, fix names in Vasprun
bfield1 Dec 9, 2023
d3c136b
Test case for Vasprun kpoints_opt
bfield1 Dec 9, 2023
33aeb14
Bandstructure test for kpoints_opt vasprun.xml
bfield1 Dec 9, 2023
f750277
Ran my own calculation, so have projected data
bfield1 Dec 9, 2023
50c833b
Unittest for projected kpoints_opt
bfield1 Dec 9, 2023
1bda574
Change use_kpoints_opt to be default behaviour
bfield1 Dec 9, 2023
4c89b40
Merge branch 'master' into vasprun-kpoints-opt
bfield1 Dec 10, 2023
4ad198a
Update borg test to account for new test documents
bfield1 Dec 10, 2023
4ac674c
Rename new test file directory to kpoints_opt
bfield1 Dec 10, 2023
400776c
pre-commit auto-fixes
pre-commit-ci[bot] Dec 10, 2023
6575ba4
Code linting from ruff (whitespace, line length)
bfield1 Dec 10, 2023
1b59aaa
Merge branch 'vasprun-kpoints-opt' of https://github.com/bfield1/pyma…
bfield1 Dec 10, 2023
58d2979
Support for KPOINTS_OPT in BSVasprun
bfield1 Dec 11, 2023
3372988
as_dict now reads KPOINTS_OPT parameters
bfield1 Dec 11, 2023
6c9da5e
Merge branch 'master' into vasprun-kpoints-opt
bfield1 Dec 20, 2023
73256ad
tweak Vasprun Attributes: doc str
janosh Dec 20, 2023
4497f08
Merge branch 'master' into vasprun-kpoints-opt
bfield1 Jan 10, 2024
a691a45
Replace vasprun.xml for kpoints_opt with smaller file
bfield1 Jan 10, 2024
e12e73b
refactor get_band_structure
janosh Jan 10, 2024
9889c50
Merge remote-tracking branch 'upstream/master' into vasprun-kpoints-opt
bfield1 Jan 11, 2024
071690d
Merge remote-tracking branch 'upstream/master' into vasprun-kpoints-opt
bfield1 Jan 16, 2024
27f9cd1
Merge branch 'master' into vasprun-kpoints-opt
bfield1 Jan 19, 2024
681a08a
Merge branch 'master' into vasprun-kpoints-opt
bfield1 Jan 24, 2024
6295e29
Merge branch 'master' into vasprun-kpoints-opt
janosh Jan 26, 2024
56e854c
Merge branch 'master' into vasprun-kpoints-opt
bfield1 Jan 26, 2024
a2d4df7
Explanatory comment on parser restructure
bfield1 Jan 26, 2024
cf51ec3
Stick all KPOINTS_OPT properties in a container
bfield1 Jan 26, 2024
2dc1d70
Merge branch 'master' into vasprun-kpoints-opt
bfield1 Jan 30, 2024
931c85d
Merge branch 'master' into vasprun-kpoints-opt
bfield1 Jan 31, 2024
583e259
Merge branch 'master' into vasprun-kpoints-opt
bfield1 Feb 6, 2024
ecbd080
Merge branch 'master' into vasprun-kpoints-opt
bfield1 Feb 12, 2024
e40d40c
rename _ElectronicPropertiesContainer to KpointOptProps and turn into…
janosh Feb 23, 2024
712da14
rename Vasprun.kpoints_opt_properties to kpoint_opt_props
janosh Feb 23, 2024
0d090dd
Merge branch 'master' into pr/bfield1/3509
janosh Feb 23, 2024
58622ed
fix typo kpoint_opt_props -> kpoints_opt_props
janosh Feb 23, 2024
6c5d445
fix mypy
janosh Feb 23, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 9 additions & 9 deletions dev_scripts/regen_libxcfunc.py
Original file line number Diff line number Diff line change
Expand Up @@ -45,28 +45,28 @@ def parse_section(section):
return dct


def write_libxc_docs_json(xcfuncs, jpath):
def write_libxc_docs_json(xc_funcs, json_path):
"""Write json file with libxc metadata to path jpath."""
xcfuncs = deepcopy(xcfuncs)
xc_funcs = deepcopy(xc_funcs)

# Remove XC_FAMILY from Family and XC_ from Kind to make strings more human-readable.
for d in xcfuncs.values():
for d in xc_funcs.values():
d["Family"] = d["Family"].replace("XC_FAMILY_", "", 1)
d["Kind"] = d["Kind"].replace("XC_", "", 1)

# Build lightweight version with a subset of keys.
for num, d in xcfuncs.items():
xcfuncs[num] = {key: d[key] for key in ("Family", "Kind", "References")}
for num, d in xc_funcs.items():
xc_funcs[num] = {key: d[key] for key in ("Family", "Kind", "References")}
# Descriptions are optional
for opt in ("Description 1", "Description 2"):
desc = d.get(opt)
if desc is not None:
xcfuncs[num][opt] = desc
xc_funcs[num][opt] = desc

with open(jpath, mode="w") as file:
json.dump(xcfuncs, file)
with open(json_path, "w") as fh:
json.dump(xc_funcs, fh)

return xcfuncs
return xc_funcs


def main():
Expand Down
502 changes: 340 additions & 162 deletions pymatgen/io/vasp/outputs.py
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@esoteric-ephemera If you're not too busy, could you take a look and let me know if the changes in this file all make sense to you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sorry I will soon!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These overall look pretty reasonable to me. @bfield1, are the changes to the structure of the XML parsing needed? It seems like you've added explicit triggers for when XML blocks are read.

Also, if a user specifies a KPOINTS_OPT file, it might make more sense to put the various *_opt quantities without the suffix?

For many use cases, where KPOINTS_OPT isn't used, there would be 5 new top-level attrs that default to None, which is OK. Maybe it makes more sense to have underscored attrs that default to None and store the quantities computed with KPOINTS/KSPACING when KPOINTS_OPT is used?

For example, when no KPOINTS_OPT file is used, or LKPOINTS_OPT = False:

    self.eigenvalues_kpoints = < eigenvalues using KPOINTS / KSPACING >
    self._eigenvalues_kpoints = None

and when KPOINTS_OPT is used:

    self.eigenvalues_kpoints = < eigenvalues using KPOINTS_OPT >
    self._eigenvalues_kpoints = < eigenvalues using KPOINTS / KSPACING >

Just to keep things more organized and hopefully reflect what a user expects to get from a Vasprun.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bfield1, are the changes to the structure of the XML parsing needed? It seems like you've added explicit triggers for when XML blocks are read.

Yes, we do need those changes. The prior 'end' condition for reading XML elements doesn't tell you which element it is nested within until after you have left that element. The 'start' condition is used to identify when we are entering certain blocks (but we have to wait until 'end' to read the data).
When using KPOINTS_OPT, some of the blocks can only be distinguished from their regular KPOINTS counterparts by knowing which block they are nested within. As such, I had to change the parser to read both start and end tags, and add a flag to track when we have entered (or left) certain blocks.

Regarding how to structure the data:

My choice of convention was mainly based on how the data is structured within the vasprun.xml file itself, as well as the calculation outputs. KPOINTS and KPOINTS_OPT are separate files, as are PROCAR and PROCAR_OPT. The vasprun.xml files stores the conventional KPOINTS data in the standard location and the KPOINTS_OPT data under special flags. The KPOINTS_OPT calculation is an extra non-self-consistent calculation run on top of the regular self-consistent calculation.
I also didn't want to change what the current variables in Vasprun were pointing to without proper consultation.

That said, I am not opposed to making the KPOINTS_OPT data the default when it is present. You probably have a better idea of what the average user "expects" from a Vasprun than I do. I only recently started using KPOINTS_OPT too, so different people may have different conventions.

One use-case of KPOINTS_OPT I was considering was doing a regular grid with normal KPOINTS (which you have to do) and a band structure with KPOINTS_OPT. In principle, one could get DOS data from the regular grid and band structure data from the KPOINTS_OPT data. But maybe this is a niche use case?

I would guess that typically someone cares most about the KPOINTS_OPT data when they specify it, so making that the prime data then using extra flags to get the underlying data probably makes sense. But also, I would expect the Vasprun class to reflect the data structure of the vasprun.xml file, so listing KPOINTS_OPT separately also makes sense. I suppose it depends which we value more.

For what it's worth, I counted 10 new top-level attributes rather than 5. tdos_opt, idos_opt, pdos_opt, efermi_opt, eigenvalues_kpoints_opt, projected_eigenvalues_kpoints_opt, projected_magnetisation_kpoints_opt, kpoints_opt, actual_kpoints_opt, and actual_kpoints_weights_opt, which are all the KPOINTS_OPT equivalents of their non-opt counterparts and are listed in separate fields in vasprun.xml. I can appreciate that having lots of usually-empty attributes floating around might be a bit messy, though. If we wanted to tidy up, maybe we could store these attributes inside a dictionary so we only have one new top-level attribute?

Let me know your thoughts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using KPOINTS_OPT, some of the blocks can only be distinguished from their regular KPOINTS counterparts by knowing which block they are nested within. As such, I had to change the parser to read both start and end tags, and add a flag to track when we have entered (or left) certain blocks.

this is very helpful information to understand why the code is as it is. could you insert this as a comment along with a date?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth, I counted 10 new top-level attributes rather than 5. [...] I can appreciate that having lots of usually-empty attributes floating around might be a bit messy, though. If we wanted to tidy up, maybe we could store these attributes inside a dictionary so we only have one new top-level attribute?

would be interested to hear @utf opinion on this and also if it affects atomate2 in some way?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bfield1! This is super helpful context and I appreciate the work you've put into this.

Re:

In principle, one could get DOS data from the regular grid and band structure data from the KPOINTS_OPT data. But maybe this is a niche use case?

This is actually the default behavior of the Materials Project's bandstructure workflows, which means we should probably save some compute effort and revise these after this PR is merged. As they stand currently, two separate self-consistent calculations are performed to get the DOS from a uniform k-mesh, and the bandstructure from a line-mode k-mesh

If we wanted to tidy up, maybe we could store these attributes inside a dictionary so we only have one new top-level attribute?

I might lean towards this approach, or perhaps defining a small container class (e.g., electronic_properties) that contains these 10 attrs (electronic_properties.tdos, electronic_properties.idos,...). The container class would be a top-level attr of Vasprun. Hope that makes sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is very helpful information to understand why the code is as it is. could you insert this as a comment along with a date?

Done.

This is actually the default behavior of the Materials Project's bandstructure workflows, which means we should probably save some compute effort and revise these after this PR is merged.

You'd probably want to double-check that VASP prints its output to vasprun.xml as expected when using KPOINTS_OPT, as it's a new feature so may have some bugs (the compilation of VASP I'm using doesn't write the projected eigenvalues to vasprun.xml properly, for instance). But if VASP behaves you might be able to save some effort, yes.

I might lean towards this approach, or perhaps defining a small container class (e.g., electronic_properties) that contains these 10 attrs (electronic_properties.tdos, electronic_properties.idos,...). The container class would be a top-level attr of Vasprun. Hope that makes sense

I have now implemented such a solution. A simple container class (_ElectronicPropertiesContainer) (which I made MSONable because it seemed straightforward) which is used in the attribute Vasprun.kpoints_opt_properties. It holds the electronic properties with their normal names (without the suffixes). The names might be a little wordy, but it's probably better to be more verbose than less.

In principle, one could use such a container for the regular electronic properties too, but that's probably an unnecessary extra layer with no obvious benefits. Unless, maybe, we have setters/getters to the top-level attributes which intelligently decide whether to access the kpoints_opt or regular data, but that's getting ahead of ourselves. People can request such a feature if demand arises.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these changes @bfield1! I'm happy with the current state of this PR. Re the "non-opt" electronic properties: just to avoid breaking changes, I prefer the way you've done things by not storing them in an _ElectronicPropertiesContainer.

The option to transition to a container class for these is there, but not necessary now.

Large diffs are not rendered by default.

5 changes: 1 addition & 4 deletions tests/apps/borg/test_queen.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,9 +11,6 @@

__author__ = "Shyue Ping Ong"
__copyright__ = "Copyright 2012, The Materials Project"
__version__ = "0.1"
__maintainer__ = "Shyue Ping Ong"
__email__ = "[email protected]"
__date__ = "Mar 18, 2012"


Expand All @@ -22,7 +19,7 @@ def test_get_data(self):
drone = VaspToComputedEntryDrone()
self.queen = BorgQueen(drone, TEST_FILES_DIR, 1)
data = self.queen.get_data()
assert len(data) == 15
assert len(data) == 16

def test_load_data(self):
drone = VaspToComputedEntryDrone()
Expand Down
4 changes: 4 additions & 0 deletions tests/files/kpoints_opt/KPOINTS
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
Fully automatic kpoint scheme
0
Automatic
25
33 changes: 33 additions & 0 deletions tests/files/kpoints_opt/KPOINTS_OPT
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
Line_mode KPOINTS file
10
Line_mode
Reciprocal
0.0 0.0 0.0 ! \Gamma
0.5 0.0 0.5 ! X

0.5 0.0 0.5 ! X
0.5 0.25 0.75 ! W

0.5 0.25 0.75 ! W
0.375 0.375 0.75 ! K

0.375 0.375 0.75 ! K
0.0 0.0 0.0 ! \Gamma

0.0 0.0 0.0 ! \Gamma
0.5 0.5 0.5 ! L

0.5 0.5 0.5 ! L
0.625 0.25 0.625 ! U

0.625 0.25 0.625 ! U
0.5 0.25 0.75 ! W

0.5 0.25 0.75 ! W
0.5 0.5 0.5 ! L

0.5 0.5 0.5 ! L
0.375 0.375 0.75 ! K

0.625 0.25 0.625 ! U
0.5 0.0 0.5 ! X
Binary file added tests/files/kpoints_opt/vasprun.xml.gz
Binary file not shown.
89 changes: 88 additions & 1 deletion tests/io/vasp/test_outputs.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
from pymatgen.core import Element
from pymatgen.core.lattice import Lattice
from pymatgen.core.structure import Structure
from pymatgen.electronic_structure.bandstructure import BandStructureSymmLine
from pymatgen.electronic_structure.core import Magmom, Orbital, OrbitalType, Spin
from pymatgen.entries.compatibility import MaterialsProjectCompatibility
from pymatgen.io.vasp.inputs import Kpoints, Poscar, Potcar
Expand Down Expand Up @@ -47,6 +48,8 @@
except ImportError:
h5py = None

kpts_opt_vrun_path = f"{TEST_FILES_DIR}/kpoints_opt/vasprun.xml.gz"


class TestVasprun(PymatgenTest):
def test_vasprun_ml(self):
Expand All @@ -72,7 +75,9 @@ def test_vasprun_md(self):
assert vasp_run.converged_ionic

def test_bad_random_seed(self):
_ = Vasprun(f"{TEST_FILES_DIR}/vasprun.bad_random_seed.xml.gz")
vasp_run = Vasprun(f"{TEST_FILES_DIR}/vasprun.bad_random_seed.xml.gz")
assert vasp_run.incar["ISMEAR"] == 0
assert vasp_run.incar["RANDOM_SEED"] is None

def test_multiple_dielectric(self):
vasp_run = Vasprun(f"{TEST_FILES_DIR}/vasprun.GW0.xml.gz")
Expand Down Expand Up @@ -704,6 +709,59 @@ def test_eigenvalue_band_properties_separate_spins(self):
assert props[3][0]
assert props[3][1]

def test_kpoints_opt(self):
vasp_run = Vasprun(kpts_opt_vrun_path, parse_projected_eigen=True)
# This calculation was run using KPOINTS_OPT
kpt_opt_props = vasp_run.kpoints_opt_props
# Check the k-points were read correctly.
assert len(vasp_run.actual_kpoints) == 10
assert len(vasp_run.actual_kpoints_weights) == 10
assert len(kpt_opt_props.actual_kpoints) == 100
assert len(kpt_opt_props.actual_kpoints_weights) == 100
# Check the eigenvalues were read correctly.
assert vasp_run.eigenvalues[Spin.up].shape == (10, 24, 2)
assert kpt_opt_props.eigenvalues[Spin.up].shape == (100, 24, 2)
assert vasp_run.eigenvalues[Spin.up][0, 0, 0] == approx(-6.1471)
assert kpt_opt_props.eigenvalues[Spin.up][0, 0, 0] == approx(-6.1536)
# Check the projected eigenvalues were read correctly
assert vasp_run.projected_eigenvalues[Spin.up].shape == (10, 24, 8, 9)
assert kpt_opt_props.projected_eigenvalues[Spin.up].shape == (100, 24, 8, 9)
assert vasp_run.projected_eigenvalues[Spin.up][0, 1, 0, 0] == approx(0.0492)
# I think these zeroes are a bug in VASP (maybe my VASP) transcribing from PROCAR_OPT to vasprun.xml
# No matter. The point of the parser is to read what's in the file.
assert kpt_opt_props.projected_eigenvalues[Spin.up][0, 1, 0, 0] == approx(0.0000)
# Test as_dict
vasp_run_dct = vasp_run.as_dict()
assert vasp_run_dct["input"]["nkpoints_opt"] == 100
assert vasp_run_dct["input"]["nkpoints"] == 10
assert vasp_run_dct["output"]["eigenvalues_kpoints_opt"]["1"][0][0][0] == approx(-6.1536)

def test_kpoints_opt_band_structure(self):
vasp_run = Vasprun(kpts_opt_vrun_path, parse_potcar_file=False, parse_projected_eigen=True)
bs = vasp_run.get_band_structure(f"{TEST_FILES_DIR}/kpoints_opt/KPOINTS_OPT")
assert isinstance(bs, BandStructureSymmLine)
cbm = bs.get_cbm()
vbm = bs.get_vbm()
assert cbm["kpoint_index"] == [38], "wrong cbm kpoint index"
assert cbm["energy"] == approx(6.4394), "wrong cbm energy"
assert cbm["band_index"] == {Spin.up: [16], Spin.down: [16]}, "wrong cbm bands"
assert vbm["kpoint_index"] == [0, 39, 40]
assert vbm["energy"] == approx(5.7562), "wrong vbm energy"
assert vbm["band_index"] == {Spin.down: [13, 14, 15], Spin.up: [13, 14, 15]}, "wrong vbm bands"
vbm_kp_label = vbm["kpoint"].label
assert vbm["kpoint"].label == "\\Gamma", f"Unpexpected {vbm_kp_label=}"
cmb_kp_label = cbm["kpoint"].label
assert cmb_kp_label is None, f"Unpexpected {cmb_kp_label=}"
# Test projection
projected = bs.get_projection_on_elements()
assert np.isnan(projected[Spin.up][0][0]["Si"])
# Due to some error in my VASP, the transcription of PROCAR_OPT into
# vasprun.xml is filled to the brim with errors in the projections.
# At some point we might get a healthier vasprun.xml, but the point here
# is to test the parser, not VASP.
projected = bs.get_projections_on_elements_and_orbitals({"Si": ["s"]})
assert projected[Spin.up][0][58]["Si"]["s"] == -0.0271

def test_parse_potcar_cwd_relative(self):
# Test to ensure that common use cases of vasprun parsing work
# in the current working directory using relative paths,
Expand Down Expand Up @@ -1322,6 +1380,35 @@ def test_get_band_structure(self):
dct = vasprun.as_dict()
assert "eigenvalues" in dct["output"]

def test_kpoints_opt(self):
vasp_run = BSVasprun(kpts_opt_vrun_path, parse_potcar_file=False, parse_projected_eigen=True)
bs = vasp_run.get_band_structure(f"{TEST_FILES_DIR}/kpoints_opt/KPOINTS_OPT")
assert isinstance(bs, BandStructureSymmLine)
cbm = bs.get_cbm()
vbm = bs.get_vbm()
assert cbm["kpoint_index"] == [38], "wrong cbm kpoint index"
assert cbm["energy"] == approx(6.4394), "wrong cbm energy"
assert cbm["band_index"] == {Spin.up: [16], Spin.down: [16]}, "wrong cbm bands"
# Strangely, when I call with parse_projected_eigen, it gives empty Spin.down,
# but without parse_projected_eigen it does not give it.
# So at one point it called the empty key.
assert vbm["kpoint_index"] == [0, 39, 40]
assert vbm["energy"] == approx(5.7562), "wrong vbm energy"
assert vbm["band_index"] == {Spin.down: [13, 14, 15], Spin.up: [13, 14, 15]}, "wrong vbm bands"
assert vbm["kpoint"].label == "\\Gamma", "wrong vbm label"
assert cbm["kpoint"].label is None, "wrong cbm label"
# Test projection
projected = bs.get_projection_on_elements()
assert np.isnan(projected[Spin.up][0][0]["Si"])
# Due to some error in my VASP, the transcription of PROCAR_OPT into
# vasprun.xml is filled to the brim with errors in the projections.
# At some point we might get a healthier vasprun.xml, but the point here
# is to test the parser, not VASP.
projected = bs.get_projections_on_elements_and_orbitals({"Si": ["s"]})
assert projected[Spin.up][0][58]["Si"]["s"] == -0.0271
vrun_dct = vasp_run.as_dict()
assert {*vrun_dct["output"]} >= {"eigenvalues", "eigenvalues_kpoints_opt"}


class TestOszicar(PymatgenTest):
def test_init(self):
Expand Down
Loading