Metal binding data, trajectories, and the start of stereochemistry #706

Merged: 85 commits, Jun 16, 2023
Changes from 51 commits
a45a954
Small change, mostly to redox - only calculate reduction potential vs…
espottesmith Apr 19, 2023
ff915ef
Small type bugfix
espottesmith Apr 19, 2023
d41bc0c
Test changes to reflect changes to redox
espottesmith Apr 19, 2023
531d429
Added InChI and InChI-key to MoleculeMetadata (for searching, mostly)
espottesmith Apr 19, 2023
aebc988
Now have a way to extract optimization trajectories from Q-Chem Task …
espottesmith Apr 20, 2023
7af0618
API endpoint for geometry optimization trajectories
espottesmith Apr 20, 2023
896aabe
We really are adding new features at this point - starting work on Bi…
espottesmith Apr 21, 2023
a9f2dc0
Draft binding document. Now the fun part: builders and tests!
espottesmith Apr 25, 2023
d64a935
Merge branch 'main' into stats_bind_chiral
espottesmith Apr 25, 2023
1482650
Realized that it makes more sense to have different sub-docs within M…
espottesmith Apr 25, 2023
83c7118
Beginning work on metal binding builder. This one will be... complica…
espottesmith Apr 25, 2023
80d49cd
Progress on binding builder
espottesmith Apr 25, 2023
92d7e37
Full draft of the binding builder???
espottesmith Apr 27, 2023
ef787b8
Merge branch 'main' into stats_bind_chiral
espottesmith Apr 27, 2023
ee798e7
Small tweaks
espottesmith Apr 27, 2023
7f33465
Small tweak
espottesmith Apr 27, 2023
b71e7c3
Enable multiple metal binding methods
espottesmith Apr 27, 2023
2642d9c
Metal binding should be working; just need tests
espottesmith Apr 28, 2023
d8e0718
Small fix to thermo
espottesmith Apr 28, 2023
45d44a7
Trying to see if I can speed up the extremely slow association builder
espottesmith Apr 28, 2023
a6a2732
Moving InChI from MoleculeMetadata to MoleculeDoc, where it really ma…
espottesmith Apr 28, 2023
e628c20
Small bugfix
espottesmith Apr 28, 2023
3c67234
Some bugfixes with metal binding and summary
espottesmith Apr 28, 2023
1152aab
New tests and test files and all that
espottesmith Apr 28, 2023
a6e00cf
Tests pass; let's go
espottesmith Apr 28, 2023
288426e
Beginning of metal_binding API endpoint
espottesmith Apr 28, 2023
c2efc20
Flipped the sign of the binding energy/enthalpy/entropy/free energy
espottesmith Apr 28, 2023
1a4670d
Tests pass
espottesmith Apr 28, 2023
e4780e7
Draft query operator
espottesmith Apr 28, 2023
a2c5196
All looks good; just need a test for new query operator
espottesmith Apr 28, 2023
319f795
API tests pass
espottesmith Apr 28, 2023
96b7826
Getting rid of some lint
espottesmith Apr 29, 2023
b488456
Shut up, mypy!
espottesmith Apr 29, 2023
9fdffb9
More mypy suggestions. These ones are, admittedly, less bad
espottesmith Apr 29, 2023
f938a50
Once more for the road
espottesmith Apr 29, 2023
178f8e6
Can we please get rid of mypy
espottesmith Apr 29, 2023
ccea444
PLEASE
espottesmith Apr 29, 2023
e75742d
Missed something
espottesmith Apr 29, 2023
9a6dda2
Just ignoring everything
espottesmith Apr 29, 2023
8f32c03
More lint
espottesmith Apr 29, 2023
b0fa1df
It's taking me longer to get type checks to pass than it did for me t…
espottesmith Apr 29, 2023
06111d3
Lint
espottesmith Apr 29, 2023
30636db
Ahhhh mypy stop!
espottesmith Apr 29, 2023
d8cbe1c
Also this
espottesmith Apr 29, 2023
d18cb53
Did the creator of mypy hate us all?
espottesmith Apr 29, 2023
9df5039
Union
espottesmith Apr 29, 2023
9873c59
ThermoDocs with corrections weren't being validated with the way that…
espottesmith May 1, 2023
b74ba58
Was never passing kwargs along during building... this might explain …
espottesmith May 2, 2023
60a9091
Long story, but basically, this should resolve a long-standing and my…
espottesmith May 2, 2023
daa99fa
Index mismatch
espottesmith May 2, 2023
4a64f92
Fix solvent synonyms
espottesmith May 3, 2023
d54ec60
Small change requested by Orion
espottesmith May 3, 2023
0db13c2
Small tweaks to builder comments
espottesmith May 3, 2023
2b901b0
Merge branch 'main' into stats_bind_chiral
espottesmith May 3, 2023
2339831
Small change to NBO bonding
espottesmith May 5, 2023
fdb9628
Merge branch 'main' into stats_bind_chiral
espottesmith May 5, 2023
03782de
Merge branch 'main' into stats_bind_chiral
espottesmith May 5, 2023
287e6af
bugfix on metal bonding
espottesmith May 8, 2023
56252ab
Another bugfix; NBO metal bonding detection should work now?
espottesmith May 8, 2023
2ce5855
One more small bugfix
espottesmith May 8, 2023
562bb6e
Merge branch 'main' into stats_bind_chiral
espottesmith May 9, 2023
3044cb0
Now use pymatgen graph hashing; also fix hashing to use undirected gr…
espottesmith May 9, 2023
6d644ab
Update test
espottesmith May 9, 2023
af8936f
Merge branch 'main' into stats_bind_chiral
espottesmith May 9, 2023
0af317d
Revert test
espottesmith May 9, 2023
dedb9b8
Fix tests
espottesmith May 9, 2023
07b5ba9
Merge branch 'main' into stats_bind_chiral
espottesmith May 9, 2023
ca35d62
Seems I missed one test
espottesmith May 9, 2023
d614552
Include hash (and SMILES, cause it was already there) at the task level
espottesmith May 11, 2023
7182487
Merge branch 'main' into stats_bind_chiral
espottesmith May 11, 2023
52897c0
Merge branch 'main' into build_by_hash
espottesmith May 11, 2023
8622d62
Ready to build molecules (assoc and collection) based on hashes
espottesmith May 12, 2023
7f02cfd
Bugfix
espottesmith May 12, 2023
024348a
Merge branch 'main' into build_by_hash
espottesmith May 28, 2023
461c6d2
Merge branch 'build_by_hash' into stats_bind_chiral
espottesmith May 28, 2023
c374fef
Fix linting
espottesmith May 28, 2023
9f735f0
Linting JSON files, because that's apparently a good use of our time
espottesmith May 28, 2023
0765892
Trying to make linter happy with black
espottesmith May 28, 2023
44e6097
Pre-commit, don't fail me now
espottesmith May 28, 2023
4f8359a
mypy, my bitter enemy
espottesmith May 28, 2023
f4fa333
Merge branch 'main' into stats_bind_chiral
espottesmith Jun 10, 2023
3a12fb8
Man, I hate setting up these builder tests
espottesmith Jun 10, 2023
5438408
Linter fix
espottesmith Jun 10, 2023
67a5a51
Whoops, forgot to undo a change that I made for internal testing
espottesmith Jun 11, 2023
dda9686
Merge remote-tracking branch 'materialsproject/main' into stats_bind_…
espottesmith Jun 16, 2023
5 changes: 5 additions & 0 deletions emmet-api/emmet/api/core/documentation.py
@@ -293,6 +293,11 @@
 "description": "Route for molecular bonding data. See the `MoleculeBondingDoc` schema for a full list \
 of fields returned by this route."
 },
+{
+"name": "Molecules Metal Binding",
+"description": "Route for data regarding metal binding to molecules. See the `MetalBindingDoc` schema \
+for a full list of fields returned by this route."
+},
 {
 "name": "Molecules Orbitals",
 "description": "Route for molecular orbital information obtained via Natural Bonding Orbital analysis. \
Empty file.
134 changes: 134 additions & 0 deletions emmet-api/emmet/api/routes/molecules/metal_binding/query_operators.py
@@ -0,0 +1,134 @@
+from typing import Any, Dict, Optional, Union
+from fastapi import Query
+from maggma.api.query_operator import QueryOperator
+from maggma.api.utils import STORE_PARAMS
+
+
+class BindingDataQuery(QueryOperator):
+    """
+    Method to generate a query on binding data.
+    """
+
+    def query(
+        self,
+        metal_element: Optional[str] = Query(
+            None,
+            description="Element symbol for coordinated metal, e.g. 'Li' for lithium or 'Mg' for magnesium",
+        ),
+        min_metal_partial_charge: Optional[float] = Query(
+            None,
+            description="Minimum metal partial charge."
+        ),
+        max_metal_partial_charge: Optional[float] = Query(
+            None,
+            description="Maximum metal partial charge."
+        ),
+        min_metal_partial_spin: Optional[float] = Query(
+            None,
+            description="Minimum metal partial spin (only meaningful for open-shell systems)."
+        ),
+        max_metal_partial_spin: Optional[float] = Query(
+            None,
+            description="Maximum metal partial spin (only meaningful for open-shell systems)."
+        ),
+        min_metal_assigned_charge: Optional[float] = Query(
+            None,
+            description="Minimum charge of the metal, determined by analyzing partial charges/spins."
+        ),
+        max_metal_assigned_charge: Optional[float] = Query(
+            None,
+            description="Maximum charge of the metal, determined by analyzing partial charges/spins."
+        ),
+        min_metal_assigned_spin: Optional[Union[int, float]] = Query(
+            None,
+            description="Minimum spin multiplicity of the metal, determined by analyzing partial spins."
+        ),
+        max_metal_assigned_spin: Optional[Union[int, float]] = Query(
+            None,
+            description="Maximum spin multiplicity of the metal, determined by analyzing partial spins."
+        ),
+        min_number_coordinate_bonds: Optional[int] = Query(
+            None,
+            description="Minimum number of atoms coordinated to the metal."
+        ),
+        max_number_coordinate_bonds: Optional[int] = Query(
+            None,
+            description="Maximum number of atoms coordinated to the metal."
+        ),
+        min_binding_energy: Optional[float] = Query(
+            None,
+            description="Minimum binding electronic energy (units: eV)"
+        ),
+        max_binding_energy: Optional[float] = Query(
+            None,
+            description="Maximum binding electronic energy (units: eV)"
+        ),
+        min_binding_enthalpy: Optional[float] = Query(
+            None,
+            description="Minimum binding enthalpy (units: eV)"
+        ),
+        max_binding_enthalpy: Optional[float] = Query(
+            None,
+            description="Maximum binding enthalpy (units: eV)"
+        ),
+        min_binding_entropy: Optional[float] = Query(
+            None,
+            description="Minimum binding entropy (units: eV/K)"
+        ),
+        max_binding_entropy: Optional[float] = Query(
+            None,
+            description="Maximum binding entropy (units: eV/K)"
+        ),
+        min_binding_free_energy: Optional[float] = Query(
+            None,
+            description="Minimum binding free energy (units: eV)"
+        ),
+        max_binding_free_energy: Optional[float] = Query(
+            None,
+            description="Maximum binding free energy (units: eV)"
+        )
+    ) -> STORE_PARAMS:
+
+        crit: Dict[str, Any] = dict()  # type: ignore
+
+        if metal_element:
+            crit["binding_data.metal_element"] = metal_element
+
+        d = {
+            "metal_partial_charge": [min_metal_partial_charge, max_metal_partial_charge],
+            "metal_partial_spin": [min_metal_partial_spin, max_metal_partial_spin],
+            "metal_assigned_charge": [min_metal_assigned_charge, max_metal_assigned_charge],
+            "metal_assigned_spin": [min_metal_assigned_spin, max_metal_assigned_spin],
+            "number_coordinate_bonds": [min_number_coordinate_bonds, max_number_coordinate_bonds],
+            "binding_energy": [min_binding_energy, max_binding_energy],
+            "binding_enthalpy": [min_binding_enthalpy, max_binding_enthalpy],
+            "binding_entropy": [min_binding_entropy, max_binding_entropy],
+            "binding_free_energy": [min_binding_free_energy, max_binding_free_energy]
+        }
+
+        for entry in d:
+            key = "binding_data." + entry
+            if d[entry][0] is not None or d[entry][1] is not None:  # type: ignore
+                crit[key] = dict()
+
+            if d[entry][0] is not None:  # type: ignore
+                crit[key]["$gte"] = d[entry][0]  # type: ignore
+
+            if d[entry][1] is not None:  # type: ignore
+                crit[key]["$lte"] = d[entry][1]  # type: ignore
+
+        return {"criteria": crit}
+
+    def ensure_indexes(self):  # pragma: no cover
+        return [
+            ("binding_data.metal_element", False),
+            ("binding_data.metal_partial_charge", False),
+            ("binding_data.metal_partial_spin", False),
+            ("binding_data.metal_assigned_charge", False),
+            ("binding_data.metal_assigned_spin", False),
+            ("binding_data.number_coordinate_bonds", False),
+            ("binding_data.binding_energy", False),
+            ("binding_data.binding_enthalpy", False),
+            ("binding_data.binding_entropy", False),
+            ("binding_data.binding_free_energy", False),
+        ]
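Reviewer note: the range logic in `BindingDataQuery.query` can be exercised without FastAPI or maggma. The sketch below is a minimal standalone version of that loop; `build_range_criteria` and the `Bounds` alias are hypothetical names introduced here for illustration and are not part of this PR. It assumes only what the diff shows: each (min, max) pair becomes a `$gte`/`$lte` filter under the `binding_data.` prefix.

```python
from typing import Any, Dict, Optional, Tuple

# (lower, upper) bounds for one field; either side may be omitted.
Bounds = Tuple[Optional[float], Optional[float]]


def build_range_criteria(
    bounds: Dict[str, Bounds], prefix: str = "binding_data."
) -> Dict[str, Any]:
    """Turn {field: (min, max)} into MongoDB-style range criteria.

    Hypothetical helper for illustration; it mirrors the loop in the diff.
    """
    crit: Dict[str, Any] = {}
    for field, (lower, upper) in bounds.items():
        if lower is None and upper is None:
            continue  # no bound supplied, so the field is not filtered
        key = prefix + field
        crit[key] = {}
        if lower is not None:
            crit[key]["$gte"] = lower
        if upper is not None:
            crit[key]["$lte"] = upper
    return crit


example = build_range_criteria(
    {
        "binding_energy": (None, 1.5),
        "number_coordinate_bonds": (2, 4),
        "metal_partial_spin": (None, None),
    }
)
print(example)
```

Because the checks use `is not None` rather than truthiness, a bound of `0` or `0.0` is kept as a real filter instead of being silently dropped.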
54 changes: 54 additions & 0 deletions emmet-api/emmet/api/routes/molecules/metal_binding/resources.py
@@ -0,0 +1,54 @@
+from maggma.api.resource import ReadOnlyResource
+from emmet.core.molecules.metal_binding import MetalBindingDoc
+
+from maggma.api.query_operator import PaginationQuery, SortQuery, SparseFieldsQuery
+
+from emmet.api.routes.molecules.molecules.query_operators import (
+    MultiMPculeIDQuery,
+    ExactCalcMethodQuery,
+    FormulaQuery,
+    ChemsysQuery,
+    ElementsQuery,
+    ChargeSpinQuery
+)
+from emmet.api.routes.molecules.metal_binding.query_operators import BindingDataQuery
+from emmet.api.routes.molecules.utils import MethodQuery, MultiPropertyIDQuery
+from emmet.api.core.settings import MAPISettings
+from emmet.api.core.global_header import GlobalHeaderProcessor
+
+
+def metal_binding_resource(metal_binding_store):
+    resource = ReadOnlyResource(
+        metal_binding_store,
+        MetalBindingDoc,
+        query_operators=[
+            MultiMPculeIDQuery(),
+            ExactCalcMethodQuery(),
+            FormulaQuery(),
+            ChemsysQuery(),
+            ElementsQuery(),
+            ChargeSpinQuery(),
+            MethodQuery(),
+            BindingDataQuery(),
+            MultiPropertyIDQuery(),
+            SortQuery(),
+            PaginationQuery(),
+            SparseFieldsQuery(
+                MetalBindingDoc,
+                default_fields=[
+                    "molecule_id",
+                    "property_id",
+                    "solvent",
+                    "method",
+                    "last_updated"
+                ],
+            ),
+        ],
+        header_processor=GlobalHeaderProcessor(),
+        tags=["Molecules Metal Binding"],
+        sub_path="/metal_binding/",
+        disable_validation=True,
+        timeout=MAPISettings().TIMEOUT,
+    )
+
+    return resource
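Reviewer note: to see how a client request against this resource might look, the sketch below builds a GET URL from the query parameters that `BindingDataQuery` defines. The base URL is a placeholder and an assumption (the deployment prefix is not shown in this diff), and `metal_binding_url` is a hypothetical helper; only the `/metal_binding/` sub-path and the parameter names come from the PR.

```python
from urllib.parse import urlencode

# Hypothetical base URL; the real deployment prefix is not shown in this diff.
BASE_URL = "https://api.example.org/molecules"
SUB_PATH = "/metal_binding/"  # sub_path from the resource definition


def metal_binding_url(**params) -> str:
    """Build a GET URL, dropping parameters that were left as None."""
    supplied = {k: v for k, v in params.items() if v is not None}
    return BASE_URL + SUB_PATH + "?" + urlencode(supplied)


url = metal_binding_url(
    metal_element="Li",
    min_binding_energy=0.5,
    max_binding_energy=None,  # omitted from the query string
    max_number_coordinate_bonds=4,
)
print(url)
```

Parameters left as `None` never reach the server, which matches the operator's behaviour of filtering only on bounds that were actually supplied.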
@@ -415,7 +415,7 @@ def query(
         ),
         max_energy_difference: Optional[float] = Query(
             None,
-            description="Minimum energy difference between interacting orbitals"
+            description="Maximum energy difference between interacting orbitals"
         ),
         min_fock_element: Optional[float] = Query(
             None,
33 changes: 14 additions & 19 deletions emmet-api/emmet/api/routes/molecules/redox/query_operators.py
@@ -11,50 +11,45 @@ class RedoxPotentialQuery(QueryOperator):

     def query(
         self,
-        electrode: str = Query(
-            "H",
-            description="Reference electrode to be queried (e.g. 'H', 'Li', 'Mg')."
-        ),
         min_reduction_potential: Optional[float] = Query(
             None,
-            description="Minimum reduction potential using the selected reference electrode."
+            description="Minimum reduction potential."
         ),
         max_reduction_potential: Optional[float] = Query(
             None,
-            description="Maximum reduction potential using the selected reference electrode."
+            description="Maximum reduction potential."
         ),
         min_oxidation_potential: Optional[float] = Query(
             None,
-            description="Minimum oxidation potential using the selected reference electrode."
+            description="Minimum oxidation potential."
         ),
         max_oxidation_potential: Optional[float] = Query(
             None,
-            description="Maximum oxidation potential using the selected reference electrode."
+            description="Maximum oxidation potential."
         ),
     ) -> STORE_PARAMS:

         crit: Dict[str, Any] = dict()  # type: ignore

         d = {
-            "oxidation_potentials": [min_oxidation_potential, max_oxidation_potential],
-            "reduction_potentials": [min_reduction_potential, max_reduction_potential]
+            "oxidation_potential": [min_oxidation_potential, max_oxidation_potential],
+            "reduction_potential": [min_reduction_potential, max_reduction_potential]
         }

-        for entry in d:
-            key = entry + "." + electrode
-            if d[entry][0] is not None or d[entry][1] is not None:
+        for key in d:
+            if d[key][0] is not None or d[key][1] is not None:
                 crit[key] = dict()

-            if d[entry][0] is not None:
-                crit[key]["$gte"] = d[entry][0]
+            if d[key][0] is not None:
+                crit[key]["$gte"] = d[key][0]

-            if d[entry][1] is not None:
-                crit[key]["$lte"] = d[entry][1]
+            if d[key][1] is not None:
+                crit[key]["$lte"] = d[key][1]

         return {"criteria": crit}

     def ensure_indexes(self):  # pragma: no cover
         return [
-            ("oxidation_potentials", False),
-            ("reduction_potentials", False),
+            ("oxidation_potential", False),
+            ("reduction_potential", False),
         ]
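Reviewer note: the practical effect of this hunk is a change in the shape of the generated criteria keys. The sketch below contrasts the two shapes, assuming only what the diff shows (a per-electrode sub-field before, a single flat field after). `range_filter`, `criteria_before`, and `criteria_after` are hypothetical names used here for illustration.

```python
from typing import Any, Dict, Optional


def range_filter(lower: Optional[float], upper: Optional[float]) -> Dict[str, float]:
    """Return a MongoDB-style range filter, omitting bounds that are None."""
    out: Dict[str, float] = {}
    if lower is not None:
        out["$gte"] = lower
    if upper is not None:
        out["$lte"] = upper
    return out


def criteria_before(
    electrode: str, lower: Optional[float], upper: Optional[float]
) -> Dict[str, Any]:
    """Pre-change shape: one sub-field per reference electrode."""
    f = range_filter(lower, upper)
    return {"reduction_potentials." + electrode: f} if f else {}


def criteria_after(lower: Optional[float], upper: Optional[float]) -> Dict[str, Any]:
    """Post-change shape: a single flat field, no electrode parameter."""
    f = range_filter(lower, upper)
    return {"reduction_potential": f} if f else {}


before = criteria_before("Li", -1.0, None)
after = criteria_after(-1.0, None)
print(before)
print(after)
```

Clients that previously passed `electrode` would need to drop that parameter, since the flat field no longer distinguishes reference electrodes.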
48 changes: 41 additions & 7 deletions emmet-api/emmet/api/routes/molecules/tasks/query_operators.py
@@ -3,9 +3,7 @@
 from fastapi import Query
 from typing import Optional

-
-# TODO: might need these utils once pmg changes are in place (see below)
-# from emmet.api.routes.tasks.utils import calcs_reversed_to_trajectory, task_to_entry
+from emmet.api.routes.molecules.tasks.utils import calcs_reversed_to_trajectory


 class MultipleTaskIDsQuery(QueryOperator):
@@ -95,8 +93,44 @@ def post_process(self, docs, query):
         return d


-# TODO: class TrajectoryQuery(QueryOperator):
-# Need to write Trajectory class in pmg for Molecules
+class TrajectoryQuery(QueryOperator):
+    """
+    Method to generate a query on calculation trajectory data from task documents
+    """
+
+    def query(
+        self,
+        task_ids: Optional[str] = Query(
+            None, description="Comma-separated list of task_ids to query on"
+        ),
+    ) -> STORE_PARAMS:
+
+        crit = {}
+
+        if task_ids:
+            crit.update(
+                {
+                    "task_id": {
+                        "$in": [task_id.strip() for task_id in task_ids.split(",")]
+                    }
+                }
+            )
+
+        return {"criteria": crit}
+
+    def post_process(self, docs, query):
+        """
+        Post processing to generate trajectory data
+        """

-# TODO: class EntryQuery(QueryOperator):
-# Need to write MoleculeEntry class in pmg
+        d = [
+            {
+                "task_id": doc["task_id"],
+                "trajectories": jsanitize(
+                    calcs_reversed_to_trajectory(doc["calcs_reversed"])
+                ),
+            }
+            for doc in docs
+        ]
+
+        return d
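Reviewer note: the `task_ids` parsing in `TrajectoryQuery.query` is easy to check in isolation. The sketch below reproduces just that step; `task_id_criteria` is a hypothetical standalone function, and the task ID strings are made-up placeholders rather than real identifiers.

```python
from typing import Any, Dict, Optional


def task_id_criteria(task_ids: Optional[str]) -> Dict[str, Any]:
    """Convert a comma-separated string into a MongoDB `$in` filter.

    Hypothetical helper mirroring the parsing step in the diff.
    """
    crit: Dict[str, Any] = {}
    if task_ids:
        crit["task_id"] = {"$in": [t.strip() for t in task_ids.split(",")]}
    return crit


spaced = task_id_criteria(" a-1, b-2 ,c-3")
empty = task_id_criteria(None)
trailing = task_id_criteria("a-1,")
print(spaced)
print(empty)
print(trailing)
```

One behaviour worth noting: a trailing comma yields an empty-string entry in the `$in` list, because `str.split(",")` keeps the empty trailing piece. Filtering out empty strings after `strip()` would avoid that, if it matters for the store being queried.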
21 changes: 16 additions & 5 deletions emmet-api/emmet/api/routes/molecules/tasks/resources.py
@@ -10,13 +10,12 @@
 from emmet.api.routes.molecules.tasks.query_operators import (
     DeprecationQuery,
     MultipleTaskIDsQuery,
-    # TODO:
-    # TrajectoryQuery,
+    TrajectoryQuery,
     # EntryQuery,
 )
 from emmet.api.core.global_header import GlobalHeaderProcessor
 from emmet.api.core.settings import MAPISettings
-from emmet.core.tasks import DeprecationDoc
+from emmet.core.tasks import DeprecationDoc, TrajectoryDoc
 from emmet.core.qchem.task import TaskDocument

 timeout = MAPISettings().TIMEOUT
@@ -65,5 +64,17 @@ def task_deprecation_resource(task_store):
     return resource


-# TODO: def trajectory_resource(task_store):
-# TODO: def entries_resource(task_store):
+def trajectory_resource(task_store):
+    resource = ReadOnlyResource(
+        task_store,
+        TrajectoryDoc,
+        query_operators=[TrajectoryQuery(), PaginationQuery()],
+        key_fields=["task_id", "calcs_reversed"],
+        tags=["Molecules Tasks"],
+        sub_path="/tasks/trajectory/",
+        header_processor=GlobalHeaderProcessor(),
+        timeout=timeout,
+        disable_validation=True,
+    )
+
+    return resource