-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Q-Chem Bug: There should not be a check on supported basis sets and functionals #933
Comments
Note that a similar error will occur with solvation methods here. |
Thanks for flagging this @Andrew-S-Rosen . I just had a discussion with @espottesmith and not raising the Having said that, we definitely need to expand the currently accepted list of functionals/basis-sets. There is no reason for So I think the best option right now(as a first step) would be to agree on a list of say 25(choose your number) widely accepted/used functionals/basis/solvents which hopefully captures most QChem calculations. Tagging in @rkingsbury and @samblau for their thoughts on this |
I would strongly advise against this decision (tagging @espottesmith for discussion). If the emmet taskdoc is used outside of the Molecules App (like in atomate2), it would make Q-Chem inflexible for users. There is no hard-coded list that would be satisfactory. Who is to say what is appropriate? For instance, M06-L isn't a functional you all use, but it's one I use a lot. Or how about a random basis set? When new functionals or basis sets are added to Q-Chem, one would also have to manually update the list each time. Then there's the philosophical question of --- why does it even matter? The level of theory should just be the functional, basis set, and solvent accepted as-is. If a nonsensical value were provided, Q-Chem would not have ran in the first place. If the goal is to only use this task document for MP purposes, then the |
The core problem here is that we need a solution which is reasonable both for building a standardized database for MP and for end users. I hear you that no predefined list is going to capture all use cases of all users, but it's absolutely essential for database construction (to your "why does it matter", we need to be able to rank calculations to select the "best" one. If we encounter a functional that we haven't seen before, then how do we rank it?) And this isn't just true on the molecules side - the materials side of MP also has a predefined list of functionals, and it ranks these to select the best properties. If we have to choose between being inflexible to users or not being able to validate/rank on the DB side, I'd heavily lean in favor of being inflexible. That said, I don't think that MP and atomate2 necessarily need to use the same schema with the same validation in place. After all, the emmet task doc, with validation, was built on the original atomate task doc, which allowed basically anything that you could throw at Q-Chem. |
I will have to do some digging on the materials side because, to the best of my knowledge, the VASP task document does not do any MP-related validation in construction of the task doc itself (beyond checking to make sure the calculation actually completed). This ensures that atomate2 is fully flexible. I believe there is a separate ValdationDoc. I do agree that there are two objectives at odds with one another here. At some point (materialsproject/foundation#2), it was decided that task documents would start out in atomate2 and then, when they mature, they could eventually be in emmet (which atomate2 calls as a dependency). Of course, we kind of skipped that step since we went atomate1 --> emmet, but regardless, that philosophy can naturally introduce some conflicts, like that discussed here. Is there any reason why the ranking needs to take place during construction of the task doc? To me, it seems like it would make more sense to rank the functional/basis set/solvent in whatever code is calling the task doc to curate data for MP (I imagine that would be in In any case, for me, it doesn't really matter, but this is ultimately going to be an issue for atomate2 if not appropriately addressed. Users need to be able to run whatever method that best suits their system, no matter how obscure they are to the MP team. |
I'm also not sure at what point the validation happens, and I myself don't use atomate2 (I've never really been convinced that it should exist), but my code was heavily inspired by what was already in place in emmet. Ranking does not take place during the construction of the task doc. It occurs in emmet (not web), starting from the MoleculesAssociationBuilder. But regardless of where the ranking takes place, if any task doc can't be ranked (at the point of building), then that raises problems. It's totally possible for us to not validate at the point of task creation and only later on, but really that's just passing the buck. I know of several users (including @samblau and @rdguha1995) who have used builders to process collections of task documents, even if the output collections weren't intended for MP. If we don't validate the task docs but do validate when building, the outcome is the same - the user's calculations can't be used as expected. For your particular use case, maybe this is acceptable, and maybe we let task docs be constructed with no validation and then validate only later on. But I don't think that resolves the larger issue here. |
Maybe the solution in that scenario is to allow for any functional/basis set/solvent, but if it is "unknown" to emmet, it would not have an associated ranking and when that is needed in |
@rdguha1995 brought up the "lowest possible priority" option. That's acceptable from the MP perspective, but not great from the user perspective. I actually think it's worse than failing validation, because it's basically failing silently. Let's say you do some calculations with some fancy functional (wB97M(2), whatever) that wasn't part of our ranking scheme. A user who isn't intimately familiar with emmet would be reasonably confused why their calculations are never chosen when building property documents, even when they're clearly the best tasks available. Having the error raised in the association builder - or just not having tasks allowed during building - is more acceptable, but as I just said, there's still a problem where users can't use the builders because their tasks aren't part of the allowed sets. The only benefit of this is that the task document exists in one case, and doesn't in the other. That's something, but this is still a problem. |
This was my proposed solution as well @Andrew-S-Rosen , but then the users would have to make peace with the fact that if they intend to add their calculations to the DB (it would not show up). Something which might be an option (again not ideal), is just to have two separated If you want your calculations to go in MP use the |
Having two separate classes seems like it'd add a decent amount of bloat, but I think having "validate" as a flag during task doc creation could be reasonable. If True, then you check things like level of theory; if False, you don't care. Either way, it could populate a "validated" field which the builders can check/query against. I have a sinking suspicion that this would end up with piles of unvalidated tasks that people eventually want to add to MP, but I guess that's a bridge to burn another day. |
In the short-term, I do think a |
Certainly a worthy discussion! My $0.02 is the following:
|
Thanks for the inputs everyone! @rkingsbury thanks for tagging the |
Sounds like a great plan to me (near term and long term) @rdguha1995 . That way the "valid" calc types can be defined by external users, either by excluding the |
This has now been addressed in PR #970. A The warning was not completely necessary but I just added it for good measure. |
Problem
If you try to build a Q-Chem task doc from a calculation that uses a functional or basis set beyond those that have been pre-tabulated, the task doc can't be made. This is extremely limiting. Tagging @rdguha1995.
In my instance, I was running a calculation with the def2-SVP basis set, and it crashed on parsing.
Proposed Solution
Here is the source of the issue:
Errors should not be raised here. There is no limit on what users might want to use for the level of theory. The list of functionals and basis sets are here. The functional and basis set should just be accepted from the user as-is and not be hard-coded.
Alternatives
No response
The text was updated successfully, but these errors were encountered: