Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion for improvement on IonDB formula recognition #136

Closed
xiaoxiaozhu123 opened this issue May 16, 2024 · 4 comments · Fixed by #145
Closed

Suggestion for improvement on IonDB formula recognition #136

xiaoxiaozhu123 opened this issue May 16, 2024 · 4 comments · Fixed by #145

Comments

@xiaoxiaozhu123
Copy link

xiaoxiaozhu123 commented May 16, 2024

pyEQL.utils.standarize_formula currently uses pymatgen.Ion.reduced_formula to standarize the formulas of ions.
Although this package has included some special handlings of formulas such as acetate (CH3COO3[-1]), there are still some commonly used formulas modified by the func standerize_formula that are not universally accepted.
Here are some examples and suggestions:
pyEQL standardize formula - Jupyter Notebook

@rkingsbury
Copy link
Member

Excellent suggestions, thank you for reporting @xiaoxiaozhu123 . One of the reasons that I chose to implement standardize_formula rather than just directly use the underyling pymatgen method is to give us the ability, within pyEQL to customize handling of special cases (like these), in case it doesn't make sense to change the handling upstream.

So for the examples you mention, I think we should modify standardize_formula to simply override the output of Ion.composition.reduced_formula and return the expected formulae. After that, we can also see about adding these rules to pymatgen upstream. If they are merged, we can later remove the rules from pyEQL.

Would be comfortable opening a PR in pyEQL to implement these changes? If not, could you write code block and paste it here that I can add to standardize_formula (basically a series of if statements), e.g.

if rform == 'H4N[+1]'
    rform = `NH4[+1]
elif rform == ....
...

@rkingsbury
Copy link
Member

Changing the behavior of query will be more difficult, but perhaps I could take a similar approach as with standardize_formula, where I implement a custom query method within pyEQL that wraps the underlying query from maggma, but first standardizes any formulae.

@rkingsbury
Copy link
Member

@xiaoxiaozhu123 I was just wondering if you had any additional suggested fixes (besides H4N)? I will try to implement a fix for this soon and I'd like to incorporate any other odd formulas at the same time.

@rkingsbury
Copy link
Member

Fixed in v1.0.2, now available via pip. Thanks again for reporting!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants