ChemEnv integration #595

Closed
JaGeo opened this issue Nov 23, 2022 · 14 comments

@JaGeo
Member

JaGeo commented Nov 23, 2022

As mentioned by mail, I would like to provide the ChemEnv data, @munrojm.

I am currently testing several options:

  • full run based on the current implementation (12 hours on 24 cores for me; our system has a lot of memory available)
  • coordination environments only for structures which have oxidation states (5 hours on 24 cores)
  • coordination environments only for structures with at most 100 inequivalent sites, to reduce run times
  • combination of both options

As a next step, I will add run times and provide potential implementations in the form of a pull request. We should probably test the fastest option first, see whether it is feasible on your systems, and then extend the number and complexity of the compounds step by step.
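The selection options listed in this comment could be sketched roughly as follows. This is only a minimal illustration, not the actual builder code: the `StructureInfo` class, its field names, and the `select_for_chemenv` helper are hypothetical and simply stand in for whatever per-structure metadata the real pipeline has available.

```python
from dataclasses import dataclass


@dataclass
class StructureInfo:
    """Hypothetical per-structure summary; all field names are illustrative."""
    material_id: str
    has_oxidation_states: bool
    n_inequivalent_sites: int


def select_for_chemenv(structures, require_oxidation_states=True,
                       max_inequivalent_sites=None):
    """Return the structures that pass the chosen pre-filters.

    require_oxidation_states: skip structures without oxidation states.
    max_inequivalent_sites: if set, skip structures with more inequivalent
        sites than this limit (None disables the size filter).
    """
    selected = []
    for s in structures:
        if require_oxidation_states and not s.has_oxidation_states:
            continue
        if (max_inequivalent_sites is not None
                and s.n_inequivalent_sites > max_inequivalent_sites):
            continue
        selected.append(s)
    return selected


# Made-up example entries, not real materials.
candidates = [
    StructureInfo("a", True, 4),
    StructureInfo("b", False, 12),
    StructureInfo("c", True, 250),
]

oxi_only = select_for_chemenv(candidates)
size_only = select_for_chemenv(candidates, require_oxidation_states=False,
                               max_inequivalent_sites=100)
both = select_for_chemenv(candidates, max_inequivalent_sites=100)
```

Keeping the two filters as independent switches makes it straightforward to time each option separately and to relax the site limit step by step later on.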

@JaGeo
Member Author

JaGeo commented Nov 23, 2022

I think I will just make a pull request for the option where we only do the analysis when oxidation states are available. I will comment out the other code for now. This option runs at least twice as fast and should also be easier for users to understand if only one type of ChemEnv analysis is used.

@munrojm
Member

munrojm commented Nov 28, 2022

@JaGeo, just pinging you here to let you know I am debugging some builder pipeline issues, which are delaying me in getting the chemenv stage up and running. This should be done soon, and I will update this thread.

@JaGeo
Member Author

JaGeo commented Nov 28, 2022

@munrojm Thanks for the update! 🙂

@munrojm
Member

munrojm commented Nov 30, 2022

@JaGeo, the builder runs well! I am able to get it completed in ~5 hrs within our build pipelines. The data should be incorporated into the DB and available through the API in the next release.

@JaGeo
Member Author

JaGeo commented Nov 30, 2022

@munrojm Awesome! 🥳 Thank you so much!

@munrojm munrojm closed this as completed Dec 1, 2022
@JaGeo
Member Author

JaGeo commented Feb 7, 2023

@munrojm If there is more documentation needed to make this available, just let me know. I am happy to help with this as well.

@munrojm
Member

munrojm commented Feb 8, 2023

@JaGeo, I appreciate the offer. Apologies again for the delay on this. This is essentially ready to go on the API side. Our initial intent was to bundle this with some other data in the next DB release, but that kept getting pushed back for a number of reasons. Currently, I am working to fix thermo data related issues. We intend to deploy again soon to accommodate those changes. What I will do is include the ChemEnv data in that deployment so it is available next week.

@munrojm munrojm reopened this Feb 8, 2023
@JaGeo
Member Author

JaGeo commented Feb 8, 2023

Great, thank you!

@JaGeo
Member Author

JaGeo commented Feb 9, 2023

I do have an additional question: would it, in principle, be possible to get a DOI for such a dataset (the whole one)?

@munrojm
Member

munrojm commented Feb 10, 2023

@JaGeo The DOIs we get from OSTI need to be matched to a landing page. Dataset contributions to core, rather than to MPContribs, don't have one, and thus we can't assign DOIs for them. @tschaume can weigh in on this in case I am getting something wrong.

On another note, do you have any particular things you would like to have included as filters on the chemenv data within the API? chemenv_iucr for example?

@JaGeo
Member Author

JaGeo commented Feb 15, 2023

@munrojm I think a filter on the environment names would be good: chemenv_iucr, chemenv_name, chemenv_iupac, and maybe also a filter on the "continuous symmetry measure" (csm), for example to exclude very distorted environments. We also already show some outputs from ChemEnv on the materialsproject toolkit page. There, we also display wyckoff_positions, species, and the name chemenv_name.
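To make the suggested filters concrete, here is a minimal client-side sketch. It is not the API implementation: the flat one-dict-per-site layout, the `filter_sites` helper, the example values, and the csm cutoff of 2.5 are all assumptions for illustration. Only the field names chemenv_name, chemenv_iucr, csm, and species are taken from this thread.

```python
def filter_sites(sites, max_csm=None, chemenv_iucr=None):
    """Keep sites that match the requested environment and distortion limit.

    sites: iterable of dicts, one per site (hypothetical layout).
    max_csm: if set, drop sites whose csm exceeds this value, i.e. the
        more distorted environments.
    chemenv_iucr: if set, keep only sites with exactly this IUCr symbol.
    """
    kept = []
    for site in sites:
        if max_csm is not None and site["csm"] > max_csm:
            continue
        if chemenv_iucr is not None and site["chemenv_iucr"] != chemenv_iucr:
            continue
        kept.append(site)
    return kept


# Illustrative values only, not real database entries.
sites = [
    {"species": "Fe3+", "chemenv_name": "Octahedron",
     "chemenv_iucr": "[6o]", "csm": 0.4},
    {"species": "Ti4+", "chemenv_name": "Octahedron",
     "chemenv_iucr": "[6o]", "csm": 6.2},
    {"species": "Si4+", "chemenv_name": "Tetrahedron",
     "chemenv_iucr": "[4t]", "csm": 0.1},
]

# The cutoff 2.5 is an arbitrary example threshold.
regular = filter_sites(sites, max_csm=2.5)
regular_octahedra = filter_sites(sites, max_csm=2.5, chemenv_iucr="[6o]")
```

A server-side filter would presumably accept similar parameters, but the exact query names are up to the API maintainers.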

@tschaume tschaume self-assigned this Feb 22, 2023
@munrojm
Member

munrojm commented Apr 24, 2023

@JaGeo, feel free to try pulling the data with the latest MPRester and let me know what other changes you would like to make. We are going to be working on website integration soon.

@JaGeo
Member Author

JaGeo commented Apr 24, 2023

Thank you, @munrojm! I will try to test it this week!

@JaGeo
Member Author

JaGeo commented Apr 24, 2023

@munrojm, I was too curious and I added some suggestions here: materialsproject/api#771

@munrojm munrojm closed this as completed Jul 3, 2023
3 participants