Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create 0002-scope-of-emmet-models.md #2

Merged
merged 4 commits into from
Mar 6, 2023
Merged

Create 0002-scope-of-emmet-models.md #2

merged 4 commits into from
Mar 6, 2023

Conversation

rkingsbury
Copy link
Collaborator

No description provided.

@munrojm
Copy link
Member

munrojm commented Nov 21, 2022

  1. The atomate2 documents contain some fields and changed field names compared to atomate1

Just to clarify, the changes are now simply additional fields added to the model?

  1. atomate1 task documents use task_id whereas atomate2 documents use UUID as identifiers.

At a minimum, there needs to be an additional field with the task_id name. This will let us populate it at calculation ingestion with whatever tagging scheme is decided upon.

  1. Many atomate2 document models contain logic for instantiating from raw calculation
    outputs, e.g. from_vasp_files(). Should these live in the emmet document models, or
    elsewhere?
  2. The methods in item 3) may create dependency bloat. How to handle?

This is fine with optional dependencies.

@JaGeo
Copy link
Member

JaGeo commented Nov 22, 2022

I would still like to discuss the scope or relevance of emmet-core to users that would just like to use or build automations based on atomate2. Maybe we should clarify the documentation of emmet and atomate2 to educate users about when a scheme should exit in emmet-core and in which case not. This is of course up to the maintainers to decide but I think it might help new users/developers. :)

@mkhorton
Copy link
Member

Maybe we should clarify the documentation of emmet and atomate2 to educate users about when a scheme should exit in emmet-core and in which case not.

Yes, I think this was @utf's point too -- to make progress on this issue, we really need to overhaul the emmet README and add documentation.

On this point, and it's a more dramatic change, I think there is a good case to be made that emmet should be renamed materialsproject -- it really is more of a Materials Project meta package, and materialsproject-core makes more sense to me. "Emmet" is a wonderful name, but is really more of an internal project name in my eyes.

@utf
Copy link
Member

utf commented Nov 23, 2022

I like the idea of emmet -> materialsproject. That also helps to clarify the project scope.

@jmmshn
Copy link
Collaborator

jmmshn commented Nov 23, 2022

FYI: A more practical point about merging atomate1 and atomat2 tasks:
https://matsci.org/t/proper-usage-of-name-field/45560

@mkhorton
Copy link
Member

@munrojm @tschaume as current Materials Project core staff, do you have views on a package rename? This got a few thumbs up.

@munrojm
Copy link
Member

munrojm commented Nov 23, 2022

I like that name change.

@tschaume
Copy link
Member

Though I like the name a lot, too, I think re-naming emmet into something more descriptive of MP makes sense if user adoption/confusion are an issue. As of now, the emmet repo contains the namespace packages *-{core,api,builders,cli}. Since materialsproject is very generic and also our organization name, we could consider renaming emmet to something like mpdata which would cause the namespace packages to be released as mpdata-core, mpdata-api, etc.

@munrojm
Copy link
Member

munrojm commented Nov 23, 2022

I also like @tschaume's suggestion.

@JaGeo
Copy link
Member

JaGeo commented Nov 23, 2022

mpdata sounds good as well.

@shyuep
Copy link
Member

shyuep commented Nov 23, 2022

I am very non-invested in this. But I think something simpler and more informative like mpdb? As a CLI, it is easier to use too. mpdata sounds unwieldy and does not describe what the tools actually do. Also, the sub packages like mpdb-builder is then perfectly aligned with the intended use.

@jmmshn
Copy link
Collaborator

jmmshn commented Nov 24, 2022

On this point, and it's a more dramatic change, I think there is a good case to be made that emmet should be renamed materialsproject -- it really is more of a Materials Project meta package, and materialsproject-core makes more sense to me. "Emmet" is a wonderful name, but is really more of an internal project name in my eyes.

I think changing the package name should only be done as a last resort since I'm sure there are many students still feeling their way around emmet and changing it will be frustrating.
The fact that emmet-core is for core materials project stuff can just be documented.
And for the name itself, I think our field is full of historical names (I think ***CAR is cartesian??) that are kept around. The people that are invested enough to be looking at emmet are not going to be confused by a slightly quirky name.

@jmmshn
Copy link
Collaborator

jmmshn commented Nov 24, 2022

  • Many atomate2 document models contain logic for instantiating from raw calculation
    outputs, e.g. from_vasp_files(). Should these live in the emmet document models, or
    elsewhere?

  • The methods in item 3) may create dependency bloat. How to handle?

Which documents go into emmet can be decided on a case-by-case base.
Although I think that will be obvious to the person writing each one.
I think methods should go in emmet as much as possible and if atomate2 just imports from emmet and maybe add a field here or there.
Since we usually try to make emmet provenance agnostic I imaging atomate2 will only need to make a couple of minor tweaks.

@rkingsbury
Copy link
Collaborator Author

Circling back to @JaGeo 's point - I think it's very important to send a clear message to atomate2 developers about what should and should not go in emmet. Especially if we rename, but even if not, it seems clear that emmet exists first and foremost to power MP and therefore new workflow stuff may not belong here if it is not used by MP.

I think methods should go in emmet as much as possible and if atomate2 just imports from emmet and maybe add a field here or there.
Since we usually try to make emmet provenance agnostic I imaging atomate2 will only need to make a couple of minor tweaks.

On this point, I think it makes a lot of sense for any / all provenance-agnostic document models to live in emmet. TaskDocument specific to particular codes (Q-Chem, VASP, etc.) should be decided on a case by case basis.

In general, I think we should recommend atomate2 developers building workflows for new codes (i.e., NOT VASP or Q-Chem) to develop their TaskDocument in atomate2. This lowers development friction and will give new doc models a chance to incubate for a while before we need to decide whether to adopt in emmet or not.

@mkhorton
Copy link
Member

mkhorton commented Dec 14, 2022

In general, I think we should recommend atomate2 developers building workflows for new codes (i.e., NOT VASP or Q-Chem) to develop their TaskDocument in atomate2.

Does it then sub-class the base TaskDocument in emmet?


Regarding a rename, I think there's consensus that a rename is good idea (noting dissent from @jmmshn), but it's not clear which name is best -- perhaps I'll send round a quick poll and we can make a decision.

@utf
Copy link
Member

utf commented Feb 6, 2023

I have updated the document with the decision outcome.

@rkingsbury
Copy link
Collaborator Author

I have updated the document with the decision outcome.

Thanks @utf ! Your outcome says Option 2, but I think what you describe is Option 1?

@utf
Copy link
Member

utf commented Feb 6, 2023

Good spot. I've updated it.

@munrojm munrojm merged commit d878c04 into main Mar 6, 2023
@janosh janosh deleted the emmet-scope branch March 6, 2023 19:22
@Andrew-S-Rosen
Copy link
Member

One downside of this decision is that if you want to use the schemas that aren't in emmet-core in your own package, you have to make Atomate2 a dependency even if you don't need the various workflow sub-dependencies. But that's a price to pay I suppose!

@janosh
Copy link
Member

janosh commented Jun 3, 2023

@arosen93 You can pip install atomate2 --no-deps which will only install the package itself, none of its dependencies. You can then figure out the minimal set of deps that need to be importable for you to import just the schemas.

@Andrew-S-Rosen
Copy link
Member

Andrew-S-Rosen commented Jun 3, 2023

You are a hero and magician!!!

Edit: Came to the same conclusion as you below, unfortunately, that it's not really a practical solution from a package management standpoint. You're still a hero and magician regardless.

@janosh
Copy link
Member

janosh commented Jun 3, 2023

I was just looking for a way to specify --no-deps in pyproject.toml. I thought that's possible but maybe I'm wrong. Can't find it now. So for a package like quacc, maybe this isn't actually a solution.

@rkingsbury
Copy link
Collaborator Author

rkingsbury commented Jun 7, 2023

One downside of this decision is that if you want to use the schemas that aren't in emmet-core in your own package, you have to make Atomate2 a dependency even if you don't need the various workflow sub-dependencies. But that's a price to pay I suppose!

Yeah that's true; I've run into a similar challenge with one of my packages. If the schemas you need are relatively few and/or relatively static, perhaps copy/pasting from atomate2 with an appropriate acknowledgement comment could be a solution? Might as well take advantage of everything being open source and permissively licensed :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants