Merge remote-tracking branch 'upstream/master' into str-fixes

materialsproject · Oct 29, 2021 · b180cac
2 parents a632d7c + acfe589
Showing 5 changed files with 128 additions and 69 deletions.
diff --git a/ADMIN.rst b/ADMIN.rst
@@ -1,7 +1,7 @@
 Introduction
 ============
 
-This docmentation provides a guide for pymatgen administrators. The following 
+This documentation provides a guide for pymatgen administrators. The following
 assumes you are using miniconda or Anaconda.
 
 Releases
@@ -23,18 +23,18 @@ Install some conda tools first::
 	conda install --yes conda-build anaconda-client
 	conda config --add channels matsci
 
-Pymatgen uses `invoke <http://www.pyinvoke.org/>`_ to automate releases. You will 
+Pymatgen uses `invoke <http://www.pyinvoke.org/>`_ to automate releases. You will
 also need sphinx and doc2dash. Install these using::
 
 	pip install --upgrade invoke sphinx doc2dash
 
-For 2018, we will release both py27 and py37 versions of pymatgen. Create 
+For 2018, we will release both py27 and py37 versions of pymatgen. Create
 environments for py27 and py37 using conda::
 
 	conda create --yes -n py37 python=3.7
 	conda create --yes -n py27 python=2.7
 
-For each env, install some packages using conda followed by dev install for 
+For each env, install some packages using conda followed by dev install for
 pymatgen::
 
 	conda activate py37
@@ -50,43 +50,43 @@ pymatgen::
 	pip install invoke sphinx doc2dash
 	python setup.py develop
 
-Add your PyPI username and password and GITHUB_RELEASE_TOKEN into your 
+Add your PyPI username and password and GITHUB_RELEASE_TOKEN into your
 environment::
 
 	export TWINE_USERNAME=PYPIUSERNAME
 	export TWINE_PASSWORD=PYPIPASSWORD
 	export GITHUB_RELEASES_TOKEN=TOKEN_YOU_GET_FROM_GITHUB
 
-You may want to add these to your .bash_profile to avoid having to type these 
+You may want to add these to your .bash_profile to avoid having to type these
 each time.
 
 Machine-specific issues
 ~~~~~~~~~~~~~~~~~~~~~~~
 
-The above instructions are general, but there are some known issues that are 
+The above instructions are general, but there are some known issues that are
 machine-specific:
 
-* Installing lxml via pip required `STATIC_DEPS=true pip install lxml` on 
+* Installing lxml via pip required `STATIC_DEPS=true pip install lxml` on
   macOS 10.13.
-* It can be useful to `pip install --upgrade pip twine setuptools` (this may 
+* It can be useful to `pip install --upgrade pip twine setuptools` (this may
   be necessary if there are authentication errors when connecting to PyPI).
 * You may have to `brew install hdf5 netcdf` or similar to be able to pip
   install the optional requirement `netCDF4`.
 
 Doing the release
 -----------------
 
-Ensure appropriate environment variabels are set including `DISCOURSE_API_USERNAME`,
+Ensure appropriate environment variables are set including `DISCOURSE_API_USERNAME`,
 `DISCOURSE_API_KEY` and `GITHUB_RELEASES_TOKEN`.
 
-First update the change log. The autogenerated change log is simply a list of 
+First update the change log. The autogenerated change log is simply a list of
 commit messages since the last version.  Make sure to edit the log for brevity
 and to attribute significant features to appropriate developers::
 
     conda activate py37
     invoke update-changelog
 
-Then, do the release with the following sequence of commands (you can put them 
+Then, do the release with the following sequence of commands (you can put them
 in a bash script in your PATH somewhere)::
 
     conda activate py37
@@ -97,7 +97,7 @@ in a bash script in your PATH somewhere)::
     python setup.py develop
 
 Double check that the releases are properly done on Pypi. If you are releasing
-on a Mac, you should see a pymatgen.version.tar.gz and two wheels (Py37 and 
+on a Mac, you should see a pymatgen.version.tar.gz and two wheels (Py37 and
 P). There will be a py37 wheel for Windows that is generated by Appveyor.
 
 Materials.sh
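
Note on the ADMIN.rst hunks above: the release steps depend on several environment variables being exported first. The prose spells the GitHub token `GITHUB_RELEASE_TOKEN`, while the export line and the "Doing the release" paragraph use `GITHUB_RELEASES_TOKEN`; the sketch below assumes the latter. It is a minimal pre-flight check, not part of pymatgen's invoke tasks, and the names `REQUIRED_VARS` and `missing_release_vars` are hypothetical.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Variable names copied from the export lines and the "Doing the release"
# paragraph of ADMIN.rst. The tuple itself is a hypothetical helper.
REQUIRED_VARS = (
    "TWINE_USERNAME",
    "TWINE_PASSWORD",
    "GITHUB_RELEASES_TOKEN",
    "DISCOURSE_API_USERNAME",
    "DISCOURSE_API_KEY",
)


def missing_release_vars(
    env: Optional[Mapping[str, str]] = None,
    required: Iterable[str] = REQUIRED_VARS,
) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    if env is None:
        env = os.environ
    return [name for name in required if not env.get(name)]


# Demonstration with a fake environment so the result does not depend on the machine.
demo_env = {"TWINE_USERNAME": "someone", "TWINE_PASSWORD": ""}
print(missing_release_vars(demo_env))
```

Calling `missing_release_vars()` with no arguments checks the real environment, so a release script could abort early when the returned list is non-empty.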

diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst
@@ -76,9 +76,9 @@ http://www.eqqon.com/index.php/Collaborative_Github_Workflow):
    pymatgen maintainers. They will pull your commits and run their own tests
    before releasing.
 
-"Work-in-progress" pull requests are encouraged, especially if this is your 
-first time contributing to pymatgen, and the maintainers will be happy to 
-help or provide code review as necessary. Put "[WIP]" in the title of your 
+"Work-in-progress" pull requests are encouraged, especially if this is your
+first time contributing to pymatgen, and the maintainers will be happy to
+help or provide code review as necessary. Put "[WIP]" in the title of your
 pull request to indicate it's not ready to be merged.
 
 Coding Guidelines
@@ -88,7 +88,7 @@ Given that pymatgen is intended to be long-term code base, we adopt very strict
 quality control and coding guidelines for all contributions to pymatgen. The
 following must be satisfied for your contributions to be accepted into pymatgen.
 
-1. **Unittests** are required for all new modules and methods. The only way to
+1. **Unit tests** are required for all new modules and methods. The only way to
    minimize code regression is to ensure that all code are well-tested. If the
    maintainer cannot test your code, the contribution will be rejected.
 2. **Python PEP 8** `code style <http://www.python.org/dev/peps/pep-0008/>`_.
@@ -101,9 +101,9 @@ following must be satisfied for your contributions to be accepted into pymatgen.
    prior to any commits. At the very least, copy pre-commit to .git/hooks/pre-push.
 3. **Python 3**. We only support Python 3.7+.
 4. **Documentation** required for all modules, classes and methods. In
-   particular, the method docstrings should make clear the arguments expected
+   particular, the method doc strings should make clear the arguments expected
    and the return values. For complex algorithms (e.g., an Ewald summation), a
-   summary of the alogirthm should be provided, and preferably with a link to a
+   summary of the algorithm should be provided, and preferably with a link to a
    publication outlining the method in detail.
 5. **IDE**. We highly recommend the use of Pycharm. You should also set up
    pycodestyle and turn those on within the IDE setup. This will warn of any
@@ -116,10 +116,10 @@ examples of what is expected.
 A word on coding for Python 2 compatibility
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-As of 2019, pymatgen no longer requires code to be Python 2 compatible, and 
-current versions of the code are not supported with Python 2. If you need a 
-version of pymatgen that works with Python 2, please use a version before 
-2018, but note this will be missing the latest bug fixes. This change follows 
+As of 2019, pymatgen no longer requires code to be Python 2 compatible, and
+current versions of the code are not supported with Python 2. If you need a
+version of pymatgen that works with Python 2, please use a version before
+2018, but note this will be missing the latest bug fixes. This change follows
 the broader Python community no longer supporting Python 2, including numpy.
 
 .. _`pymatgen's Google Groups page`: https://groups.google.com/forum/?fromgroups#!forum/pymatgen/
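
Note on the coding guidelines above: items 1 and 4 ask for unit tests and for docstrings that state the expected arguments and return values. As a rough illustration of what that could look like, here is a small function documented in the `Args:` / `Returns:` layout used elsewhere in this commit, with a matching test case. The function `mass_fraction` and its test class are made up for this example and are not pymatgen code.

```python
import unittest
from typing import Mapping


def mass_fraction(amounts: Mapping[str, float], masses: Mapping[str, float], el: str) -> float:
    """
    Calculate the mass fraction of one element in a composition.

    Args:
        amounts (Mapping[str, float]): Amount of each element, keyed by symbol.
        masses (Mapping[str, float]): Atomic mass of each element, keyed by symbol.
        el (str): Symbol of the element of interest.

    Returns:
        float: Mass of el divided by the total mass of the composition.
    """
    total = sum(amounts[sym] * masses[sym] for sym in amounts)
    return amounts.get(el, 0.0) * masses.get(el, 0.0) / total


class MassFractionTest(unittest.TestCase):
    def test_binary(self):
        # Made-up masses keep the arithmetic easy to verify by hand: 1 / (1 + 3) = 0.25.
        self.assertAlmostEqual(mass_fraction({"A": 1, "B": 1}, {"A": 1.0, "B": 3.0}, "A"), 0.25)

    def test_absent_element(self):
        # An element that is not in the composition contributes no mass.
        self.assertEqual(mass_fraction({"A": 1}, {"A": 1.0}, "B"), 0.0)
```

Saved as a module, the tests can be run with `python -m unittest`.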

diff --git a/README.rst b/README.rst
@@ -77,7 +77,7 @@ Why use pymatgen?
 1. **It is (fairly) robust.** Pymatgen is used by thousands of researchers, and is the analysis code powering the
    `Materials Project`_. The analysis it produces survives rigorous scrutiny every single day. Bugs tend to be
    found and corrected quickly. Pymatgen also uses Github Actions for continuous integration, which ensures that every
-   new code passes a comprehensive suite of unittests.
+   new code passes a comprehensive suite of unit tests.
 2. **It is well documented.** A fairly comprehensive documentation has been written to help you get to grips with it
    quickly.
 3. **It is open.** You are free to use and contribute to pymatgen. It also means that pymatgen is continuously being

diff --git a/pymatgen/core/composition.py b/pymatgen/core/composition.py
@@ -12,9 +12,10 @@
 import os
 import re
 import string
+import warnings
 from functools import total_ordering
 from itertools import combinations_with_replacement, product
-from typing import Dict, List, Tuple, Union
+from typing import Dict, Generator, List, Tuple, Union
 
 from monty.fractions import gcd, gcd_float
 from monty.json import MSONable
@@ -122,7 +123,7 @@ def __init__(self, *args, strict: bool = False, **kwargs):
         if len(args) == 1 and isinstance(args[0], Composition):
             elmap = args[0]
         elif len(args) == 1 and isinstance(args[0], str):
-            elmap = self._parse_formula(args[0])
+            elmap = self._parse_formula(args[0])  # type: ignore
         else:
             elmap = dict(*args, **kwargs)  # type: ignore
         elamt = {}
@@ -462,7 +463,8 @@ def __str__(self):
 
     def to_pretty_string(self) -> str:
         """
-        :return: Same as __str__ but without spaces.
+        Returns:
+            str: Same as output __str__() but without spaces.
         """
         return re.sub(r"\s+", "", self.__str__())
 
@@ -493,7 +495,7 @@ def get_atomic_fraction(self, el: SpeciesLike) -> float:
         """
         return abs(self[el]) / self._natoms
 
-    def get_wt_fraction(self, el: SpeciesLike):
+    def get_wt_fraction(self, el: SpeciesLike) -> float:
         """
         Calculate weight fraction of an Element or Species.
 
@@ -505,10 +507,7 @@ def get_wt_fraction(self, el: SpeciesLike):
         """
         return get_el_sp(el).atomic_mass * abs(self[el]) / self.weight
 
-    def contains_element_type(
-        self,
-        category: str,
-    ):
+    def contains_element_type(self, category: str) -> bool:
         """
         Check if Composition contains any elements matching a given category.
 
@@ -549,7 +548,7 @@ def contains_element_type(
             return any(category[0] in el.block for el in self.elements)
         return any(getattr(el, "is_{}".format(category)) for el in self.elements)
 
-    def _parse_formula(self, formula):
+    def _parse_formula(self, formula: str) -> Dict[str, float]:
         """
         Args:
             formula (str): A string formula, e.g. Fe2O3, Li3Fe2(PO4)3
@@ -564,22 +563,22 @@ def _parse_formula(self, formula):
         # for Metallofullerene like "Y3N@C80"
         formula = formula.replace("@", "")
 
-        def get_sym_dict(f, factor):
-            sym_dict = collections.defaultdict(float)
-            for m in re.finditer(r"([A-Z][a-z]*)\s*([-*\.e\d]*)", f):
+        def get_sym_dict(form: str, factor: Union[int, float]) -> Dict[str, float]:
+            sym_dict: Dict[str, float] = collections.defaultdict(float)
+            for m in re.finditer(r"([A-Z][a-z]*)\s*([-*\.e\d]*)", form):
                 el = m.group(1)
-                amt = 1
+                amt = 1.0
                 if m.group(2).strip() != "":
                     amt = float(m.group(2))
                 sym_dict[el] += amt * factor
-                f = f.replace(m.group(), "", 1)
-            if f.strip():
-                raise ValueError("{} is an invalid formula!".format(f))
+                form = form.replace(m.group(), "", 1)
+            if form.strip():
+                raise ValueError("{} is an invalid formula!".format(form))
             return sym_dict
 
         m = re.search(r"\(([^\(\)]+)\)\s*([\.e\d]*)", formula)
         if m:
-            factor = 1
+            factor = 1.0
             if m.group(2) != "":
                 factor = float(m.group(2))
             unit_sym_dict = get_sym_dict(m.group(1), factor)
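
Note on the `get_sym_dict` hunk above: the helper walks the formula with a regular expression, accumulates `amount * factor` per element symbol, and raises if any text is left over. The standalone copy below lifts the helper out of `_parse_formula` so it can be tried in isolation. Making it a module-level function, giving `factor` a default, and returning a plain `dict` are assumptions made for this sketch. The parenthesis handling that follows in the real method lies outside the hunk and is not reproduced here.

```python
import collections
import re
from typing import Dict, Union


def get_sym_dict(form: str, factor: Union[int, float] = 1.0) -> Dict[str, float]:
    """Parse a parenthesis-free formula fragment into {symbol: amount * factor}."""
    sym_dict: Dict[str, float] = collections.defaultdict(float)
    # finditer scans the original string; reassigning `form` below only tracks leftovers.
    for m in re.finditer(r"([A-Z][a-z]*)\s*([-*\.e\d]*)", form):
        el = m.group(1)
        amt = 1.0  # a bare symbol such as "P" counts once
        if m.group(2).strip() != "":
            amt = float(m.group(2))
        sym_dict[el] += amt * factor
        form = form.replace(m.group(), "", 1)
    if form.strip():
        # Anything the pattern could not consume marks the formula as invalid.
        raise ValueError("{} is an invalid formula!".format(form))
    return dict(sym_dict)


print(get_sym_dict("Fe2O3"))          # amounts for a plain formula
print(get_sym_dict("PO4", factor=3))  # contents of a "(PO4)3" group
```

Starting `amt` and `factor` at `1.0` rather than `1`, as the diff does, keeps every stored value a float, which matches the `Dict[str, float]` annotation.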
@@ -619,7 +618,7 @@ def chemical_system(self) -> str:
         sorted alphabetically and joined by dashes, by convention for use
         in database keys.
         """
-        return "-".join(sorted([el.symbol for el in self.elements]))
+        return "-".join(sorted(el.symbol for el in self.elements))
 
     @property
     def valid(self) -> bool:
@@ -629,7 +628,7 @@ def valid(self) -> bool:
         """
         return not any(isinstance(el, DummySpecies) for el in self.elements)
 
-    def __repr__(self):
+    def __repr__(self) -> str:
         return "Comp: " + self.formula
 
     @classmethod
@@ -657,7 +656,7 @@ def get_el_amt_dict(self) -> Dict[str, float]:
             d[e.symbol] += a
         return d
 
-    def as_dict(self) -> dict:
+    def as_dict(self) -> Dict[str, float]:
         """
         Returns:
             dict with species symbol and (unreduced) amount e.g.,
@@ -735,6 +734,42 @@ def oxi_state_guesses(
 
         return self._get_oxid_state_guesses(all_oxi_states, max_sites, oxi_states_override, target_charge)[0]
 
+    def replace(self, elem_map: Dict[str, Union[str, Dict[str, Union[int, float]]]]) -> "Composition":
+        """
+        Replace elements in a composition. Returns a new Composition, leaving the old one unchanged.
+
+        Args:
+            elem_map (dict[str, str | dict[str, int | float]]): dict of elements or species to swap. E.g.
+                {"Li": "Na"} performs a Li for Na substitution. The target can be a {species: factor} dict. For
+                example, in Fe2O3 you could map {"Fe": {"Mg": 0.5, "Cu":0.5}} to obtain MgCuO3.
+
+        Returns:
+            Composition: New object with elements remapped according to elem_map.
+        """
+
+        # drop inapplicable substitutions
+        invalid_elems = [key for key in elem_map if key not in self]
+        if invalid_elems:
+            warnings.warn(
+                "Some elements to be substituted are not present in composition. Please check your input. "
+                f"Problematic element = {invalid_elems}; {self}"
+            )
+        for elem in invalid_elems:
+            elem_map.pop(elem)
+
+        new_comp = self.as_dict()
+
+        for old_elem, new_elem in elem_map.items():
+            amount = new_comp.pop(old_elem)
+
+            if isinstance(new_elem, dict):
+                for el, factor in new_elem.items():
+                    new_comp[el] = factor * amount
+            else:
+                new_comp[new_elem] = amount
+
+        return Composition(new_comp)
+
     def add_charges_from_oxi_state_guesses(
         self,
         oxi_states_override: dict = None,
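
Note on the new `replace` method above: as written in the diff, `elem_map.pop(elem)` removes entries from the dict the caller passed in, and `new_comp[el] = factor * amount` assigns rather than adds, so a target element that is already present in the composition has its amount overwritten. The sketch below shows one possible variant that avoids both effects. It works on plain `{symbol: amount}` dicts as a stand-in for `Composition` so it runs without pymatgen; the function name `replace_elements` is hypothetical and this is not the library's implementation.

```python
from typing import Dict, Union

# A target is either a single symbol or a {symbol: factor} dict, as in the docstring above.
Target = Union[str, Dict[str, float]]


def replace_elements(comp: Dict[str, float], elem_map: Dict[str, Target]) -> Dict[str, float]:
    """Return a new {symbol: amount} dict with elements swapped according to elem_map."""
    # Skip substitutions for absent elements without modifying the caller's dict.
    applicable = {old: new for old, new in elem_map.items() if old in comp}

    # Start from the untouched elements, then add the substituted amounts on top.
    new_comp: Dict[str, float] = {el: amt for el, amt in comp.items() if el not in applicable}
    for old, new in applicable.items():
        amount = comp[old]  # always read from the original composition
        targets = new if isinstance(new, dict) else {new: 1.0}
        for el, factor in targets.items():
            new_comp[el] = new_comp.get(el, 0.0) + factor * amount
    return new_comp


print(replace_elements({"Fe": 2.0, "O": 3.0}, {"Fe": {"Mg": 0.5, "Cu": 0.5}}))
```

Reading every amount from the original `comp` also means a two-way swap such as `{"Li": "Na", "Na": "Li"}` exchanges the two amounts instead of depending on iteration order.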
@@ -938,7 +973,9 @@ def _get_oxid_state_guesses(self, all_oxi_states, max_sites, oxi_states_override
         return all_sols, all_oxid_combo
 
     @staticmethod
-    def ranked_compositions_from_indeterminate_formula(fuzzy_formula, lock_if_strict=True):
+    def ranked_compositions_from_indeterminate_formula(
+        fuzzy_formula: str, lock_if_strict: bool = True
+    ) -> List["Composition"]:
         """
         Takes in a formula where capitalization might not be correctly entered,
         and suggests a ranked list of potential Composition matches.
@@ -969,14 +1006,19 @@ def ranked_compositions_from_indeterminate_formula(fuzzy_formula, lock_if_strict
 
         all_matches = Composition._comps_from_fuzzy_formula(fuzzy_formula)
         # remove duplicates
-        all_matches = list(set(all_matches))
+        uniq_matches = list(set(all_matches))
         # sort matches by rank descending
-        all_matches = sorted(all_matches, key=lambda match: (match[1], match[0]), reverse=True)
-        all_matches = [m[0] for m in all_matches]
-        return all_matches
+        ranked_matches = sorted(uniq_matches, key=lambda match: (match[1], match[0]), reverse=True)
+
+        return [m[0] for m in ranked_matches]
 
     @staticmethod
-    def _comps_from_fuzzy_formula(fuzzy_formula, m_dict=None, m_points=0, factor=1):
+    def _comps_from_fuzzy_formula(
+        fuzzy_formula: str,
+        m_dict: Dict[str, float] = None,
+        m_points: int = 0,
+        factor: Union[int, float] = 1,
+    ) -> Generator[Tuple["Composition", int], None, None]:
         """
         A recursive helper method for formula parsing that helps in
         interpreting and ranking indeterminate formulas.
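
Note on the ranking hunk above: `ranked_compositions_from_indeterminate_formula` removes duplicate `(composition, points)` pairs, sorts them by points and then by composition in descending order, and returns only the compositions. A minimal sketch of that pipeline follows, using formula strings in place of `Composition` objects so that it runs without pymatgen. The function name `rank_matches` and the sample scores are made up for illustration.

```python
from typing import Iterable, List, Tuple


def rank_matches(all_matches: Iterable[Tuple[str, int]]) -> List[str]:
    """Deduplicate (candidate, points) pairs and return candidates, best first."""
    # set() consumes any iterable, including a generator, and drops exact duplicates.
    uniq_matches = set(all_matches)
    # Sort by points first; ties fall back to the candidate itself. Both descending.
    ranked_matches = sorted(uniq_matches, key=lambda match: (match[1], match[0]), reverse=True)
    return [m[0] for m in ranked_matches]


# A generator stands in for the helper that yields scored interpretations.
candidates = (pair for pair in [("Co1", 100), ("C1 O1", 50), ("Co1", 100)])
print(rank_matches(candidates))
```

Because the helper is annotated as returning a `Generator`, materialising it through `set()` before sorting, as the diff does, is what makes the repeated passes possible.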
@@ -993,9 +1035,8 @@ def _comps_from_fuzzy_formula(fuzzy_formula, m_dict=None, m_points=0, factor=1):
                 as the fuzzy_formula with a coefficient of 2.
 
         Returns:
-            A list of tuples, with the first element being a Composition and
-            the second element being the number of points awarded that
-            Composition intepretation.
+            list[tuple[Composition, int]]: A list of tuples, with the first element being a Composition
+                and the second element being the number of points awarded that Composition interpretation.
         """
         m_dict = m_dict or {}
 
@@ -1126,7 +1167,7 @@ def reduce_formula(sym_amt, iupac_ordering: bool = False) -> Tuple[str, float]:
             Table VI of "Nomenclature of Inorganic Chemistry (IUPAC
             Recommendations 2005)". This ordering effectively follows
             the groups and rows of the periodic table, except the
-            Lanthanides, Actanides and hydrogen. Note that polyanions
+            Lanthanides, Actinides and hydrogen. Note that polyanions
             will still be determined based on the true electronegativity of
             the elements.
 
@@ -1168,10 +1209,8 @@ def reduce_formula(sym_amt, iupac_ordering: bool = False) -> Tuple[str, float]:
 
 class ChemicalPotential(dict, MSONable):
     """
-    Class to represent set of chemical potentials. Can be:
-    multiplied/divided by a Number
-    multiplied by a Composition (returns an energy)
-    added/subtracted with other ChemicalPotentials.
+    Class to represent set of chemical potentials. Can be: multiplied/divided by a Number
+    multiplied by a Composition (returns an energy) added/subtracted with other ChemicalPotentials.
     """
 
     def __init__(self, *args, **kwargs):