Properly parse energy states and occupations #832

pmrv · 2022-10-22T13:03:15Z

Due to a faulty regex, we previously threw away the signs of both energy states and occupations parsed from sphinx logs.

coveralls · 2022-10-22T13:11:22Z

Pull Request Test Coverage Report for Build 3402601774

3 of 3 (100.0%) changed or added relevant lines in 1 file are covered.
1 unchanged line in 1 file lost coverage.
Overall coverage decreased (-0.007%) to 68.645%

Files with Coverage Reduction	New Missed Lines	%
pyiron_atomistics/lammps/structure.py	1	82.03%

Totals
Change from base Build 3401874913:	-0.007%
Covered Lines:	12085
Relevant Lines:	17605

💛 - Coveralls

niklassiemer · 2022-10-22T13:26:21Z

pyiron_atomistics/sphinx/base.py

@@ -2124,7 +2124,7 @@ def n_steps(self):
    def _parse_band(self, term):
        fa = re.findall(term, self.log_main, re.MULTILINE)
        arr = (
-            np.array(re.sub("[^0-9\. ]", "", "".join(fa)).split())
+            np.array(re.sub("[^-0-9\. ]", "", "".join(fa)).split())


Does this now only work for negative quantities?!

Wouldn't the following allow, but not require, a - sign? (Not tested!)

Suggested change

-            np.array(re.sub("[^-0-9\. ]", "", "".join(fa)).split())
+            np.array(re.sub("[^-?0-9\. ]", "", "".join(fa)).split())

No, this regex matches every character ([]) that is not (^) a -, a digit 0-9, a plain dot (\. where \ is for escaping) or a space, and replaces it by the empty string. This leaves just numbers and spaces, which are then "parsed". Because the regex previously also matched the -, the parsed numbers were effectively only the absolute values.
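To make the difference between the three character classes concrete, here is a minimal sketch. The log line and the helper names (strip_old, strip_new, strip_suggested) are made up for illustration and are not taken from pyiron_atomistics or from a real sphinx log.

```python
import re

def strip_old(text):
    # Old class: '-' is not in the keep-list, so minus signs are deleted too.
    return re.sub(r"[^0-9\. ]", "", text).split()

def strip_new(text):
    # New class: a '-' directly after '^' is a literal hyphen, so it is kept.
    return re.sub(r"[^-0-9\. ]", "", text).split()

def strip_suggested(text):
    # Inside [...] a '?' is a literal question mark, not a quantifier,
    # so this class merely keeps '?' characters in addition to '-'.
    return re.sub(r"[^-?0-9\. ]", "", text).split()

# Hypothetical line; the real sphinx log layout may differ.
line = "eigenvalues: -5.25 -1.5 0.75"

print(strip_old(line))        # signs are lost
print(strip_new(line))        # signs survive, positive values are unaffected
print(strip_suggested(line))  # same as strip_new for this line
```

So the fixed class handles positive and negative numbers alike, while the suggested variant would only add the question mark to the set of characters that are kept.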

Actually, the whole regex business is more complicated than necessary: in the sphinx log the lines have the form "some text: number number number", so removing that prefix would be enough.

Well, in that case it sounds like an easier and more stable solution to replace everything up to a colon (being greedy, the match will extend to the last colon) by the empty string and then parse the numbers. Right now, a single stray - in the string would again flip the sign of the first number...
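The concern about stray hyphens can be illustrated with a small sketch. The labels below are hypothetical and only assume the "some text: number number number" layout described in this thread; the helper names are likewise made up.

```python
import re

def parse_by_char_class(line):
    # Character-class approach: any '-' in the label text survives as well.
    return re.sub(r"[^-0-9\. ]", "", line).split()

def parse_after_colon(line):
    # Greedy '.*:' consumes everything up to and including the LAST colon
    # on the line; the remainder is split and converted to floats.
    return [float(tok) for tok in re.sub(r"^.*:", "", line).split()]

# Hypothetical label containing a hyphen.
hyphen_line = "k-point energies: 1.5 -2.0"
# Hypothetical label containing more than one colon.
two_colon_line = "step: eig: 1.5 -2.0"

print(parse_by_char_class(hyphen_line))   # hyphen from the label leaks through
print(parse_after_colon(hyphen_line))     # only the numbers remain
print(parse_after_colon(two_colon_line))  # greedy match strips both colons
```

Depending on spacing, the leaked hyphen either ends up as a token of its own or attaches to the number that follows it; stripping the prefix avoids both cases.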

The weird double application of regex matching was only to remove a (constant) prefix. This is more elegantly done with capture groups.
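A minimal sketch of the capture-group idea follows. The pattern actually used in the commit is not shown in this thread, so the function parse_band_sketch, the "eig" label and the log excerpt are all assumptions for illustration.

```python
import re

# Hypothetical log excerpt; the real sphinx log layout may differ.
log = """\
some header
eig: -5.25 -1.5 0.75
other text
eig: -5.0 -1.25 1.0
"""

def parse_band_sketch(label, log_text):
    # With exactly one capture group, re.findall returns only the captured
    # text, so the constant prefix never has to be removed in a second pass.
    pattern = rf"^{re.escape(label)}:(.*)$"
    rows = re.findall(pattern, log_text, re.MULTILINE)
    # Each row is "number number number"; float() keeps the sign.
    return [[float(tok) for tok in row.split()] for row in rows]

bands = parse_band_sketch("eig", log)
print(bands)
```

The nested list could be handed to np.array afterwards if an array is needed, as in the original _parse_band.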

Properly parse energy states and occupations

b1d8e64

Due to a faulty regex, we previously threw away the signs of both energy states and occupations parsed from sphinx logs.

pmrv added the bug Something isn't working label Oct 22, 2022

niklassiemer reviewed Oct 22, 2022

pmrv added the integration Start the notebook integration tests for this PR label Oct 24, 2022

Use re capture groups

d939b41

The weird double application of regex matching was only to remove a (constant) prefix. This is more elegantly done with capture groups.

pmrv force-pushed the sphinx_sign branch from 75634e5 to d939b41 Compare November 6, 2022 02:36

pmrv added the format_black reformat the code using the black standard label Nov 6, 2022

Format black

9d968c4

pmrv merged commit cf713bb into master Nov 7, 2022

delete-merged-branch bot deleted the sphinx_sign branch November 7, 2022 09:04
