PBP dataset performance enhancements #361
Conversation
Some short questions. I am struggling a bit to give a good review because I am not very familiar with multiprocessing to this extent.
apax/data/input_pipeline.py (outdated)
labels["energy"][i] = lab["energy"] | ||
if self.forces: | ||
labels["forces"][i, : inp["n_atoms"]] = lab["forces"] |
How are other labels treated, e.g. stress?
Oh, I removed the inclusion of stress... Good catch.
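For reference, re-adding the stress labels could look roughly like the sketch below; the `self.stress` flag and the shape of `lab["stress"]` are assumptions for illustration, not the actual apax implementation.

```python
labels["energy"][i] = lab["energy"]
if self.forces:
    labels["forces"][i, : inp["n_atoms"]] = lab["forces"]
# hypothetical: restore stress handling analogously to forces; assumes a
# self.stress flag and a per-structure (3, 3) stress array in the labels
if self.stress:
    labels["stress"][i] = lab["stress"]
```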
LGTM 👍
I have reworked the PBP dataset to enhance performance.
Previously, the buffer would be filled with batches using multiprocessing, run dry, and then be refilled.
Now there is a dedicated processing thread that uses multiprocessing internally.
As a result, the buffer is continuously refilled without blocking the train-step compute.
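Conceptually, the refill logic works like the sketch below; the names (`start_prefetcher`, `process_fn`) and the pool/queue sizes are illustrative, not the actual apax code.

```python
import queue
import threading
from concurrent.futures import ProcessPoolExecutor

def start_prefetcher(raw_batches, process_fn, buffer_size=16, n_workers=4):
    """Keep a bounded buffer of processed batches filled in the background.

    A dedicated thread submits work to a process pool and pushes the results
    into a queue, so the training loop only blocks when the buffer is empty.
    """
    buffer = queue.Queue(maxsize=buffer_size)

    def producer():
        with ProcessPoolExecutor(max_workers=n_workers) as pool:
            for batch in pool.map(process_fn, raw_batches):
                buffer.put(batch)  # blocks only when the buffer is full
        buffer.put(None)  # sentinel: no more batches this epoch

    threading.Thread(target=producer, daemon=True).start()
    return buffer
```

The training loop then pulls batches with `buffer.get()` while the next ones are already being processed.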
On my home PC, I measured a 20 % reduction in epoch time compared to the cached dataset.
On my work laptop it is roughly 30 % slower than the cached dataset, but still much faster than the old PBP dataset (old: 6.3 s, new: 1.3 s).
Apart from this dataset, there is also the CSVLogger class, which uses TensorFlow.
If we wrote our own logger, we could turn TensorFlow into an optional dependency.
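A TensorFlow-free logger could be as small as the sketch below; the class name and column handling are illustrative, not a proposed final API.

```python
import csv

class SimpleCSVLogger:
    """Append one row of metrics per epoch to a CSV file."""

    def __init__(self, path):
        self.path = path
        self.keys = None

    def log(self, metrics: dict):
        write_header = self.keys is None
        if write_header:
            self.keys = list(metrics.keys())
        with open(self.path, "a", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=self.keys)
            if write_header:
                writer.writeheader()
            writer.writerow(metrics)
```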
TODO:
Update:
After rerunning the benchmark on my workstation, the cached dataset is still faster than the PBP, especially in worst-case scenarios (small system size, low batch size, lightweight model).
300-atom system, batch size 4, rmax=6:
- cached: 1.5 s/epoch
- pbp new: 2.5 s/epoch
- pbp old: 4.7 s/epoch

With batch size 8, n_ens=8, rmax=5:
- cached: 5.6 s/epoch
- pbp: 6.7 s/epoch