PBP dataset performance enhancements #361
Conversation
Some short questions. I am struggling a bit to give a good review because I am not very familiar with multiprocessing to this extent.
apax/data/input_pipeline.py (outdated)
labels["energy"][i] = lab["energy"] | ||
if self.forces: | ||
labels["forces"][i, : inp["n_atoms"]] = lab["forces"] |
How are other labels treated, e.g. stress?
Oh, I removed the inclusion of stress... Good catch.
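For reference, re-adding the stress labels could look roughly like the sketch below; the `self.stress` flag and the shape of `lab["stress"]` are assumptions for illustration, not the actual apax implementation.

```python
labels["energy"][i] = lab["energy"]
if self.forces:
    labels["forces"][i, : inp["n_atoms"]] = lab["forces"]
# hypothetical: restore stress handling analogously to forces; assumes a
# self.stress flag and a per-structure (3, 3) stress array in the labels
if self.stress:
    labels["stress"][i] = lab["stress"]
```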
LGTM 👍
I have reworked the PBP dataset to enhance performance.
Previously, the buffer would be filled with batches using multiprocessing, run dry, and then be refilled.
Now there is a dedicated processing thread that uses multiprocessing internally.
As a result, the buffer is continuously refilled without blocking the train-step compute.
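Conceptually, the refill logic works like the sketch below; the names (`start_prefetcher`, `process_fn`) and the pool/queue sizes are illustrative, not the actual apax code.

```python
import queue
import threading
from concurrent.futures import ProcessPoolExecutor

def start_prefetcher(raw_batches, process_fn, buffer_size=16, n_workers=4):
    """Keep a bounded buffer of processed batches filled in the background.

    A dedicated thread submits work to a process pool and pushes the results
    into a queue, so the training loop only blocks when the buffer is empty.
    """
    buffer = queue.Queue(maxsize=buffer_size)

    def producer():
        with ProcessPoolExecutor(max_workers=n_workers) as pool:
            for batch in pool.map(process_fn, raw_batches):
                buffer.put(batch)  # blocks only when the buffer is full
        buffer.put(None)  # sentinel: no more batches this epoch

    threading.Thread(target=producer, daemon=True).start()
    return buffer
```

The training loop then pulls batches with `buffer.get()` while the next ones are already being processed.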
On my home PC, I measured a 20 % reduction in epoch time compared to the cached dataset.
On my work laptop it is roughly 30 % slower than the cached dataset, but still much faster than the old PBP dataset (old: 6.3 s, new: 1.3 s).
Apart from this dataset, there is also the CSVLogger class, which uses TensorFlow.
If we wrote our own logger, we could turn TensorFlow into an optional dependency.
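A TensorFlow-free logger could be as small as the sketch below; the class name and column handling are illustrative, not a proposed final API.

```python
import csv

class SimpleCSVLogger:
    """Append one row of metrics per epoch to a CSV file."""

    def __init__(self, path):
        self.path = path
        self.keys = None

    def log(self, metrics: dict):
        write_header = self.keys is None
        if write_header:
            self.keys = list(metrics.keys())
        with open(self.path, "a", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=self.keys)
            if write_header:
                writer.writeheader()
            writer.writerow(metrics)
```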
TODO:
Update:
After rerunning the benchmark on my workstation, the cached dataset is still faster than the PBP, especially in worst-case scenarios (small system size, low batch size, lightweight model).
300-atom system, batch size 4, rmax=6:
- cached: 1.5 s/epoch
- pbp new: 2.5 s/epoch
- pbp old: 4.7 s/epoch

With batch size 8, n_ens=8, rmax=5:
- cached: 5.6 s/epoch
- pbp: 6.7 s/epoch