Clean Up OOM Observer Remote Uploader Download path #3070

Merged: 5 commits, Feb 28, 2024


@j316chuck commented Feb 28, 2024

What does this PR do?

Clean Up OOM Observer Remote Uploader Download Path

What issue(s) does this change relate to?

For this run: oom-callback-bfloat16-2-26-cupid-convert-test-64-r15z1-hUuEG0

The files uploaded to the remote have a malformed file name, e.g. failure.oomrank0.oom_memory_flamegraph, as the logs below show:

2024-02-27 23:53:00,961: rank0[173][MainThread]: INFO: composer.callbacks.oom_observer: Uploading memory visualization to remote: chuck/oom_observer/oom_traces/failure.oomrank0.oom_snapshot.pickle from traces/rank0.oom_snapshot.pickle
2024-02-27 23:53:00,962: rank0[173][MainThread]: INFO: composer.callbacks.oom_observer: Uploading memory visualization to remote: chuck/oom_observer/oom_traces/failure.oomrank0.oom_trace_plot.html from traces/rank0.oom_trace_plot.html
2024-02-27 23:53:00,963: rank0[173][MainThread]: INFO: composer.callbacks.oom_observer: Uploading memory visualization to remote: chuck/oom_observer/oom_traces/failure.oomrank0.oom_segment_plot.html from traces/rank0.oom_segment_plot.html
2024-02-27 23:53:00,964: rank0[173][MainThread]: INFO: composer.callbacks.oom_observer: Uploading memory visualization to remote: chuck/oom_observer/oom_traces/failure.oomrank0.oom_segment_flamegraph.svg from traces/rank0.oom_segment_flamegraph.svg
2024-02-27 23:53:00,965: rank0[173][MainThread]: INFO: composer.callbacks.oom_observer: Uploading memory visualization to remote: chuck/oom_observer/oom_traces/failure.oomrank0.oom_memory_flamegraph.svg from traces/rank0.oom_memory_flamegraph.svg

This PR fixes the remote file path used for these uploads.
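
For illustration, here is a minimal sketch of one way a name like failure.oomrank0.oom_snapshot.pickle can come about, and what a cleaner path might look like. It is an assumption, not taken from this PR's diff: the helper names, the tag argument, and the cleaned-up output format are all hypothetical, and the callback's real logic may differ.

```python
import posixpath

# Hypothetical values chosen to match the remote directory and local file
# shown in the log above; the real callback's variables may differ.
REMOTE_DIR = "chuck/oom_observer/oom_traces"
LOCAL_FILE = "traces/rank0.oom_snapshot.pickle"


def remote_name_concatenated(remote_dir: str, local_file: str, tag: str = "failure.oom") -> str:
    """Glue a tag directly onto the base name with no separator.

    This reproduces the 'failure.oomrank0...' shape seen in the log.
    """
    return posixpath.join(remote_dir, tag + posixpath.basename(local_file))


def remote_name_joined(remote_dir: str, local_file: str) -> str:
    """Join the remote directory and the unchanged base name (assumed clean form)."""
    return posixpath.join(remote_dir, posixpath.basename(local_file))


print(remote_name_concatenated(REMOTE_DIR, LOCAL_FILE))
# chuck/oom_observer/oom_traces/failure.oomrank0.oom_snapshot.pickle
print(remote_name_joined(REMOTE_DIR, LOCAL_FILE))
# chuck/oom_observer/oom_traces/rank0.oom_snapshot.pickle
```

Using posixpath rather than os.path keeps the separator as "/" on every platform, which is what object store keys generally expect.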

Example test run: oom-callback-bfloat16-2-26-cupid-convert-test-64-r15z1-yy0h5A 🟢

@j316chuck requested a review from a team as a code owner February 28, 2024 00:52
@j316chuck requested a review from cli99 February 28, 2024 00:53
@j316chuck enabled auto-merge (squash) February 28, 2024 01:02
@j316chuck disabled auto-merge February 28, 2024 01:25
@j316chuck changed the title from "Fix OOM Observer Remote Uploader Download path" to "Clean Up OOM Observer Remote Uploader Download path" Feb 28, 2024
@j316chuck merged commit 6c833c6 into dev Feb 28, 2024
14 checks passed
@j316chuck deleted the chuck/oom_observer_callback_path_fix branch February 28, 2024 06:34
2 participants