Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(weave_query): Handle stale file handle os errors on table load #3401

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

domphan-wandb
Copy link
Contributor

@domphan-wandb domphan-wandb commented Jan 14, 2025

Description

Fixes the panel crash reported in WB-22355

The panel crash happens due to an unhandled 'stale file handle' error when query engine is executing file-table. See screenshot below from this trace.

This panel crash happens very rarely as it took almost a week for repro & trace capture. We're unsure to what causes this, but we're handling the error by returning a None value for the file-table op which is consistent with how we handle pulling files from run summary.

image

Testing

Using the invoker.ini dev environment.

I simulated a stale file handle error by raising an OSError exception with errno.ESTALE in the opening of the json table and confirmed that the panel on the frontend no longer crashes.

@domphan-wandb domphan-wandb requested a review from a team as a code owner January 14, 2025 18:10
@circle-job-mirror
Copy link

circle-job-mirror bot commented Jan 14, 2025

Comment on lines +864 to +865
if e.errno == errno.ESTALE:
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll want to re-raise the exception if it doesn't fall in this case. Something like:

Suggested change
if e.errno == errno.ESTALE:
return None
if e.errno == errno.ESTALE:
return None
else:
raise e

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh shoot, good catch, thanks @nicholaspun-wandb !

@domphan-wandb domphan-wandb force-pushed the dom/stale-file-handle-error branch from 14ae71e to ff53a8e Compare January 14, 2025 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants