Fix no poke #6448

Merged
2 commits merged into master from fix-no-poke
Feb 21, 2024

@greenape (Member) commented Feb 8, 2024

Closes #5763

I have:

  • Formatted any Python files with black
  • Brought the branch up to date with master
  • Added any relevant GitHub labels
  • Added tests for any new additions
  • Added or updated any relevant documentation
  • Added an Architectural Decision Record (ADR), if appropriate
  • Added an MPLv2 License Header if appropriate
  • Updated the Changelog

Description

It turns out that it isn't possible to identify why a file_fdw table can't be read, which means we can't poke only on missing files. Instead, I've made the foreign table wrapper error if the table can't be read, and added an extra sensor which lets you require that some number of rows be present, rather than just that there's at least a header.
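
As a rough illustration of the row-count check described above, here is a minimal sketch. It is not the sensor added in this PR: the function names are hypothetical, and SQLite stands in for the Postgres staging table so the example is self-contained. The check returns False when too few rows are present (so a sensor would poke again), and lets the database error propagate when the table can't be read at all.

```python
import sqlite3


def build_min_rows_query(table: str, minimum_rows: int) -> str:
    """Build a query that is true when `table` has at least `minimum_rows` rows.

    The inner LIMIT stops the scan early, so a large file is never counted in full.
    `table` is interpolated directly and must be a trusted identifier.
    """
    n = int(minimum_rows)
    return f"SELECT count(*) >= {n} FROM (SELECT 1 FROM {table} LIMIT {n}) AS _"


def n_rows_present(conn, table: str, minimum_rows: int) -> bool:
    """Return True if enough rows are present; raise if the table is unreadable."""
    row = conn.execute(build_min_rows_query(table, minimum_rows)).fetchone()
    return bool(row[0])


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (msisdn TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?)", [("a",), ("b",)])
    print(n_rows_present(conn, "staging", 2))  # enough rows
    print(n_rows_present(conn, "staging", 3))  # too few rows, would poke again
```

In an Airflow sensor, the boolean would be the poke result, while an exception from the query would fail the task rather than trigger another poke.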


@greenape added the bug and FlowETL labels Feb 8, 2024

cypress bot commented Feb 8, 2024

Passing run #21697 ↗︎

0 4 0 0 Flakiness 0

Details:

Add a sensor which checks for a minimum number of rows
Project: FlowAuth Commit: a0eb43265b
Status: Passed Duration: 00:49 💡
Started: Feb 19, 2024 1:02 PM Ended: Feb 19, 2024 1:03 PM

Review all test suite changes for PR #6448 ↗︎


codecov bot commented Feb 8, 2024

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (e0c1fa5) 92.97% compared to head (a0eb432) 92.95%.

Files                                                   Patch %   Lines
...l/flowetl/flowetl/sensors/n_rows_present_sensor.py   0.00%     2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6448      +/-   ##
==========================================
- Coverage   92.97%   92.95%   -0.02%     
==========================================
  Files         263      264       +1     
  Lines       10302    10304       +2     
  Branches      835      835              
==========================================
  Hits         9578     9578              
- Misses        596      598       +2     
  Partials      128      128              


@Thingus (Contributor) left a comment

lgtm

@jc-harrison (Member) left a comment

The approach makes sense, although this still leaves us using the fail/retry mechanism in lieu of a sensor: it will now be the CreateForeignStagingTableOperator task that fails and retries until the file appears, rather than the "wait for data" task. I wonder whether we could add a FileExistsSensor before creating the foreign table, using pg_stat_file or similar?
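
A minimal sketch of the existence check such a sensor might perform, assuming a psycopg2-style DB-API cursor with named parameters. The helper name `file_exists` is hypothetical and not part of FlowETL. It relies on the `missing_ok` argument of `pg_stat_file`, which makes the function return NULL instead of raising when the file is absent.

```python
# Hypothetical helper, not FlowETL code: checks a single server-side path.
# The second argument to pg_stat_file is missing_ok.
STAT_FILE_SQL = "SELECT size FROM pg_stat_file(%(path)s, true)"


def file_exists(cursor, path: str) -> bool:
    """Return True if the Postgres server can stat `path`.

    With missing_ok set, a missing file yields a NULL size rather than an
    error, so the result can be used directly as a sensor's poke value.
    """
    cursor.execute(STAT_FILE_SQL, {"path": path})
    row = cursor.fetchone()
    return row is not None and row[0] is not None
```

An empty file has a size of 0, which still counts as present here; a row-count check would be needed to tell an empty file from a populated one.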

@greenape (Member, Author) commented

Yeah, I think that's possible. When I originally looked at the docs for pg_stat_file I read it as limited to just the pg directories, but on re-reading, that's only the case without superuser, which the executing user here is.

@greenape (Member, Author) commented

Thinking about it, we might be better off with pg_ls_dir, given that in some cases there are multiple files.
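
A sketch of how a `pg_ls_dir`-based check could handle the multiple-file case, again assuming a psycopg2-style cursor. The function name, the glob pattern, and the `minimum_files` parameter are all hypothetical, introduced only for illustration.

```python
from fnmatch import fnmatchcase

# pg_ls_dir(dirname, missing_ok, include_dot_dirs): with missing_ok true,
# a missing directory yields no rows instead of an error.
LIST_DIR_SQL = "SELECT pg_ls_dir(%(directory)s, true, false)"


def files_present(cursor, directory: str, pattern: str, minimum_files: int = 1) -> bool:
    """Return True if at least `minimum_files` names in `directory` match `pattern`.

    `pattern` is a shell-style glob (e.g. "calls_*.csv"), matched case-sensitively
    on the client side against the names the server reports.
    """
    cursor.execute(LIST_DIR_SQL, {"directory": directory})
    names = [row[0] for row in cursor.fetchall()]
    matches = [name for name in names if fnmatchcase(name, pattern)]
    return len(matches) >= minimum_files
```

Matching on the client keeps the SQL trivial; the same filter could instead be expressed in the query with a `LIKE` or regular-expression condition.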

@greenape added the ready-to-merge label Feb 21, 2024
@mergify bot merged commit 45ee3fb into master Feb 21, 2024
40 of 42 checks passed
@mergify bot deleted the fix-no-poke branch February 21, 2024 11:12
Labels: bug, FlowETL, ready-to-merge

Successfully merging this pull request may close these issues.

FlowETL DataPresentSensor fails if file does not exist
3 participants