Disable blob level Lineage metrics for FileSystems #32642

Closed
@Abacn Abacn commented Oct 3, 2024

Lineage metrics for FileSystems were introduced in #32090 (2.59.0 for Java) and #32430 (2.60.0 for Python).

Some pipelines read or write millions of files. Reporting lineage for every file produces a very large string set metric, which causes the job status response to exceed an internal size limit. That in turn affects visuals and functionality that rely on metrics (job progress, other user counters, etc.). Symptoms include the progress bar stalling for batch jobs and user counter increments being incomplete or dropped.

Until Beam and/or the backend can handle or guard against a large number of metrics, this PR mitigates the issue by reporting only bucket-level Lineage.
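To illustrate why bucket-level reporting shrinks the metric, here is a minimal Python sketch. It is not Beam's implementation: `bucket_level` is a hypothetical helper, the bucket name is made up, and the sketch assumes URI-style paths of the form `scheme://bucket/key`.

```python
from urllib.parse import urlsplit


def bucket_level(path: str) -> str:
    """Hypothetical helper: reduce gs://bucket/dir/file.txt to gs://bucket."""
    parts = urlsplit(path)
    if not parts.scheme:
        # Assumption: paths without a scheme are left unchanged.
        return path
    return f"{parts.scheme}://{parts.netloc}"


# One million distinct files in a single (made-up) bucket.
paths = [f"gs://example-bucket/output/{i}.txt" for i in range(1_000_000)]

blob_lineage = set(paths)                          # one string per file
bucket_lineage = {bucket_level(p) for p in paths}  # one string per bucket

print(len(blob_lineage), len(bucket_lineage))
```

With blob-level reporting the string set grows with the number of files; with bucket-level reporting it grows only with the number of distinct buckets.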



github-actions bot commented Oct 3, 2024

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

@Abacn Abacn added this to the 2.60.0 Release milestone Oct 3, 2024
@rohitsinha54

Thank you for the prompt changes. They look good to me as an immediate fix.
We should also test this on

  1. really large files with wildcard and
  2. sharded files


@rohitsinha54 rohitsinha54 left a comment

I think one way to test both wildcard and sharded files would be:

a dummy job that writes the numbers 1 to 100k to sharded files, one number per file, named 1.txt, 2.txt, and so on, and then reads them back with a wildcard. WDYT?
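The shape of that test can be sketched locally in plain Python. This is only a stand-in for the proposed job, not a Beam pipeline: it uses a temporary directory and `glob` instead of a FileSystem, the helper names are hypothetical, and the file count is kept small so the sketch runs quickly.

```python
import glob
import os
import tempfile


def write_one_number_per_file(directory: str, count: int) -> None:
    """Write 1..count, one number per file, named 1.txt, 2.txt, ..."""
    for n in range(1, count + 1):
        with open(os.path.join(directory, f"{n}.txt"), "w") as f:
            f.write(f"{n}\n")


def read_with_wildcard(directory: str) -> list[int]:
    """Read every file matching *.txt and parse its single number."""
    values = []
    for path in glob.glob(os.path.join(directory, "*.txt")):
        with open(path) as f:
            values.append(int(f.read().strip()))
    return values


COUNT = 1_000  # the comment proposes 100k; reduced here for speed

with tempfile.TemporaryDirectory() as tmp:
    write_one_number_per_file(tmp, COUNT)
    values = read_with_wildcard(tmp)

# Every number should come back exactly once.
print(len(values), sorted(values) == list(range(1, COUNT + 1)))
```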

Abacn commented Oct 4, 2024

Tested TextIOIT: write then read 100,000 files

  • user counter elements_read is 100,000,000, which means no counter increments were dropped

  • job progress works while the job is running

Dataflow job id: 2024-10-03_20_13_23-15614223084919276642

In comparison (on master):

image

It gets stuck at updating the string set (see also #32649)

job id 2024-10-03_20_13_23-15614223084919276642

@Abacn Abacn mentioned this pull request Oct 4, 2024
Abacn commented Oct 4, 2024

Superseded by #32662

@Abacn Abacn closed this Oct 4, 2024
@Abacn Abacn reopened this Oct 8, 2024
Abacn commented Oct 8, 2024

Since there are ongoing discussions on #32662 and #32650, I propose re-opening this PR to unblock the 2.60.0 release

cc: @damccorm @rohitsinha54

Update: going forward with #32662 for now

@Abacn Abacn marked this pull request as draft October 8, 2024 22:01
@Abacn Abacn closed this Oct 9, 2024