Bump apache-beam from 2.56.0 to 2.57.0 #33

Merged (1 commit) on Jun 27, 2024

Conversation

@dependabot dependabot bot commented on behalf of github Jun 27, 2024

Bumps apache-beam from 2.56.0 to 2.57.0.

Release notes

Sourced from apache-beam's releases.

Beam 2.57.0 Release

We are happy to present the new 2.57.0 release of Beam. This release includes both improvements and new functionality. See the download page for this release.

For more information on changes in 2.57.0, check out the detailed release notes.

Highlights

  • Apache Beam adds Python 3.12 support (#29149).
  • Added FlinkRunner for Flink 1.18 (#30789).

I/Os

  • Ensure that BigtableIO closes the reader streams (#31477).

New Features / Improvements

  • Added Feast feature store handler for enrichment transform (Python) (#30957).
  • BigQuery per-worker metrics are reported by default for Streaming Dataflow Jobs (Java) (#31015).
  • Adds inMemory() variant of Java List and Map side inputs for more efficient lookups when the entire side input fits into memory.
  • Beam YAML now supports the jinja templating syntax. Template variables can be passed with the (json-formatted) --jinja_variables flag.
  • DataFrame API now supports pandas 2.1.x and adds 12 more string functions for Series (#31185).
  • Added BigQuery handler for enrichment transform (Python) (#31295).
  • Disable soft delete policy when creating the default bucket for a project (Java) (#31324).
  • Added DoFn.SetupContextParam and DoFn.BundleContextParam, which can be used as a Python DoFn.process, Map, or FlatMap parameter to invoke a context manager per DoFn setup or bundle (analogous to using setup/teardown or start_bundle/finish_bundle, respectively).
  • Go SDK Prism Runner
    • Pre-built Prism binaries are now part of the release and are available via the GitHub release page (#29697).
    • Some pipelines will work on Java and Python, but this is in part to prepare for real runner wrappers in 2.58.0
    • ProcessingTime is now handled synthetically with TestStream pipelines and Non-TestStream pipelines, for fast test pipeline execution by default. (#30083).
      • Prism does NOT yet support "real time" execution for this release.
  • Improve processing for large elements to reduce the chances for exceeding 2GB protobuf limits (Python) (#31607).
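
The Beam YAML item above says template variables are passed as a JSON-formatted value to the --jinja_variables flag. A minimal sketch of building that argument with the Python standard library might look like the following. The helper name jinja_variables_flag and the variable names input_path and min_count are hypothetical, chosen only for illustration; only the flag name comes from the release notes.

```python
import json
import shlex


def jinja_variables_flag(variables: dict) -> str:
    """Return a --jinja_variables argument carrying `variables` as JSON.

    Hypothetical helper: only the flag name is taken from the release notes.
    """
    # sort_keys keeps the generated argument stable between runs.
    return "--jinja_variables=" + json.dumps(variables, sort_keys=True)


# Hypothetical template variables for a templated pipeline file.
flag = jinja_variables_flag(
    {"input_path": "gs://my-bucket/input.csv", "min_count": 3}
)
print(flag)

# shlex.quote wraps the argument so it can be pasted into a POSIX shell
# without the shell interpreting the braces and double quotes.
print(shlex.quote(flag))
```

Generating the JSON with json.dumps rather than by hand avoids quoting mistakes when a value contains quotes or backslashes.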

Breaking Changes

  • Java's View.asList() side inputs are now optimized for iterating rather than indexing when in the global window. This new implementation still supports all (immutable) List methods as before, but some of the random access methods like get() and size() will be slower. To use the old implementation one can use View.asList().withRandomAccess().
  • SchemaTransforms implemented with TypedSchemaTransformProvider now produce a configuration Schema with snake_case naming convention (#31374). This will make the following cases problematic:
    • Running a pre-2.57.0 remote SDK pipeline containing a 2.57.0+ Java SchemaTransform, and vice versa:

... (truncated)

Changelog

Sourced from apache-beam's changelog.

[2.57.0] - 2024-06-26

Highlights

  • Apache Beam adds Python 3.12 support (#29149).
  • Added FlinkRunner for Flink 1.18 (#30789).

I/Os

  • Ensure that BigtableIO closes the reader streams (#31477).

New Features / Improvements

  • Added Feast feature store handler for enrichment transform (Python) (#30957).
  • BigQuery per-worker metrics are reported by default for Streaming Dataflow Jobs (Java) (#31015).
  • Adds inMemory() variant of Java List and Map side inputs for more efficient lookups when the entire side input fits into memory.
  • Beam YAML now supports the jinja templating syntax. Template variables can be passed with the (json-formatted) --jinja_variables flag.
  • DataFrame API now supports pandas 2.1.x and adds 12 more string functions for Series (#31185).
  • Added BigQuery handler for enrichment transform (Python) (#31295).
  • Disable soft delete policy when creating the default bucket for a project (Java) (#31324).
  • Added DoFn.SetupContextParam and DoFn.BundleContextParam, which can be used as a Python DoFn.process, Map, or FlatMap parameter to invoke a context manager per DoFn setup or bundle (analogous to using setup/teardown or start_bundle/finish_bundle, respectively).
  • Go SDK Prism Runner
    • Pre-built Prism binaries are now part of the release and are available via the GitHub release page (#29697).
    • ProcessingTime is now handled synthetically with TestStream pipelines and Non-TestStream pipelines, for fast test pipeline execution by default. (#30083).
      • Prism does NOT yet support "real time" execution for this release.
  • Improve processing for large elements to reduce the chances for exceeding 2GB protobuf limits (Python) (#31607).
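
The changelog describes DoFn.SetupContextParam and DoFn.BundleContextParam as analogous to managing a context manager by hand in setup/teardown or start_bundle/finish_bundle. The sketch below illustrates that hand-written pattern in plain Python so the two scopes are easy to compare. It deliberately does not import Beam: ManualLifecycleFn only mimics the DoFn hook names, and run_bundles is a simplified, hypothetical stand-in for a runner, not how any real runner schedules work.

```python
import contextlib

events = []  # records the order in which resources are opened and closed


@contextlib.contextmanager
def resource(label):
    """Hypothetical resource, e.g. a client connection or a write batch."""
    events.append(f"enter {label}")
    try:
        yield label
    finally:
        events.append(f"exit {label}")


class ManualLifecycleFn:
    """Mimics the DoFn lifecycle hooks; does not subclass beam.DoFn."""

    def setup(self):
        # Entered once per instance, closed in teardown.
        self._setup_stack = contextlib.ExitStack()
        self.client = self._setup_stack.enter_context(resource("setup"))

    def start_bundle(self):
        # Entered once per bundle, closed in finish_bundle.
        self._bundle_stack = contextlib.ExitStack()
        self.batch = self._bundle_stack.enter_context(resource("bundle"))

    def process(self, element):
        events.append(f"process {element}")
        return [element * 2]

    def finish_bundle(self):
        self._bundle_stack.close()

    def teardown(self):
        self._setup_stack.close()


def run_bundles(fn, bundles):
    """Simplified stand-in for a runner driving the lifecycle hooks."""
    out = []
    fn.setup()
    try:
        for bundle in bundles:
            fn.start_bundle()
            try:
                for element in bundle:
                    out.extend(fn.process(element))
            finally:
                fn.finish_bundle()
    finally:
        fn.teardown()
    return out


results = run_bundles(ManualLifecycleFn(), [[1, 2], [3]])
print(results)
print(events)
```

In this sketch the setup-scoped resource is opened once and outlives both bundles, while the bundle-scoped resource is opened and closed around each bundle. Per the changelog, the new parameters let a context manager be attached at either scope without writing these hooks explicitly.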

Breaking Changes

  • Java's View.asList() side inputs are now optimized for iterating rather than indexing when in the global window. This new implementation still supports all (immutable) List methods as before, but some of the random access methods like get() and size() will be slower. To use the old implementation one can use View.asList().withRandomAccess().
  • SchemaTransforms implemented with TypedSchemaTransformProvider now produce a configuration Schema with snake_case naming convention (#31374). This will make the following cases problematic:
    • Running a pre-2.57.0 remote SDK pipeline containing a 2.57.0+ Java SchemaTransform, and vice versa:
    • Running a 2.57.0+ remote SDK pipeline containing a pre-2.57.0 Java SchemaTransform
    • All direct uses of Python's SchemaAwareExternalTransform should be updated to use new snake_case parameter names.
  • Upgraded Jackson Databind to 2.15.4 (Java) (#26743). jackson-2.15 has known breaking changes. An important one is that it imposes a buffer limit on the parser. If your custom PTransform or DoFn is affected, refer to #31580 for mitigation.
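
For the snake_case change above, direct uses of Python's SchemaAwareExternalTransform need their keyword arguments renamed. As a rough migration aid, the sketch below converts camelCase keys to snake_case. Both helpers and the parameter names (tableId, autoSharding, project) are hypothetical, and the regular expression is a simple assumption that may not match Beam's own conversion for acronyms or consecutive capitals, so the result should be checked against the transform's actual configuration schema.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase name to snake_case (simple, assumed rule)."""
    # Insert an underscore before an uppercase letter that follows a
    # lowercase letter or digit, then lowercase the whole name.
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def snake_case_kwargs(config: dict) -> dict:
    """Return a copy of `config` with every key converted to snake_case."""
    converted = {}
    for key, value in config.items():
        new_key = camel_to_snake(key)
        if new_key in converted:
            # Two old names mapping to one new name needs a manual decision.
            raise ValueError(f"duplicate parameter after conversion: {new_key}")
        converted[new_key] = value
    return converted


# Hypothetical pre-2.57.0 style keyword arguments.
old_kwargs = {"tableId": "my_table", "autoSharding": True, "project": "my-project"}
new_kwargs = snake_case_kwargs(old_kwargs)
print(new_kwargs)
```

Names that are already lowercase, such as project here, pass through unchanged, so the helper can be applied to a mixed dictionary.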
Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [apache-beam](https://github.com/apache/beam) from 2.56.0 to 2.57.0.
- [Release notes](https://github.com/apache/beam/releases)
- [Changelog](https://github.com/apache/beam/blob/master/CHANGES.md)
- [Commits](apache/beam@v2.56.0...v2.57.0)

---
updated-dependencies:
- dependency-name: apache-beam
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
@dependabot dependabot bot added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update Python code) labels Jun 27, 2024
@liferoad liferoad merged commit 8958145 into main Jun 27, 2024
8 checks passed
@dependabot dependabot bot deleted the dependabot/pip/apache-beam-2.57.0 branch June 27, 2024 13:39