Alter OncePerSparkSession to be instantiatable even without provided SparkSession #82

Closed
benedeki opened this issue Mar 2, 2023 · 1 comment
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@benedeki
Contributor

benedeki commented Mar 2, 2023

Background

Currently, OncePerSparkSession can be instantiated only when a SparkSession is provided. This turns out to be limiting for classes that need to be created before Spark is available.

Feature

  • add a new constructor that takes no parameters
  • keep the old constructor, including its auto-registering step
  • make the register method public, while keeping the guarantee that the actual registration happens only once per SparkSession

It is open for consideration whether to make the SparkSession parameter of the old constructor explicit, to avoid confusion about when the (auto-)registration happens and when it does not.
On the other hand, that would be a breaking change for all current users.

@benedeki benedeki added enhancement New feature or request good first issue Good for newcomers labels Mar 2, 2023
@benedeki
Contributor Author

benedeki commented Mar 8, 2023

Bonus
Please change the signature of the register method so that it returns true if the actual registration happened in this call, or false if it already happened in an earlier call.
(I think it's such a small change that there is no point in creating a dedicated ticket for it.)
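
For illustration, the requested shape (no-argument constructor, retained auto-registering constructor, public register returning a Boolean) could look roughly like the sketch below. This is not the actual spark-commons implementation: `OncePerSessionSketch`, `SessionStub`, `registerBody` and `CountingLibrary` are hypothetical names, and `SessionStub` stands in for SparkSession only so that the sketch compiles without Spark on the classpath. Keying the registry by class name and session is likewise an assumption.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.mutable

// Placeholder for org.apache.spark.sql.SparkSession (assumption, see lead-in).
final class SessionStub(val appName: String)

object OncePerSessionSketch {
  // (registering class name, session) pairs that were already registered.
  private val registered = mutable.Set.empty[(String, SessionStub)]

  // Returns true only when the pair was not present before this call.
  private def markRegistered(key: String, session: SessionStub): Boolean =
    synchronized {
      registered.add((key, session))
    }
}

abstract class OncePerSessionSketch {
  // The primary constructor takes no parameters and registers nothing.

  // Old constructor shape: still registers automatically on creation.
  def this(session: SessionStub) = {
    this()
    register(session) // result intentionally ignored here
    ()
  }

  // Public entry point: true if the registration ran in this call,
  // false if it already ran earlier for the same class and session.
  final def register(session: SessionStub): Boolean = {
    val first = OncePerSessionSketch.markRegistered(getClass.getName, session)
    if (first) registerBody(session)
    first
  }

  // The actual per-session work, supplied by the concrete library class.
  protected def registerBody(session: SessionStub): Unit
}

// Hypothetical library that counts how often its registration body runs.
class CountingLibrary(counter: AtomicInteger) extends OncePerSessionSketch {
  protected def registerBody(session: SessionStub): Unit = {
    counter.incrementAndGet()
    ()
  }
}
```

With this shape, a class can be created before any session exists and call register later, once Spark is available. The Boolean result tells the caller whether that particular call did the work.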

TebaleloS added a commit that referenced this issue Mar 10, 2023
TebaleloS added a commit that referenced this issue Mar 13, 2023
TebaleloS added a commit that referenced this issue Mar 15, 2023
TebaleloS added a commit that referenced this issue Mar 15, 2023
TebaleloS added a commit that referenced this issue Mar 15, 2023
TebaleloS added a commit that referenced this issue Mar 16, 2023
TebaleloS added a commit that referenced this issue Mar 16, 2023
TebaleloS added a commit that referenced this issue Mar 17, 2023
TebaleloS added a commit that referenced this issue Mar 28, 2023
TebaleloS added a commit that referenced this issue Mar 29, 2023
@benedeki benedeki moved this from 🆕 To groom to ✅ Done in CPS small repos project Apr 27, 2023
TebaleloS added a commit that referenced this issue May 8, 2023
#93 (#94)

* #83: Create a Spike for error handling
* new functions `null_col` and `call_udf`
* `ErrorMessage` refactoring
* `ErrorHandling` trait designed to serve as the interface for different implementations
* Implement error handling by putting the info into column of `ErrorMessage` array
* numerous support classes

* * UT fix
* headers fix

* * Work in progress

* * Relatively big overwrite to use `map` instead of errCol and sequence of raw values
* `ErrorMessageSubmitJustErrorValue` class created to offer the ability to submit errors without source column but with error value

* * Forgotten `register` function call

* * line ends improved

* * ErrorMessageSubmits moved to submits sub-package
* `ErrorMessageSubmitWithoutColumn` changed from `case class` to `class` to allow inheritance
* some PR comments addressed

* * Added UTs for `ColumnOrValue`
* Fixed few minor things discovered by the UTs

* Fixes #82 - Added logic to filter out rows with errors

* Fixes #82

* Fixes #82 - created class

* Fixes #93

* Fixes #93

* Fixes #93

* Fixes #93 - Changed the logic for evaluate method

* Fixes #93 - Added scala documentation for ErrorHandlingFilterRowsWithErrors

* Fixes #93 - added ErrorHandlingFilterRowsWithErrors test file

* Fixes #93

* Fixes #93 - Object test progress.

* Fixes #93 - Object test progress

* Fixes #93 - Object test progress

* Fixes #93 - Object test progress

* * Forgotten `register` function call

* * line ends improved

* * ErrorMessageSubmits moved to submits sub-package
* `ErrorMessageSubmitWithoutColumn` changed from `case class` to `class` to allow inheritance
* some PR comments addressed

* * Added UTs for `ColumnOrValue`
* Fixed few minor things discovered by the UTs

* * Work in progress

* Fixes #93 - merged with #83

* Fixes #93 - merged with #83

* Merge branch 'feature/83-create-a-spike-for-error-handling' into feature/93-Implement-error-handling-that-will-filter-the-rows-that-have-any-error

* Unit test, work in progress

* Unit test, work in progress

* Fixes #93 - Refactored `doTheColumnAggregation` method

* #93 - Unit test work in progress

* * changed `ErrorMessageArrayTest` to actual test suite

* Update spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandling/implementations/ErrorHandlingFilterRowsWithErrors.scala

Co-authored-by: David Benedeki <[email protected]>

* Update spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandling/implementations/ErrorHandlingFilterRowsWithErrors.scala

Co-authored-by: David Benedeki <[email protected]>

* Update spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandling/implementations/ErrorHandlingFilterRowsWithErrors.scala

Co-authored-by: David Benedeki <[email protected]>

* Pull changes

* #93 - Unit test work in progress

* Update spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandling/implementations/ErrorHandlingFilterRowsWithErrors.scala

Co-authored-by: David Benedeki <[email protected]>

* Update spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandling/implementations/ErrorHandlingFilterRowsWithErrors.scala

Co-authored-by: David Benedeki <[email protected]>

* #93 - Unit test work in progress

* #93 - Unit test work in progress

* #93 - Unit test work in progress

* Update spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandling/implementations/ErrorHandlingFilterRowsWithErrors.scala

Co-authored-by: David Benedeki <[email protected]>

* Update spark-commons/src/main/scala/za/co/absa/spark/commons/errorhandling/implementations/ErrorHandlingFilterRowsWithErrors.scala

Co-authored-by: David Benedeki <[email protected]>

* #93 - Unit test work in progress

* #93 - Unit test work in progress

* Fixes #93 conflicts

* Fixes #93

* Fixes #93

* Fixes #93

* Closes #93

---------

Co-authored-by: David Benedeki <[email protected]>