Implement error handling by putting the info into _Spark_'s standard error column (String) #86

Open
benedeki opened this issue Mar 15, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

Contributor

benedeki commented Mar 15, 2023

Background

One of the provided ErrorHandling implementations. The title is actually a little misleading: the point is to write the errors into a string column, and the column name should default to the value of spark.sql.columnNameOfCorruptRecord (see Runtime SQL Configuration).

Feature

Write errors into a StringType column by converting each field of an error submit into a string and concatenating them with a delimiter. The column name should/might default to the value of spark.sql.columnNameOfCorruptRecord.

Proposed Solution

Solution Ideas:

  • Configurable error delimiter (delimiter between different errors), default \n
  • Configurable error field delimiter, default ,
  • Enable quoting? Probably yes
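
The ideas above could be sketched roughly as follows. This is a plain Python illustration of the string format only, not the library's implementation and not tied to Spark: the `ErrorSubmit` record, its field names, and the functions `quote_field` and `format_errors` are all hypothetical, and the CSV-style quoting (wrap in double quotes, double any embedded quote) is just one possible choice.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ErrorSubmit:
    # Hypothetical error record; the field names are assumptions for illustration.
    error_type: str
    error_code: int
    message: str
    column: Optional[str] = None


def quote_field(value: str, field_delimiter: str, error_delimiter: str) -> str:
    # Quote only when the value would otherwise be ambiguous to parse back.
    needs_quoting = (
        field_delimiter in value or error_delimiter in value or '"' in value
    )
    if not needs_quoting:
        return value
    return '"' + value.replace('"', '""') + '"'


def format_errors(
    errors: Sequence[ErrorSubmit],
    error_delimiter: str = "\n",
    field_delimiter: str = ",",
    quoting: bool = True,
) -> Optional[str]:
    # No errors: return None so the error column would stay null.
    if not errors:
        return None
    rendered = []
    for err in errors:
        fields = [err.error_type, str(err.error_code), err.message, err.column or ""]
        if quoting:
            fields = [quote_field(f, field_delimiter, error_delimiter) for f in fields]
        rendered.append(field_delimiter.join(fields))
    return error_delimiter.join(rendered)


errors = [
    ErrorSubmit("Cast", 1001, "Value 'x', not a number", "amount"),
    ErrorSubmit("Null", 1002, "Null not allowed"),
]
print(format_errors(errors))
```

With quoting enabled, a message that itself contains the field delimiter stays in one piece, which is the main argument for the "probably yes" on quoting. Without it, a reader of the column cannot reliably split the string back into fields.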
@benedeki benedeki added the enhancement New feature or request label Mar 15, 2023
@benedeki benedeki moved this from 🆕 To groom to 📋 Backlog in CPS small repos project Apr 28, 2023
Projects
Status: 📋 Backlog