Failed document handler #95534

Open
1 of 8 tasks
felixbarny opened this issue Apr 25, 2023 · 3 comments
Labels
:Data Management/Data streams Data streams and their lifecycles >feature Team:Data Management Meta label for data/management team

Comments

@felixbarny
Member

felixbarny commented Apr 25, 2023

Instead of dropping documents that have failed ingestion due to an exception during pipeline execution or indexing, it should be possible to store the failed document.

High-level options for where to store the failed documents:

  1. In a dedicated data stream
  2. Within the same data stream, in dedicated failure backing indices
  3. Within the same index, only storing the _source and with an indicator that these reflect failed documents
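As an illustration of option 1, here is a minimal client-side sketch in Python of catching an indexing failure and wrapping the original document for storage elsewhere instead of dropping it. Everything named here is an assumption for illustration only: `to_failure_doc`, `ingest`, the `index_fn` callback, and the field layout of the failure document are hypothetical and do not reflect any actual or planned Elasticsearch API.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]


def to_failure_doc(source: Doc, error: Exception, target: str) -> Doc:
    """Wrap a failed document together with the error that caused the failure.

    The field names below are hypothetical, chosen only for this sketch.
    """
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "error": {"type": type(error).__name__, "message": str(error)},
        # Keep the original source untouched so it can be inspected or replayed.
        "document": {"index": target, "source": source},
    }


def ingest(source: Doc, target: str,
           index_fn: Callable[[str, Doc], None],
           failures: List[Doc]) -> bool:
    """Try to index `source`; on any exception, store a failure doc instead.

    `index_fn` stands in for pipeline execution plus indexing, and `failures`
    stands in for the dedicated failure data stream (both are assumptions).
    """
    try:
        index_fn(target, source)
        return True
    except Exception as exc:
        failures.append(to_failure_doc(source, exc, target))
        return False
```

A caller would pass its own indexing function and a sink for failures. The point of the sketch is only that the original source and the error travel together, so a failed document can later be identified and recovered.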

Tasks

@felixbarny felixbarny added the :Data Management/Data streams Data streams and their lifecycles label Apr 25, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Apr 25, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@lachlann562

This would be extremely helpful. We recently discovered that we were losing a huge number of messages because of inconsistent data types, and we couldn't find any easy way to identify these messages or recover the data associated with them. The absence of log records is a huge risk when you rely on ELK as the "source of truth".

@ruflin
Contributor

ruflin commented May 31, 2024

@lachlann562 Glad to hear this will be useful for you. To also expose failures in the UI as soon as the feature lands, we are working on a dataset quality page. A related issue can be found here: elastic/kibana#184572


6 participants