[Meta][Feature] Implement the memory queue and output pipeline #7
Comments
So what will happen when the source/producer is syslog, or any normal UDP stream for that matter?
The only time the queue should fill is when the output (Elasticsearch/Logstash/Kafka) is unavailable or can't keep up with the data volume. If that situation persists for long enough, eventually there will be data loss. When that point is reached depends on the data volume, the queue size, and the duration of the problem causing the queue to fill.
That sounds exactly right. On our older system (Graylog) we had a 600 GB queue (disk journal) that allowed us to survive a 24h Elasticsearch downtime. When one such queue got ~95% full we declared that node dead and the load balancer in front moved the stream to a second node. My initial question was: how would you announce/backpressure to the source producer when this is syslog or any normal UDP stream?
Closing this as completed, queue+output work will continue with separate issues. |
This is a feature meta issue to implement the memory queue to output pipeline in the shipper. The scope is restricted to implementation of the memory queue and an output with no external dependencies (the console or file output for example). The disk queue, Elasticsearch/Kafka/Logstash outputs, and processors are explicitly out of scope.
This feature is considered complete when at least the following criteria are satisfied:
The assignee of this issue is expected to create the development plan with all child issues for this feature. The following set of tasks should be included in the initial issues at a minimum:
Important milestones: