Summary
Currently, the LambdaSink sends all incoming records to AWS Lambda immediately, which causes many small invocations when the batch thresholds are not met. We want the sink to keep a stateful (persistent) buffer so that it flushes only full batches immediately (when thresholds are reached) and holds any partial (incomplete) batch until more events arrive or the sink shuts down. This results in fewer, larger Lambda invocations and avoids flushing partial data prematurely.
The last incomplete batch should stay buffered until one of the following conditions is met:
(a) the batch becomes full, or
(b) the sink shuts down.
Details
Persistent Buffer
A single buffer (persistentBuffer) accumulates events across multiple doOutput() calls.
Only when size or event-count thresholds are reached do we treat that batch as “full” and flush it to Lambda.
The partial batch remains in memory until it either becomes full or the sink shuts down.
If a doOutput() call produces N buffers, the N-1 “full” buffers are flushed immediately.
The Nth (partial) buffer remains in memory until the next doOutput() call or shutdown().
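The buffering behavior described above could be sketched roughly as follows. This is only an illustration, not the actual LambdaSink code: the class name `PersistentBatchBuffer`, the `maxEvents` and `maxBytes` thresholds, the `flusher` callback (standing in for the Lambda invocation), and the use of `String` records are all assumptions. It also assumes that shutdown() flushes whatever partial batch is left.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of the proposed buffering; names and types are assumptions.
class PersistentBatchBuffer {
    private final int maxEvents;                  // assumed event-count threshold
    private final long maxBytes;                  // assumed size threshold in bytes
    private final Consumer<List<String>> flusher; // stands in for the Lambda invocation
    private List<String> persistentBuffer = new ArrayList<>();
    private long bufferedBytes = 0;

    PersistentBatchBuffer(int maxEvents, long maxBytes, Consumer<List<String>> flusher) {
        this.maxEvents = maxEvents;
        this.maxBytes = maxBytes;
        this.flusher = flusher;
    }

    // Mirrors doOutput(): accumulate records and flush each batch once it is full.
    // Whatever is left over stays in persistentBuffer for the next call.
    void doOutput(List<String> records) {
        for (String record : records) {
            persistentBuffer.add(record);
            bufferedBytes += record.getBytes(StandardCharsets.UTF_8).length;
            if (persistentBuffer.size() >= maxEvents || bufferedBytes >= maxBytes) {
                flush();
            }
        }
    }

    // Mirrors shutdown(): flush the remaining partial batch, if any.
    void shutdown() {
        if (!persistentBuffer.isEmpty()) {
            flush();
        }
    }

    // Number of records currently held in the partial batch.
    int pendingCount() {
        return persistentBuffer.size();
    }

    private void flush() {
        List<String> batch = persistentBuffer;
        persistentBuffer = new ArrayList<>();
        bufferedBytes = 0;
        flusher.accept(batch);
    }
}
```

With `maxEvents = 3`, a single doOutput() call carrying seven records would flush two full batches and keep one record buffered until a later call or shutdown().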
Ensure Thread-Safety
When using the persistent buffer, make sure we don't hit race conditions when multiple threads write to the buffer.
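One possible way to guard the buffer is a lock around every mutation, with the full batch swapped out under the lock and the flush performed outside it so that a slow invocation does not block other writers. The sketch below is an assumption about how this might look, not the sink's actual design: `ThreadSafeBatchBuffer`, `maxEvents`, and the `flusher` callback are hypothetical names, and only the event-count threshold is shown for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Hypothetical sketch of a lock-guarded persistent buffer; names are assumptions.
class ThreadSafeBatchBuffer {
    private final ReentrantLock lock = new ReentrantLock();
    private final int maxEvents;                  // assumed event-count threshold
    private final Consumer<List<String>> flusher; // stands in for the Lambda invocation
    private List<String> persistentBuffer = new ArrayList<>();

    ThreadSafeBatchBuffer(int maxEvents, Consumer<List<String>> flusher) {
        this.maxEvents = maxEvents;
        this.flusher = flusher;
    }

    // Safe to call from multiple threads: the buffer is only touched under the lock.
    void add(String record) {
        List<String> fullBatch = null;
        lock.lock();
        try {
            persistentBuffer.add(record);
            if (persistentBuffer.size() >= maxEvents) {
                // Swap in a fresh buffer so the full batch is owned by this thread alone.
                fullBatch = persistentBuffer;
                persistentBuffer = new ArrayList<>();
            }
        } finally {
            lock.unlock();
        }
        if (fullBatch != null) {
            flusher.accept(fullBatch); // flush outside the lock
        }
    }

    // Flushes the remaining partial batch, if any.
    void shutdown() {
        List<String> remaining;
        lock.lock();
        try {
            remaining = persistentBuffer;
            persistentBuffer = new ArrayList<>();
        } finally {
            lock.unlock();
        }
        if (!remaining.isEmpty()) {
            flusher.accept(remaining);
        }
    }
}
```

Because the flush runs outside the lock, the `flusher` itself must tolerate concurrent calls. If strict ordering of batches matters, flushing while holding the lock is the simpler alternative, at the cost of blocking writers during the invocation.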