In the logs, one can observe that we run multiple InvocationTasks for interpreter.CommandInterpreter-FwoTCgo1MzQzOTQ3MzA1EAMYDiCIJxAi-0188e298f31e72eaa312d51064b4ce8d, which is very strange. One instance of this invocation is executed for partition leader epoch (9, 1) and the other for (8, 1). Once the second task tries to append a GetStateEntry for journal index 2, the runtime panics with:
thread 'restate' panicked at 'assertion failed: `(left == right)`
left: `2`,
right: `3`: Expect to receive next journal entry for interpreter.CommandInterpreter-FwoTCgo1MzQzOTQ3MzA1EAMYDiCIJxAi-0188e298f31e72eaa312d51064b4ce8d', /restate/src/worker/src/partition/state_machine/mod.rs:434:9
stack backtrace:
So it looks as if those two InvocationTasks, even though they are executed for different partition leader epochs (PartitionInvocationStateMachineCoordinators), produce into the same partition processor.
A slightly related question: why do we run multiple InvocationTasks for the same sid?
I think this is caused by the network using a different partition table than the one the partition processors believe they are responsible for. Because of this mismatch, it can happen at recovery time that the same invocation is processed by two partition processors (one that receives the message via the shuffle/network and another that retrieves it from RocksDB).
To fix this problem, we need to update the network to use the proper partitioning table. Additionally, we should add a check in the partition processor which verifies that it processes messages for the right partition range.
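The proposed check could look roughly like the sketch below. This is only an illustration of the idea, not the runtime's actual code: `PartitionKey`, `PartitionProcessor`, `OutOfRange`, and `check_ownership` are hypothetical names, and it assumes each partition processor knows the inclusive key range it is responsible for.

```rust
use std::ops::RangeInclusive;

/// Hypothetical partition key type; the real runtime may use a different one.
type PartitionKey = u64;

/// Hypothetical error describing a message that was routed to the wrong processor.
#[derive(Debug, PartialEq, Eq)]
struct OutOfRange {
    key: PartitionKey,
    start: PartitionKey,
    end: PartitionKey,
}

/// Hypothetical stand-in for the part of a partition processor that owns a key range.
struct PartitionProcessor {
    key_range: RangeInclusive<PartitionKey>,
}

impl PartitionProcessor {
    fn new(key_range: RangeInclusive<PartitionKey>) -> Self {
        Self { key_range }
    }

    /// Returns an error if the partition key lies outside the range this
    /// processor owns, so a misrouted message is rejected before it is applied.
    fn check_ownership(&self, key: PartitionKey) -> Result<(), OutOfRange> {
        if self.key_range.contains(&key) {
            Ok(())
        } else {
            Err(OutOfRange {
                key,
                start: *self.key_range.start(),
                end: *self.key_range.end(),
            })
        }
    }
}

fn main() {
    // Assumed example range; real ranges would come from the shared partition table.
    let processor = PartitionProcessor::new(0..=99);
    assert!(processor.check_ownership(42).is_ok());
    assert!(processor.check_ownership(100).is_err());
}
```

With a check like this in place, a disagreement between the network's partition table and the processor's own view would surface as an explicit error at the receiving processor, instead of as a journal index assertion failure later on.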
https://github.com/restatedev/restate/actions/runs/5343947305/jobs/9688317378#step:14:2093
container-logs (2).zip