[FLINK-31963][state] Fix rescaling bug in recovery from unaligned checkpoints. #22584

StefanRRichter · 2023-05-15T12:37:48Z

[FLINK-31963]

What is the purpose of the change

This commit fixes problems in StateAssignmentOperation for unaligned checkpoints with stateless operators that have upstream operators with output partition state or downstream operators with input channel state. With this fix, state assignment now considers such upstream/downstream states for rescaling and no longer skips the state reassignment in such cases for otherwise stateless operators.
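
As a rough illustration of the rule described above, here is a minimal sketch. It is not the actual StateAssignmentOperation code: `OperatorInfo` and `canSkipReassignment` are hypothetical names, and the three boolean flags are a simplifying assumption about what the checkpointed operator state exposes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified model of the skip decision; not Flink's real classes. */
class SkipReassignmentSketch {

    /** Stand-in for what a checkpoint knows about one operator (assumed shape). */
    static final class OperatorInfo {
        final boolean hasOwnState;
        final boolean hasOutputPartitionState;
        final boolean hasInputChannelState;

        OperatorInfo(boolean own, boolean output, boolean input) {
            this.hasOwnState = own;
            this.hasOutputPartitionState = output;
            this.hasInputChannelState = input;
        }
    }

    /**
     * Reassignment may only be skipped if the operator is stateless AND no upstream
     * operator has output partition state AND no downstream operator has input
     * channel state.
     */
    static boolean canSkipReassignment(
            OperatorInfo op, List<OperatorInfo> upstream, List<OperatorInfo> downstream) {
        if (op.hasOwnState) {
            return false;
        }
        boolean upstreamHasOutputState =
                upstream.stream().anyMatch(u -> u.hasOutputPartitionState);
        boolean downstreamHasInputState =
                downstream.stream().anyMatch(d -> d.hasInputChannelState);
        return !upstreamHasOutputState && !downstreamHasInputState;
    }

    public static void main(String[] args) {
        OperatorInfo stateless = new OperatorInfo(false, false, false);
        OperatorInfo upstreamWithInFlightData = new OperatorInfo(false, true, false);
        // A stateless operator must still be rescaled if its upstream captured output data.
        System.out.println(
                canSkipReassignment(
                        stateless,
                        Arrays.asList(upstreamWithInFlightData),
                        Collections.emptyList()));
    }
}
```

The point of the sketch is only that "stateless" alone is not a sufficient reason to skip: in-flight data captured by neighbours also has to be redistributed.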

Brief change log

Checking for upstream result partitions and downstream input channel states in StateAssignmentOperation before skipping reassignment.
Unit test
IT test

Verifying this change

This change added tests and can be verified as follows:

Run the added IT tests in UnalignedCheckpointRescaleITCase
Run the added unit tests in StateAssignmentOperationTest

Does this pull request potentially affect one of the following parts:

Dependencies (does it add or upgrade a dependency): (no)
The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
The serializers: (no)
The runtime per-record code paths (performance sensitive): (no)
Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes: rescaling state)
The S3 file system connector: (no)

Documentation

Does this pull request introduce a new feature? (no)

flinkbot · 2023-05-15T12:45:19Z

CI report:

5fc3c0a Azure: SUCCESS

Bot commands

The @flinkbot bot supports the following commands:

@flinkbot run azure re-run the last Azure build

pnowojski

LGTM % some minor comments. Let's also verify CI results.

pnowojski · 2023-05-15T12:56:29Z

...-runtime/src/test/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperationTest.java

+        if ((upstreamOpState == null && downstreamOpState == null)
+                || (upstreamOpState != null && downstreamOpState != null)) {
+            // Either upstream or downstream state must exist, but not both.
+            return;
+        }


checkArgument(..., "Either upstream or downstream state must exist, but not both")?
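
For illustration, the suggested precondition could look roughly like the sketch below. The local `checkArgument` helper is an assumption standing in for whichever precondition utility the test would import, and `verifyExactlyOne` is a hypothetical name. Note the behavioural difference to the snippet above: the check throws instead of returning silently.

```java
/** Sketch of an "exactly one of two" precondition; helper names are hypothetical. */
class ExactlyOneStateCheck {

    /** Local stand-in for a checkArgument-style precondition helper. */
    static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    /** Passes only if exactly one of the two states is non-null. */
    static void verifyExactlyOne(Object upstreamOpState, Object downstreamOpState) {
        checkArgument(
                (upstreamOpState == null) != (downstreamOpState == null),
                "Either upstream or downstream state must exist, but not both");
    }

    public static void main(String[] args) {
        verifyExactlyOne(new Object(), null); // passes
        verifyExactlyOne(null, new Object()); // passes
        try {
            verifyExactlyOne(null, null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```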

pnowojski · 2023-05-15T13:02:19Z

...-runtime/src/test/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperationTest.java

+        List<TaskStateSnapshot> upstreamRescalingDescriptors =
+                getRescalingDescriptorsFromVertex(upstreamExecutionJobVertex);
+        List<TaskStateSnapshot> downstreamRescalingDescriptors =
+                getRescalingDescriptorsFromVertex(downstreamExecutionJobVertex);


Rename RescalingDescriptors -> TaskStateSnapshots? You are obtaining descriptors from the TaskStateSnapshot in the next step within checkMappings.

pnowojski · 2023-05-15T13:05:30Z

...-runtime/src/test/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperationTest.java

+        checkMappings(
+                upstreamRescalingDescriptors,
+                TaskStateSnapshot::getOutputRescalingDescriptor,
+                expectedUpstreamCount);
+
+        checkMappings(
+                downstreamRescalingDescriptors,
+                TaskStateSnapshot::getInputRescalingDescriptor,
+                expectedDownstreamCount);


nit: instead of passing extractor functions I would accept a little bit of code duplication and replace those calls with:

        checkMappings(
                upstreamTaskStateSnapshots.stream().map(TaskStateSnapshot::getOutputRescalingDescriptor),
                expectedUpstreamCount);
        checkMappings(
                downstreamTaskStateSnapshots.stream().map(TaskStateSnapshot::getInputRescalingDescriptor),
                expectedDownstreamCount);

Hm, but what is this improving?

When reading the code of checkMappings, it's tricky to understand what the extractFn does. But feel free to ignore this comment.
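
A minimal sketch of what the stream-based variant could look like, so the helper no longer needs an extractor argument. `Descriptor` is a hypothetical stand-in for the real rescaling descriptor type, and `countMappings` is an assumed name rather than the helper from this PR.

```java
import java.util.stream.Stream;

/** Sketch of a helper that takes already-extracted descriptors; types are stand-ins. */
class CheckMappingsSketch {

    /** Hypothetical descriptor: old subtask indexes per gate or partition. */
    static final class Descriptor {
        private final int[][] oldSubtaskIndexes;

        Descriptor(int[]... oldSubtaskIndexes) {
            this.oldSubtaskIndexes = oldSubtaskIndexes;
        }

        int[] getOldSubtaskIndexes(int gateOrPartitionIndex) {
            return oldSubtaskIndexes[gateOrPartitionIndex];
        }
    }

    /** Sums the mappings of gate/partition 0 and fails if any descriptor has none. */
    static int countMappings(Stream<Descriptor> descriptors) {
        return descriptors
                .mapToInt(
                        d -> {
                            int len = d.getOldSubtaskIndexes(0).length;
                            if (len == 0) {
                                throw new AssertionError("expected at least one mapping");
                            }
                            return len;
                        })
                .sum();
    }

    public static void main(String[] args) {
        int total =
                countMappings(
                        Stream.of(
                                new Descriptor(new int[] {0, 1}),
                                new Descriptor(new int[] {2})));
        System.out.println("total mappings: " + total);
    }
}
```

The caller decides which descriptor to extract, which keeps the helper itself easy to read at the cost of repeating `stream().map(...)` at each call site.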

pnowojski · 2023-05-15T13:08:06Z

...-runtime/src/test/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperationTest.java

+        Assert.assertEquals(
+                expectedCount,
+                taskStateSnapshots.stream()
+                        .map(extractFun)
+                        .mapToInt(
+                                x -> {
+                                    int len = x.getOldSubtaskIndexes(0).length;
+                                    // Assert that there is a mapping.
+                                    Assert.assertTrue(len > 0);
+                                    return len;
+                                })
+                        .sum());


Instead of asserting length of the mappings, should we assert the actual mappings? 🤔

I was thinking about this and decided to keep the test targeted at just checking that a remapping has happened. I'd hope there are already tests that check the correctness of such remappings thoroughly.

pnowojski · 2023-05-15T13:11:31Z

...ests/src/test/java/org/apache/flink/test/checkpointing/UnalignedCheckpointRescaleITCase.java

+    @Parameterized.Parameters(
+            name = "{0} {1} from {2} to {3}, sourceSleepMs = {4}, buffersPerChannel = {5}")


I would add a comment above this line explaining why we sometimes want a non-zero sourceSleepMs: we want to test rescaling without backpressure, where only a couple of in-flight records are occasionally captured.

flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java

...-runtime/src/test/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperationTest.java

rkhachatryan

LGTM

rkhachatryan · 2023-05-16T09:59:09Z

...ests/src/test/java/org/apache/flink/test/checkpointing/UnalignedCheckpointRescaleITCase.java

+        // We use `sourceSleepMs` > 0 to test rescaling without backpressure and only very few
+        // captured in-flight records, see FLINK-31963.
        Object[][] parameters =
                new Object[][] {
-                    new Object[] {"downscale", Topology.KEYED_DIFFERENT_PARALLELISM, 12, 7},
-                    new Object[] {"upscale", Topology.KEYED_DIFFERENT_PARALLELISM, 7, 12},
-                    new Object[] {"downscale", Topology.KEYED_BROADCAST, 7, 2},
-                    new Object[] {"upscale", Topology.KEYED_BROADCAST, 2, 7},
-                    new Object[] {"downscale", Topology.BROADCAST, 5, 2},
-                    new Object[] {"upscale", Topology.BROADCAST, 2, 5},
-                    new Object[] {"upscale", Topology.PIPELINE, 1, 2},
-                    new Object[] {"upscale", Topology.PIPELINE, 2, 3},
-                    new Object[] {"upscale", Topology.PIPELINE, 3, 7},
-                    new Object[] {"upscale", Topology.PIPELINE, 4, 8},
-                    new Object[] {"upscale", Topology.PIPELINE, 20, 21},
-                    new Object[] {"downscale", Topology.PIPELINE, 2, 1},
-                    new Object[] {"downscale", Topology.PIPELINE, 3, 2},
-                    new Object[] {"downscale", Topology.PIPELINE, 7, 3},
-                    new Object[] {"downscale", Topology.PIPELINE, 8, 4},
-                    new Object[] {"downscale", Topology.PIPELINE, 21, 20},
-                    new Object[] {"no scale", Topology.PIPELINE, 1, 1},
-                    new Object[] {"no scale", Topology.PIPELINE, 3, 3},
-                    new Object[] {"no scale", Topology.PIPELINE, 7, 7},
-                    new Object[] {"no scale", Topology.PIPELINE, 20, 20},
-                    new Object[] {"upscale", Topology.UNION, 1, 2},
-                    new Object[] {"upscale", Topology.UNION, 2, 3},
-                    new Object[] {"upscale", Topology.UNION, 3, 7},
-                    new Object[] {"downscale", Topology.UNION, 2, 1},
-                    new Object[] {"downscale", Topology.UNION, 3, 2},
-                    new Object[] {"downscale", Topology.UNION, 7, 3},
-                    new Object[] {"no scale", Topology.UNION, 1, 1},
-                    new Object[] {"no scale", Topology.UNION, 7, 7},
-                    new Object[] {"upscale", Topology.MULTI_INPUT, 1, 2},
-                    new Object[] {"upscale", Topology.MULTI_INPUT, 2, 3},
-                    new Object[] {"upscale", Topology.MULTI_INPUT, 3, 7},
-                    new Object[] {"downscale", Topology.MULTI_INPUT, 2, 1},
-                    new Object[] {"downscale", Topology.MULTI_INPUT, 3, 2},
-                    new Object[] {"downscale", Topology.MULTI_INPUT, 7, 3},
-                    new Object[] {"no scale", Topology.MULTI_INPUT, 1, 1},
-                    new Object[] {"no scale", Topology.MULTI_INPUT, 7, 7},
+                    new Object[] {"downscale", Topology.KEYED_DIFFERENT_PARALLELISM, 12, 7, 0L},


nit: I'd consider combining a limited set of manually crafted cases with randomly generated ones (different on each run).
That would increase the coverage a bit given that there are a lot of runs on the CI.
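
One possible shape for this, as a sketch only: the topology is represented by a plain string instead of the test's `Topology` enum, and `parameters(seed, randomCases)` plus the upper bound of 20 for the parallelism are assumptions made for illustration. Passing the seed in explicitly (and logging it in a real test) would keep a failing random case reproducible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Sketch: a few hand-picked rescale cases plus randomly generated ones. */
class RescaleParameterSketch {

    static List<Object[]> parameters(long seed, int randomCases) {
        List<Object[]> cases = new ArrayList<>();
        // Manually crafted cases that should always run.
        cases.add(new Object[] {"downscale", "PIPELINE", 7, 3});
        cases.add(new Object[] {"upscale", "PIPELINE", 3, 7});
        cases.add(new Object[] {"no scale", "PIPELINE", 3, 3});

        // Randomly generated cases; a different seed gives different coverage per run.
        Random random = new Random(seed);
        for (int i = 0; i < randomCases; i++) {
            int oldParallelism = 1 + random.nextInt(20);
            int newParallelism = 1 + random.nextInt(20);
            cases.add(
                    new Object[] {
                        label(oldParallelism, newParallelism),
                        "PIPELINE",
                        oldParallelism,
                        newParallelism
                    });
        }
        return cases;
    }

    /** Derives the case label from the two parallelism values. */
    static String label(int oldParallelism, int newParallelism) {
        if (newParallelism > oldParallelism) {
            return "upscale";
        }
        return newParallelism < oldParallelism ? "downscale" : "no scale";
    }

    public static void main(String[] args) {
        long seed = System.nanoTime();
        System.out.println("seed = " + seed);
        for (Object[] c : parameters(seed, 5)) {
            System.out.println(c[0] + " " + c[1] + " from " + c[2] + " to " + c[3]);
        }
    }
}
```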

…ckpoints. (apache#22584) This commit fixes problems in StateAssignmentOperation for unaligned checkpoints with stateless operators that have upstream operators with output partition state or downstream operators with input channel state. (cherry picked from commit 354c0f4)

…ckpoints. (#22584) (#22595) This commit fixes problems in StateAssignmentOperation for unaligned checkpoints with stateless operators that have upstream operators with output partition state or downstream operators with input channel state. (cherry picked from commit 354c0f4)

…ckpoints. (#22584) (#22594) This commit fixes problems in StateAssignmentOperation for unaligned checkpoints with stateless operators that have upstream operators with output partition state or downstream operators with input channel state. (cherry picked from commit 354c0f4)

…ckpoints. (apache#22584) (apache#22594) This commit fixes problems in StateAssignmentOperation for unaligned checkpoints with stateless operators that have upstream operators with output partition state or downstream operators with input channel state. (cherry picked from commit 354c0f4)

[FLINK-31963] Fix rescaling bug in recovery from unaligned checkpoints.

ea90d57

This commit fixes problems in StateAssignmentOperation for unaligned checkpoints with stateless operators that have upstream operators with output partition state or downstream operators with input channel state.

StefanRRichter requested review from rkhachatryan, akalash, dawidwys and pnowojski May 15, 2023 12:37

pnowojski reviewed May 15, 2023

Piotr review comments.

419f0db

rkhachatryan reviewed May 15, 2023

flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java

...-runtime/src/test/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperationTest.java

Review comments Roman.

5fc3c0a

rkhachatryan approved these changes May 16, 2023

StefanRRichter merged commit 354c0f4 into apache:master May 16, 2023

StefanRRichter mentioned this pull request May 16, 2023

[FLINK-31963][state] Fix rescaling bug in recovery from unaligned checkpoints. #22594

Merged

StefanRRichter mentioned this pull request May 16, 2023

[FLINK-31963][state] Fix rescaling bug in recovery from unaligned checkpoints. #22595

Merged

flinkbot added the component=Runtime/Checkpointing label Apr 4, 2024
