[improve][ci] Disable test that causes OOME until the problem has been resolved #22586
In one of the heap dumps, there were 251,029 lambdas which all reference a
Using https://github.com/vlsi/mat-calcite-plugin to query the heap dump:
select this['arg$2.completeTopicName'], count(*) from "org.apache.pulsar.broker.resources.NamespaceResources$PartitionedTopicResources$$Lambda$1819+0x00007f08a8b65ee8" group by 1
In another heap dump:
select this['arg$2.completeTopicName'], count(*) from "org.apache.pulsar.broker.resources.NamespaceResources$PartitionedTopicResources$$Lambda$3405+0x00007fae50f7b000" group by 1
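The two queries above count retained lambda instances grouped by the topic name each one captured. A minimal Java sketch of that kind of accumulation follows. It is not Pulsar code: `ListenerLeakSketch`, `TopicListener`, `register` and `countByTopic` are hypothetical names, and the sketch assumes a registry that adds a listener per call and never removes one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ListenerLeakSketch {
    /** Hypothetical listener that captures a topic name, like the lambdas seen in the heap dump. */
    static final class TopicListener {
        final String completeTopicName;

        TopicListener(String completeTopicName) {
            this.completeTopicName = completeTopicName;
        }
    }

    /** Hypothetical registry that only ever grows: nothing removes listeners. */
    private final List<TopicListener> listeners = new ArrayList<>();

    void register(String completeTopicName) {
        listeners.add(new TopicListener(completeTopicName));
    }

    /** In-memory analogue of: select completeTopicName, count(*) ... group by 1 */
    Map<String, Integer> countByTopic() {
        Map<String, Integer> counts = new TreeMap<>();
        for (TopicListener listener : listeners) {
            counts.merge(listener.completeTopicName, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        ListenerLeakSketch registry = new ListenerLeakSketch();
        // Repeated registrations for the same topics pile up because nothing deregisters them.
        for (int i = 0; i < 1000; i++) {
            registry.register("persistent://tenant/ns/topic-" + (i % 2));
        }
        System.out.println(registry.countByTopic());
    }
}
```

A high count for a single topic name in such a grouping suggests that registrations for that topic are repeated without a matching removal.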
There are a few recent replicator-related changes: #21946, #21948 and #22537. @poorbarcode please check if one of the changes is triggering the OOME issue, possibly related to deletion. There are a lot of entries for
Just wondering if the problem is somehow related to namespace deletion with replication enabled:
pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/impl/NamespacesBase.java, lines 216 to 344 in d7d5452
The concurrency issue is explained in #22541 (comment)
The namespace deletion in the test might be the code that triggers the problem:
pulsar/pulsar-broker/src/test/java/org/apache/pulsar/broker/service/ReplicatorSubscriptionTest.java, lines 821 to 824 in e81a20d
@poorbarcode do you have a chance to debug this issue?
There are more problems. Using the heap dump from https://github.com/apache/pulsar/actions/runs/8835173621/attempts/1?pr=22583:
select toString(this['stack.fn.arg$1']), count(*) from java.util.concurrent.CompletableFuture where this['stack.fn'] is not null group by 1 order by 2 desc
select toString(this['result.ex.detailMessage']), count(*) from java.util.concurrent.CompletableFuture where this['result.ex.detailMessage'] is not null group by 1 order by 2 desc
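The first of these two queries selects `CompletableFuture` instances whose `stack` field is non-null, that is, futures that still hold dependent callbacks. As a hedged illustration of why such futures retain memory, the sketch below attaches callbacks to a future that is not completed and reads back `getNumberOfDependents()`. The class name `PendingFutureSketch` and the method `pendingAfterRegistering` are hypothetical and unrelated to the Pulsar code base.

```java
import java.util.concurrent.CompletableFuture;

public class PendingFutureSketch {
    /**
     * Attaches n callbacks to the given (still incomplete) future and returns
     * the number of dependents it reports afterwards.
     */
    static int pendingAfterRegistering(CompletableFuture<String> future, int n) {
        for (int i = 0; i < n; i++) {
            final int id = i;
            // Each callback, and everything it captures, stays reachable from the
            // future until the future completes.
            future.thenApply(value -> value + "-" + id);
        }
        return future.getNumberOfDependents();
    }

    public static void main(String[] args) {
        CompletableFuture<String> pending = new CompletableFuture<>();
        System.out.println("dependents while pending: " + pendingAfterRegistering(pending, 1000));

        // Completing the future (normally or exceptionally) fires and releases the callbacks.
        pending.completeExceptionally(new IllegalStateException("hypothetical failure"));
        System.out.println("dependents after completion: " + pending.getNumberOfDependents());
    }
}
```

In this sketch the dependents are released only once the future completes, so a future that is never completed and remains reachable keeps every registered callback alive.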
Motivation
Unit test group 1 fails often with OOME. (example)
Modifications
The issue is most likely related to #21495 and org.apache.pulsar.broker.service.ReplicatorSubscriptionTest#testWriteMarkerTaskOfReplicateSubscriptions.
Disable the test until the problem has been resolved.
Documentation
doc
doc-required
doc-not-needed
doc-complete