Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ksqlDB leaks admin client threads for terminated http2 push queries JOIN #6367

Closed
agavra opened this issue Oct 6, 2020 · 3 comments · Fixed by #6532
Closed

ksqlDB leaks admin client threads for terminated http2 push queries JOIN #6367

agavra opened this issue Oct 6, 2020 · 3 comments · Fixed by #6532
Assignees
Labels
bug P0 Denotes must-have for a given milestone
Milestone

Comments

@agavra
Copy link
Contributor

agavra commented Oct 6, 2020

Describe the bug
ksqlDB leaks threads when issued from the command line and then ^C

To Reproduce

  1. Run a ksqlDB server (happens on 0.13)
  2. Create a stream (CREATE STREAM a (id int key, col1 int) with (kafka_topic='a', value_format='json', partitions=1);)
  3. Create another stream (CREATE STREAM c (id int key, col1 int) with (kafka_topic='c', value_format='json', partitions=1);)
  4. Record the number of running threads
jstack `ps ax | grep ksql-server | awk '{print $1}' | head -1` | grep -oE '".*"' | wc -l
  1. Issue a query
curl --http2 -m 240 -d '{"sql":"SELECT * FROM a JOIN c WITHIN 10 second ON a.id = c.id EMIT CHANGES;"}' http://0.0.0.0:8088/query-stream
  1. Exit the query with ctrl+C
  2. Check the number of running threads

Expected behavior
The number of threads should eventually be the same

Actual behaviour
The number of threads is increased by 1

Additional context
It seems that the threads which are being leaked are admin client threads:

~ ❯ jstack `ps ax | grep ksql-server | awk '{print $1}' | head -1` | grep "admin-client" | wc -l                                                                  16:55:34
      61
~ ❯ jstack `ps ax | grep ksql-server | awk '{print $1}' | head -1` | grep "admin-client" | wc -l                                                                  16:56:26
      62

Here is the stack for one such thread:

"kafka-admin-client-thread | adminclient-66" #1198 daemon prio=5 os_prio=31 tid=0x00007f9c75780000 nid=0x28607 runnable [0x000070001a71b000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
        at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
        at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x0000000706797ce0> (a sun.nio.ch.Util$3)
        - locked <0x0000000706797cf0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000706797c90> (a sun.nio.ch.KQueueSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at org.apache.kafka.common.network.Selector.select(Selector.java:869)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:465)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:561)
        at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.processRequests(KafkaAdminClient.java:1319)

It also appears that we don't leak threads for just regular queries (non-join selects)... weird!

@apurvam apurvam added P0 Denotes must-have for a given milestone and removed needs-triage labels Oct 7, 2020
@agavra agavra added this to the 0.14.0 milestone Oct 9, 2020
@stevenpyzhang
Copy link
Member

stevenpyzhang commented Oct 27, 2020

With both the old and new query endpoints, a new admin client thread is created as part of checking the partition number of both source topics for the join. Whenever a request is sent to either endpoint DefaultApiSecurityContext.create(routingContext) a new service context is created.

With the old server endpoint, it looks like we're closing the admin client in a finally block here https://github.com/confluentinc/ksql/blob/master/ksqldb-rest-app/src/main/java/io/confluent/ksql/rest/server/KsqlServerEndpoints.java#L268

I think this is what's missing from the new server endpoint. The admin client is created and then it's never closed on the new endpoint.

@agavra
Copy link
Contributor Author

agavra commented Oct 27, 2020

nice find @stevenpyzhang 🥳

@apurvam
Copy link
Contributor

apurvam commented Oct 27, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug P0 Denotes must-have for a given milestone
Projects
None yet
3 participants