Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread locked after stress test #5443

Closed
yuhaibohotmail opened this issue Nov 13, 2019 · 15 comments
Closed

Thread locked after stress test #5443

yuhaibohotmail opened this issue Nov 13, 2019 · 15 comments
Labels
kind/bug Something isn't working
Milestone

Comments

@yuhaibohotmail
Copy link

Describe the bug
(Describe the problem clearly and concisely.)

I did a simple stress test,then the program keeps giving alarm log.

21:59:37 WARNING [io.ve.co.im.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-worker-thread-9,5,main]=Thread[vert.x-worker-thread-9,5,main] has been blocked for 3382564 ms, time limit is 60000 ms: io.vertx.core.VertxException: Thread blocked
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:502)
        at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput.awaitWriteable(VertxBlockingOutput.java:111)
        at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput.write(VertxBlockingOutput.java:71)
        at io.quarkus.resteasy.runtime.standalone.VertxHttpResponse.writeBlocking(VertxHttpResponse.java:179)
        at io.quarkus.resteasy.runtime.standalone.VertxOutputStream.write(VertxOutputStream.java:69)
        at io.quarkus.resteasy.runtime.standalone.VertxOutputStream.write(VertxOutputStream.java:42)
        at org.jboss.resteasy.util.CommitHeaderOutputStream.write(CommitHeaderOutputStream.java:64)
        at org.jboss.resteasy.util.DelegatingOutputStream.write(DelegatingOutputStream.java:42)
        at org.jboss.resteasy.plugins.providers.jsonb.JsonBindingProvider.writeTo(JsonBindingProvider.java:137)
        at org.jboss.resteasy.core.interception.jaxrs.AbstractWriterInterceptorContext.writeTo(AbstractWriterInterceptorContext.java:193)
        at org.jboss.resteasy.core.interception.jaxrs.ServerWriterInterceptorContext.writeTo(ServerWriterInterceptorContext.java:64)
        at org.jboss.resteasy.core.interception.jaxrs.AbstractWriterInterceptorContext.proceed(AbstractWriterInterceptorContext.java:155)
        at org.jboss.resteasy.core.ServerResponseWriter.lambda$writeNomapResponse$2(ServerResponseWriter.java:156)
        at org.jboss.resteasy.core.ServerResponseWriter$$Lambda$417/1780592217.run(Unknown Source)
        at org.jboss.resteasy.core.interception.jaxrs.ContainerResponseContextImpl.filter(ContainerResponseContextImpl.java:404)
        at org.jboss.resteasy.core.ServerResponseWriter.executeFilters(ServerResponseWriter.java:232)
        at org.jboss.resteasy.core.ServerResponseWriter.writeNomapResponse(ServerResponseWriter.java:97)
        at org.jboss.resteasy.core.ServerResponseWriter.writeNomapResponse(ServerResponseWriter.java:70)
        at org.jboss.resteasy.core.SynchronousDispatcher.writeResponse(SynchronousDispatcher.java:578)
        at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:508)
        at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:252)
        at org.jboss.resteasy.core.SynchronousDispatcher$$Lambda$332/1061176646.run(Unknown Source)
        at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:153)
        at org.jboss.resteasy.core.SynchronousDispatcher$$Lambda$338/1804746610.get(Unknown Source)
        at org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:363)
        at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:156)
        at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:238)
        at io.quarkus.resteasy.runtime.standalone.RequestDispatcher.service(RequestDispatcher.java:73)
        at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.dispatch(VertxRequestHandler.java:109)
        at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.dispatchRequestContext(VertxRequestHandler.java:84)
        at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.lambda$handle$0(VertxRequestHandler.java:71)
        at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler$$Lambda$313/2064123549.handle(Unknown Source)
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:316)
        at io.vertx.core.impl.ContextImpl$$Lambda$327/865803395.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)

After the test ,All requests are unresponsive .
jstack log:

"mysql-cj-abandoned-connection-cleanup" #54 daemon prio=5 os_prio=0 tid=0x00007ff55c042800 nid=0xfaa in Object.wait() [0x00007ff60e3e8000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
	- locked <0x00000006ca58bd08> (a java.lang.ref.ReferenceQueue$Lock)
	at com.mysql.cj.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:85)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
	- <0x00000006ca58b8f8> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"vert.x-worker-thread-12" #53 prio=5 os_prio=0 tid=0x00007ff58c007000 nid=0xfa9 in Object.wait() [0x00007ff60e8e8000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:502)
	at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput.awaitWriteable(VertxBlockingOutput.java:111)
	at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput.write(VertxBlockingOutput.java:71)
	- locked <0x00000006c8739c58> (a io.vertx.core.http.impl.Http1xServerConnection)
	at io.quarkus.resteasy.runtime.standalone.VertxHttpResponse.writeBlocking(VertxHttpResponse.java:179)
	at io.quarkus.resteasy.runtime.standalone.VertxOutputStream.write(VertxOutputStream.java:69)
	at io.quarkus.resteasy.runtime.standalone.VertxOutputStream.write(VertxOutputStream.java:42)
	at org.jboss.resteasy.util.CommitHeaderOutputStream.write(CommitHeaderOutputStream.java:64)
	at org.jboss.resteasy.util.DelegatingOutputStream.write(DelegatingOutputStream.java:42)
	at org.jboss.resteasy.plugins.providers.jsonb.JsonBindingProvider.writeTo(JsonBindingProvider.java:137)
	at org.jboss.resteasy.core.interception.jaxrs.AbstractWriterInterceptorContext.writeTo(AbstractWriterInterceptorContext.java:193)
	at org.jboss.resteasy.core.interception.jaxrs.ServerWriterInterceptorContext.writeTo(ServerWriterInterceptorContext.java:64)
	at org.jboss.resteasy.core.interception.jaxrs.AbstractWriterInterceptorContext.proceed(AbstractWriterInterceptorContext.java:155)
	at org.jboss.resteasy.core.ServerResponseWriter.lambda$writeNomapResponse$2(ServerResponseWriter.java:156)
	at org.jboss.resteasy.core.ServerResponseWriter$$Lambda$417/1780592217.run(Unknown Source)
	at org.jboss.resteasy.core.interception.jaxrs.ContainerResponseContextImpl.filter(ContainerResponseContextImpl.java:404)
	- locked <0x00000007bea82098> (a org.jboss.resteasy.core.interception.jaxrs.ContainerResponseContextImpl)
	at org.jboss.resteasy.core.ServerResponseWriter.executeFilters(ServerResponseWriter.java:232)
	at org.jboss.resteasy.core.ServerResponseWriter.writeNomapResponse(ServerResponseWriter.java:97)
	at org.jboss.resteasy.core.ServerResponseWriter.writeNomapResponse(ServerResponseWriter.java:70)
	at org.jboss.resteasy.core.SynchronousDispatcher.writeResponse(SynchronousDispatcher.java:578)
	at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:508)
	at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:252)
	at org.jboss.resteasy.core.SynchronousDispatcher$$Lambda$332/1061176646.run(Unknown Source)
	at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:153)
	at org.jboss.resteasy.core.SynchronousDispatcher$$Lambda$338/1804746610.get(Unknown Source)
	at org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:363)
	- locked <0x00000007bea82130> (a org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext)
	at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:156)
	at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:238)
	at io.quarkus.resteasy.runtime.standalone.RequestDispatcher.service(RequestDispatcher.java:73)
	at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.dispatch(VertxRequestHandler.java:109)
	at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.dispatchRequestContext(VertxRequestHandler.java:84)
	at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.lambda$handle$0(VertxRequestHandler.java:71)
	at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler$$Lambda$313/2064123549.handle(Unknown Source)
	at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:316)
	at io.vertx.core.impl.ContextImpl$$Lambda$327/865803395.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
	- <0x00000006c8863b08> (a java.util.concurrent.ThreadPoolExecutor$Worker)

Expected behavior
(Describe the expected behavior clearly and concisely.)

Actual behavior
(Describe the actual behavior clearly and concisely.)

To Reproduce
Steps to reproduce the behavior:

  1. My program was modified from hibernate-orm-panache-resteasy in quickstart. I add a restful service returns all records of the table,there are 800 records in the table.
    @GET
    public List<Student> findAll() {
        List<Student> list = Student.findAll().list();
        if (list == null) {
            throw new WebApplicationException("Fruit  does not exist.", 404);
        }
        return list;
    }
  1. Then run in JVM mode.
  2. Did a simple stress test .
wrk -t12 -c2000 -d30s  http://10.32.1.***:8888/rest/students/

Configuration

# Add your application.properties here, if applicable.
quarkus.datasource.url=jdbc:mysql://10.32.1.***:3306/xyz
quarkus.datasource.driver=com.mysql.cj.jdbc.Driver
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-jdbc-mysql</artifactId>
    </dependency>

Screenshots
(If applicable, add screenshots to help explain your problem.)

Environment (please complete the following information):

  • Output of uname -a or ver:
    Linux ***** 3.10.0-862.3.3.el7.x86_64 Switch to the Maven distributed copy of the SubstrateVM annotations #1 SMP Fri Jun 15 04:15:27 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • Output of java -version:
    java version "1.8.0_201"
    Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
    Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)

  • GraalVM version (if different from Java):

  • Quarkus version or git rev: 1.0.0.CR1

Additional context

The full jstack dump info
jstack-l.txt

@yuhaibohotmail yuhaibohotmail added the kind/bug Something isn't working label Nov 13, 2019
@geoand
Copy link
Contributor

geoand commented Nov 13, 2019

@johnaohara have you seen anything similar?

@johnaohara
Copy link
Member

@geoand looks similar to #4173

@geoand
Copy link
Contributor

geoand commented Nov 13, 2019

Which was fixed however

@johnaohara
Copy link
Member

@geoand Does your reproducer fail for master now? (https://github.com/geoand/quarkus-vertx-threads-blocked)

@geoand
Copy link
Contributor

geoand commented Nov 13, 2019

Let me check

@johnaohara
Copy link
Member

Have just re-run against 1.0.0.CR1 and is working as expected

@geoand
Copy link
Contributor

geoand commented Nov 13, 2019

I just ran it my reproducer against master and it worked just fine.

@johnaohara
Copy link
Member

I looks like there is a race condition in request.response().drainHandler(). I have opened a pr #5451.

@yuhaibohotmail
Copy link
Author

@johnaohara I've simplified my program and can reproduce it in several of my environments.

My endpoint:

@Path("/hello")
public class ExampleResource {

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String hello() {
        StringBuffer buffer = new StringBuffer();
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        buffer.append("[");
        for(int i = 0 ;i <1000;i++ ) {
            buffer.append("{\"id\":1,\"name\":\"tom\",\"sex\":\"f\",\"testCode\":\"12345\"},");
        }
        buffer.append("{\"id\":1,\"name\":\"tom\",\"sex\":\"f\",\"testCode\":\"12345\"} ]");
        return buffer.toString();
    }
}

Run wrk

[root@localhost wrk]#  wrk -t20 -c2000 -d30s  http://localhost:8888/hello
Running 30s test @ http://localhost:8888/hello
  20 threads and 2000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   652.20ms   95.72ms   1.03s    79.64%
    Req/Sec   180.39    180.45     0.99k    85.29%
  37192 requests in 30.10s, 1.74GB read
  Socket errors: connect 999, read 0, write 0, timeout 0
Requests/sec:   1235.66
Transfer/sec:     59.21MB

When the wrk is running, the endpoint can be accessed through the browser.
Once the wrk is finished, the program reports some exception and then does not respond to the request.

2019-11-14 03:32:42,126 INFO  [io.quarkus] (main) load-test 1.0.0-SNAPSHOT (running on Quarkus 1.0.0.CR1) started in 1.222s. Listening on: http://0.0.0.0:8888
2019-11-14 03:32:42,146 INFO  [io.quarkus] (main) Profile prod activated.
2019-11-14 03:32:42,146 INFO  [io.quarkus] (main) Installed features: [cdi, resteasy]
2019-11-14 03:33:16,424 SEVERE [io.ver.cor.imp.ContextImpl] (vert.x-eventloop-thread-0) Unhandled exception: java.lang.IllegalMonitorStateException
        at java.lang.Object.notify(Native Method)
        at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput$1.handle(VertxBlockingOutput.java:34)
        at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput$1.handle(VertxBlockingOutput.java:25)
        at io.vertx.core.http.impl.HttpServerResponseImpl.handleClosed(HttpServerResponseImpl.java:616)
        at io.vertx.core.http.impl.HttpServerResponseImpl.handleException(HttpServerResponseImpl.java:590)
        at io.vertx.core.http.impl.HttpServerRequestImpl.handleException(HttpServerRequestImpl.java:577)
        at io.vertx.core.http.impl.Http1xServerConnection.handleClosed(Http1xServerConnection.java:413)
        at io.vertx.core.net.impl.VertxHandler.lambda$channelInactive$4(VertxHandler.java:162)
        at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:369)
        at io.vertx.core.impl.EventLoopContext.execute(EventLoopContext.java:43)
        at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:232)
        at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:224)
        at io.vertx.core.net.impl.VertxHandler.channelInactive(VertxHandler.java:162)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236)
        at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:393)
        at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:358)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1417)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
        at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:913)
        at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:819)
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518)
        at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)

2019-11-14 03:33:16,425 SEVERE [io.ver.cor.imp.ContextImpl] (vert.x-eventloop-thread-0) Unhandled exception: java.lang.IllegalMonitorStateException
        at java.lang.Object.notify(Native Method)
        at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput$1.handle(VertxBlockingOutput.java:34)

source code
load-test.zip

@johnaohara
Copy link
Member

@yuhaibohotmail thanks a lot for the reproducer, I will use it to test a potential fix.

@johnaohara
Copy link
Member

@yuhaibohotmail what hardware are you running on? I am not reproducing the error on my local machine, but will also test in our lab

@johnaohara
Copy link
Member

I can re-create now

@yuhaibohotmail
Copy link
Author

yuhaibohotmail commented Nov 14, 2019

@johnaohara Can you reproduce it now?

I tested it in two environments.

  1. Our private cloud: 8CPU/16GMemory/ Centos 7
    The code and wrk run on the same machine
  2. My VM: 1CPU/4GMemory/ Centos 7 . Run on VMware 15.5
    The code and wrk run on the same machine

@johnaohara
Copy link
Member

@yuhaibohotmail I have your test running without hanging the application. Now waiting on CI and approval for the PR. Thanks

@johnaohara
Copy link
Member

@yuhaibohotmail #5451 fixes this issue and will be included in the next release. Thank you for your help

@johnaohara johnaohara added this to the 1.1.0 milestone Nov 15, 2019
@gsmet gsmet removed the backport? label Nov 15, 2019
@gsmet gsmet modified the milestones: 1.1.0, 1.0.0.Final Nov 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants