Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2.5.4 upgrade, issue with actuator/metrics/AutoTimer, blocks startup #27809

Closed
dimcookies opened this issue Aug 24, 2021 · 5 comments
Closed
Labels
for: external-project For an external project and not something we can fix status: declined A suggestion or change that we don't feel we should currently apply

Comments

@dimcookies
Copy link

dimcookies commented Aug 24, 2021

We have a Spring Boot application (webflux, r2dbc, spring data) using the version 2.5.3. After upgrading to 2.5.4 the application halts at startup. From the thread dump it seems that a thread is blocked during the initialization of AutoTimer in metrics.

"reactor-tcp-epoll-1" #32 daemon prio=5 os_prio=0 cpu=122,91ms elapsed=8,35s tid=0x00007f8404f851e0 nid=0xa098 waiting for monitor entry  [0x00007f84747d8000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:217)
	- waiting to lock <0x000000060ec1ef78> (a java.util.concurrent.ConcurrentHashMap)
	at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333)
	at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208)
	at org.springframework.beans.factory.config.DependencyDescriptor.resolveCandidate(DependencyDescriptor.java:276)
	at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:1380)
	at org.springframework.beans.factory.support.DefaultListableBeanFactory$DependencyObjectProvider.getObject(DefaultListableBeanFactory.java:1979)
	at org.springframework.boot.actuate.autoconfigure.metrics.data.RepositoryMetricsAutoConfiguration$$Lambda$840/0x00000008011e8b20.get(Unknown Source)
	at org.springframework.util.function.SingletonSupplier.get(SingletonSupplier.java:97)
	- locked <0x000000061115e180> (a org.springframework.util.function.SingletonSupplier)
	at org.springframework.boot.actuate.metrics.data.MetricsRepositoryMethodInvocationListener.lambda$afterInvocation$0(MetricsRepositoryMethodInvocationListener.java:87)
	at org.springframework.boot.actuate.metrics.data.MetricsRepositoryMethodInvocationListener$$Lambda$1194/0x0000000801315700.accept(Unknown Source)
	at org.springframework.boot.actuate.metrics.AutoTimer.apply(AutoTimer.java:109)
	at org.springframework.boot.actuate.metrics.data.MetricsRepositoryMethodInvocationListener.afterInvocation(MetricsRepositoryMethodInvocationListener.java:86)
	at org.springframework.data.repository.core.support.RepositoryInvocationMulticaster$DefaultRepositoryInvocationMulticaster.notifyListeners(RepositoryInvocationMulticaster.java:76)
	at org.springframework.data.repository.core.support.RepositoryMethodInvoker$ReactiveInvocationListenerDecorator.lambda$decorate$7(RepositoryMethodInvoker.java:248)
	at org.springframework.data.repository.core.support.RepositoryMethodInvoker$ReactiveInvocationListenerDecorator$$Lambda$1047/0x0000000801233a30.apply(Unknown Source)

Drilling down a bit more, it seems that the problem is specifically with the AutoTimer support in spring data repositories, so if you disable that, the application starts normally.

management.metrics.data.repository.autotime.enabled= false

Unfortunatelly i could not reproduce that in a clean project with the same depedencies. Also the same issue appears in similar applications that have spring data but with cassandra connectivity this time.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Aug 24, 2021
@wilkinsona
Copy link
Member

Thanks for the report. Unfortunately, if you're unable to reproduce the problem with a complete application to base your attempt on, I think it's quite unlikely that we'll be able to figure it out from just the stack of a single thread.

Making a somewhat educated guess, it appears that you're using a Spring Data Repository during application startup which is unusual. If you'd like us to spend some more time investigating, a sample that reproduces the problem really would be ideal. Failing that, we might be able to figure things out from a full thread dump during the hang along with a detailed description of what the application does and how it uses Spring Data repositories during startup.

@wilkinsona wilkinsona added the status: waiting-for-feedback We need additional information before we can continue label Aug 24, 2021
@dimcookies
Copy link
Author

dimcookies commented Aug 24, 2021

Hi @wilkinsona , thank you for your prompt reply. You are right, the problem has to do with some caching that is performed during the startup. Based on that i was able to reproduce that in a clean project (attached). You can check the stacktrace of the thread dump and that if you change the version to 2.5.3 or disable autotime, the application starts successfully

The db schema:

CREATE TABLE configuration.config_param (
    external_id character varying(100) NOT NULL,
    value character varying(100) NOT NULL,
    last_update timestamp without time zone
);

demo.zip

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Aug 24, 2021
@wilkinsona
Copy link
Member

The change in behaviour in 2.5.4 is due to #27591.

This is a very similar to spring-projects/spring-framework#20904. In this case it's the deferred bean lookup in MetricsRepositoryMethodInvocationListener that's the trigger rather than event publication, but it's essentially the same problem.

Generally speaking, you should avoid doing anything multi-threaded during application context refresh where there's any chance that one of those threads will interact with the bean factory. In this case, that includes calling a reactive Spring Data repository that performs network IO. You may want to consider using a CommandLineRunner rather than @PostConstruct or using a non-reactive repository for this sort of initialization during application context refresh.

We can take another look at the fix for #27591 as there was an alternative fix that we rejected at the time. I'm not sure if it'll fix this problem, and even if it did we may not want to switch to it, but it's worth considering at least.

@wilkinsona wilkinsona added for: team-meeting An issue we'd like to discuss as a team to make progress and removed status: feedback-provided Feedback has been provided labels Aug 24, 2021
@dimcookies
Copy link
Author

@wilkinsona thank you again for your enlightening answer. We will consider our options and delay these calls after startup has been completed

@wilkinsona
Copy link
Member

We discussed this today and decided that we don't want to risk destabilising things by trying to workaround something that is really a Framework problem. Using a CommandLineRunner or a non-reactive repository is your best bet for now.

@wilkinsona wilkinsona added for: external-project For an external project and not something we can fix status: declined A suggestion or change that we don't feel we should currently apply and removed status: waiting-for-triage An issue we've not yet triaged for: team-meeting An issue we'd like to discuss as a team to make progress labels Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
for: external-project For an external project and not something we can fix status: declined A suggestion or change that we don't feel we should currently apply
Projects
None yet
Development

No branches or pull requests

3 participants