Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Quarkus initialization thread-safe in BaseFunction #23801

Merged
merged 3 commits into from
Feb 21, 2022
Merged

Make Quarkus initialization thread-safe in BaseFunction #23801

merged 3 commits into from
Feb 21, 2022

Conversation

rob-spoor
Copy link
Contributor

No description provided.

@rob-spoor
Copy link
Contributor Author

A colleague found https://github.com/quarkusio/quarkus/pull/14949/files, where the lazy loading was introduced. I had tried doing a reverted version of that PR (even though I wasn't aware it existed), but that resulted in class loading issues. My change is a mix of static initialization and lazy loading.

@patriot1burke
Copy link
Contributor

@rob-spoor So, there's a reason why I did initialization in the run() method, I don't remember why though. Did you actually test this on Azure Functions?

@patriot1burke
Copy link
Contributor

It might have been something like boot error messages weren't viewable or something. Sorry, I wrote this years ago and should have put in a comment.

@rob-spoor
Copy link
Contributor Author

@patriot1burke I have a test project in #23789 that showcases the issue. I've replaced the Quarkus version with a custom build, and then I no longer get any errors. That isn't run on Azure itself, but I did test the code as a local Azure function.

@patriot1burke
Copy link
Contributor

Yeah, I did this for a reason...But I can't remember Arg!!! If you look at the AWS Lambda code, its all initialized in a static block. I moved it to the run() method and lazily did it for some reason I can't remember.

@rob-spoor
Copy link
Contributor Author

You probably ran into the same issue as I did.

@patriot1burke
Copy link
Contributor

patriot1burke commented Feb 21, 2022

Ok, I remember now. See my comment in this thread:
#12551

IIRC, the Azure Functions runtime loads the Quarkus Function class without setting TCL classloader so there would be a ton of boot failures when quarkus tried to boot. When the Function.run() method is invoked by the Azure runtime, it DOES set the TCL correctly, so that that's why Quarkus is initialized lazily in the Function.run() method and not within a static block.

Keeping the old code and just changing the "started" and other fields to volatile may fix your problems.

@rob-spoor
Copy link
Contributor Author

@patriot1burke unfortunately it wouldn't. The check for the current application isn't safe enough. The current application is set very early on in the start method: https://github.com/quarkusio/quarkus/blob/main/core/runtime/src/main/java/io/quarkus/runtime/Application.java#L72. That means that a second request could read a non-null application even though Quarkus hasn't started up. The results is an NPE when the dispatch is executed. The actual line where the NPE occurs can be one of several, we've seen many different occurrences already (most of them in recorders).

I've listed the possible flows in #23789. We've seen both of the incorrect flows actually happen. Forcing the atomic loading by using the nested class from this PR only allows the desired success flow.

@patriot1burke
Copy link
Contributor

Actually, this will have to be put in a double-check block. I'll push a separate PR.

@rob-spoor
Copy link
Contributor Author

Are you sure that's the best solution? I've considered using that, but I remember that it can have its own issues. I've read https://en.wikipedia.org/wiki/Double-checked_locking#Usage_in_Java, and it lists the https://en.wikipedia.org/wiki/Initialization-on-demand_holder_idiom which I've used here.

@patriot1burke
Copy link
Contributor

Are you sure that's the best solution? I've considered using that, but I remember that it can have its own issues. I've read https://en.wikipedia.org/wiki/Double-checked_locking#Usage_in_Java, and it lists the https://en.wikipedia.org/wiki/Initialization-on-demand_holder_idiom which I've used here.

Double check has worked since Java 5, but your solution works too. Didn't realize you had used Initializaiton-on-demand.

@patriot1burke
Copy link
Contributor

Approved after CI passes.

@gsmet
Copy link
Member

gsmet commented Feb 21, 2022

@patriot1burke CI is green. All good for you?

@patriot1burke patriot1burke merged commit d178e03 into quarkusio:main Feb 21, 2022
@quarkus-bot quarkus-bot bot added this to the 2.8 - main milestone Feb 21, 2022
@patriot1burke
Copy link
Contributor

@rob-spoor Nice PR Rob! Sorry for the confusion before.

@gsmet gsmet modified the milestones: 2.8 - main, 2.7.2.Final Feb 21, 2022
@rob-spoor rob-spoor deleted the quarkus-azure-functions-http_race-condition-fix branch February 22, 2022 08:09
@rob-spoor
Copy link
Contributor Author

@patriot1burke not a problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants