Speed up docker image building and switch base image to alpine #17731
base: master
I was taking a look and was wondering about the following:
BUILD_FROM_SOURCE is a legacy feature that I kept unchanged. However, it is how I built the Docker image on my M1, because building directly in Docker takes a long time. The problem on the M1 is that building a Docker image is split into two steps: first build the distribution jar on the host machine, then use that environment to build the Docker image. This should be fixed, because sometimes I can't even remember that I have to follow these two steps to get a Docker image.
The core problem here is that web-console is different from the backend services: it is a front-end project with its own build toolchain.
This is why I made some changes to the web-console module so that we can use the Docker cache.
Yeah, things could get complicated; maybe it would be useful to place a script under the
I wonder if there is a way to convince Maven not to rebuild that all the time; that comes up in a lot of other places as well, so fixing it more deeply could address those too. Thank you for the insights. I think the best approach would be to re-pack the dist tarball which was produced outside Docker (make
There are several problems in the current Dockerfile.
1. Extremely slow builds on Apple Silicon chips
Previously, to allow building the Docker image on Apple Silicon chips like the M1, the Dockerfile forced the build to run under the amd64 platform. This works around node-sass not supporting ARM; see #13012.
However, this drastically slows down the Docker build on these platforms: it takes more than 15 minutes to build an image on my M1 laptop.
The main reason is that the machine has to run the build under x86 emulation.
2. Hard to debug
Currently the distroless base image is used. It is a secure image, but it is hard to debug: there is no curl, no wget, no lsof, and no net-tools. This makes debugging live issues painful.
3. web-console is rebuilt repeatedly even when it has not changed
Most development does not touch the web-console module, yet it is built along with the backend services whenever the mvn package command is used. Since the web-console module takes time to build, it also slows down the whole build.
There are also some other problems, which are described in the following section.
Changes Description
The build is split into two stages: the web-console build stage, which runs under the amd64 platform, and the distribution build stage, which adapts to the local development platform. During the distribution build stage, the web-console output is copied into the final distribution package.
This speeds up the build drastically. On my laptop, it now takes 120 seconds to complete the web-console build stage and 210 seconds to complete the backend service build stage, both of which are acceptable.
mvn to build web-console
By leveraging the Docker cache, this can greatly improve build performance when the contents of the web-console directory have not changed.
To achieve this, we build the web-console directly in a node image. In development, when the web-console module is not changed, this reduces the entire building process of web-console
Previously, maven:3.9, which comes with JDK 17, was used for the build stage. This does NOT respect the JDK_VERSION argument in the Dockerfile. It means that if we build Druid for 21 by specifying JDK_VERSION, the distribution was still built under JDK 17 but packaged to run in a JRE 21 environment.
In this PR, this is fixed. The build stage and the final image use the SAME JDK version.
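As a rough illustration of how a single JDK_VERSION build argument can drive both the build stage and the final image, the wrapper below passes it once on the command line. This is a sketch under assumptions, not part of the PR: build_druid_image, the druid-local tag, the Dockerfile path, and the fallback to 17 are all hypothetical choices made for the example.

```shell
#!/bin/sh
# Hypothetical wrapper: pass one JDK version to the Docker build so the
# build stage and the runtime image are derived from the same value.
# Assumptions: the Dockerfile path, the image tag, and the default of 17.
build_druid_image() {
  jdk_version=${1:-17}
  docker build \
    --build-arg "JDK_VERSION=$jdk_version" \
    -t "druid-local:jdk$jdk_version" \
    -f distribution/docker/Dockerfile \
    .
}
```

Calling `build_druid_image 21` from the repository root would then request a JDK 21 build for every stage that reads the argument.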
Switch the base image from gcr.io/distroless/java$JDK_VERSION-debian12 to alpine
This also drastically simplifies the Dockerfile. Previously, we had to install busybox and download bash in the Dockerfile, which made the Dockerfile very complicated.
Since alpine comes with a shell, these steps are eliminated. The change does NOT bloat the image. Locally, the alpine-based image is 746MB, which is slightly smaller than the distroless image.
Some commonly used tools such as curl, lsof, and net-tools are also packaged in the final Docker image.
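One way to confirm that the debugging tools mentioned above are actually present is a small shell check run inside the container. This is only a sketch: missing_tools is a hypothetical helper, and netstat is assumed here as a representative binary from net-tools.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every requested tool
# that cannot be found on PATH. Prints nothing when all are present.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

Inside a running container, `missing_tools curl lsof netstat` would list whichever of those tools the image lacks.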
Previously we used the following command to evaluate the version, but this step takes a VERY LONG time on my laptop
We can see that after 254 seconds, the command is still running.
This step is eliminated: by applying 'clean' to the Maven command, we ensure that there is only one tar file under the distribution, so we can use a wildcard match to find the file and decompress it.
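The wildcard approach described above can be sketched as follows. The helper name, the apache-druid-*-bin.tar.gz pattern, and the directory arguments are assumptions for illustration, not the exact lines in the Dockerfile. The guard makes the "exactly one tarball" expectation explicit instead of relying on it silently.

```shell
#!/bin/sh
# Hypothetical helper: extract the single distribution tarball found in a
# directory, without having to compute the project version first.
# Assumption: tarballs are named apache-druid-<version>-bin.tar.gz.
extract_single_tarball() {
  src_dir=$1
  dest_dir=$2
  # Expand the glob into the positional parameters so the matches can be counted.
  set -- "$src_dir"/apache-druid-*-bin.tar.gz
  # An unmatched glob stays literal, so also check that the file exists.
  if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
    echo "expected exactly one tarball in $src_dir" >&2
    return 1
  fi
  mkdir -p "$dest_dir"
  # Drop the versioned top-level directory while unpacking.
  tar -xzf "$1" -C "$dest_dir" --strip-components=1
}
```

Running a clean build first is what keeps the glob from matching a stale tarball left over from a previous version.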
Test-related modules are excluded from the distribution stage.
druid.sh is also updated to ensure druid.host has a value before starting the Java process. This helps expose problems earlier.
Release note
The default image is switched from gcr.io/distroless/java17-debian12 to alpine.
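Returning to the druid.sh change in the description, a fail-fast check on druid.host might look roughly like the sketch below. It is not the script's actual code: require_host and the way the value is passed in are hypothetical, chosen only to show the idea of refusing to start when the host is empty.

```shell
#!/bin/sh
# Hypothetical guard: refuse to continue when the host value is empty,
# so a misconfiguration shows up before the Java process is launched.
require_host() {
  host=$1
  if [ -z "$host" ]; then
    echo "druid.host is empty; refusing to start the Java process" >&2
    return 1
  fi
  # Echo the validated value so the caller can capture it.
  echo "$host"
}
```

A startup script could call `require_host "$resolved_host" || exit 1` right before launching the JVM, where resolved_host stands for however the script determines the address.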