Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored the docker build script to only only include the gatk bundle in order to shrink the docker image size #4955

Merged
merged 2 commits into from
Jul 6, 2018

Conversation

jamesemery
Copy link
Collaborator

fixes #3930

@jamesemery jamesemery requested a review from lbergelson June 27, 2018 14:20
@lbergelson
Copy link
Member

@jamesemery How big was the docker and how big is it now?

@jamesemery
Copy link
Collaborator Author

jamesemery commented Jun 27, 2018

@lbergelson
now: 3.9GB (without ~0.3-5 GB extra off from the base image changes currently in the branch)
then: 5.22GB
A moderate but not irrelevant reduction in size. With these changes some more drastic changes will be easier, namely trying to slice down the python packages or refactoring the build jars should be easier. I intend to open another PR to upgrade the base image shortly.

@droazen
Copy link
Contributor

droazen commented Jun 27, 2018

@jamesemery So to clarify: including the size reductions to the base image, the total reduction will be around 1.6-1.8 GB, correct?

@codecov-io
Copy link

codecov-io commented Jun 27, 2018

Codecov Report

Merging #4955 into master will decrease coverage by 20.624%.
The diff coverage is n/a.

@@               Coverage Diff                @@
##              master     #4955        +/-   ##
================================================
- Coverage     80.797%   60.173%   -20.624%     
+ Complexity     17952     12779      -5173     
================================================
  Files           1095      1095                
  Lines          64609     64609                
  Branches       10394     10394                
================================================
- Hits           52202     38877     -13325     
- Misses          8379     21497     +13118     
- Partials        4028      4235       +207
Impacted Files Coverage Δ Complexity Δ
...institute/hellbender/cmdline/TestProgramGroup.java 0% <0%> (-100%) 0% <0%> (-3%)
...er/utils/smithwaterman/SWNativeAlignerWrapper.java 0% <0%> (-100%) 0% <0%> (-6%)
...llbender/engine/filters/CountingVariantFilter.java 0% <0%> (-100%) 0% <0%> (-18%)
...ne/programgroups/CoverageAnalysisProgramGroup.java 0% <0%> (-100%) 0% <0%> (-3%)
...mlineContigPloidyHybridADVIArgumentCollection.java 0% <0%> (-100%) 0% <0%> (-3%)
...ecaller/graphs/DeadEndKBestSubHaplotypeFinder.java 0% <0%> (-100%) 0% <0%> (-10%)
...sformers/MisencodedBaseQualityReadTransformer.java 0% <0%> (-100%) 0% <0%> (-4%)
...ctions/RequiredFeatureInputArgumentCollection.java 0% <0%> (-100%) 0% <0%> (-1%)
...ls/copynumber/gcnv/GermlineCNVVariantComposer.java 0% <0%> (-100%) 0% <0%> (-4%)
...groups/StructuralVariantDiscoveryProgramGroup.java 0% <0%> (-100%) 0% <0%> (-3%)
... and 583 more

@jamesemery
Copy link
Collaborator Author

@droazen correct.
Generally, one issue is that this slows down the docker image creation in a somewhat substantial way. Around 10 minutes currently. Half of this is unzipping the bundled jar, and another piece is some redundant gradle downloading that can be alleviated with cache shenanigans.

@jamesemery jamesemery force-pushed the je_newShrinkDockerSize branch 2 times, most recently from 4a44ec7 to f2c2253 Compare July 6, 2018 15:14
Copy link
Member

@lbergelson lbergelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamesemery A bunch of cleanup type things. I do think you should rename that second gradle script. Sharing code between them might be tricky, but at least put a comment in.

Looks good. Ready to merge once the final changes are resolved and the base image is updated.

.travis.yml Outdated
@@ -20,7 +20,8 @@ env:
#gradle needs this
- TERM=dumb
#limit gradle jvm memory and disable daemon
- GRADLE_OPTS="-Xmx1024m -Dorg.gradle.daemon=false"
- GRADLE_OPTS="-Xmx1024m"
#Dorg.gradle.daemon=false"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete

apt-get upgrade -y && \
apt-get install -y python && \
apt-get install -y python3-pip && \
apt-get install --no-install-recommends -y wget curl unzip gcc python-dev python-setuptools git less lynx ## hdfview && \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove hdfview

build_docker.sh Outdated
@@ -113,8 +112,22 @@ if [ -n "${IS_PUSH}" ]; then
else
RELEASE=false
fi
./gradlew clean bundle pythonPackageArchive shadowTestClassJar shadowTestJar -Drelease=$DRELEASE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it shouldn't do all the test jar stuff if you're not running tests

build.gradle Outdated
* cloud, integration, unit : run one of the three disjoint partitions of the test suite
* all : run all the tests
* anything else : run the non-cloud tests
* cloud, integration, unit, spark, python, bucket, funcotatorValidation : run one of the several disjoint partitions of the test suite
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bucket and funcotatorValidation aren't valid options to TEST_TYPE

.travis.yml Outdated
DOCKER_TAG=$TRAVIS_COMMIT;
fi;
sudo docker images;
echo ${TEST_TYPE};
sudo mkdir -p build/reports/;
sudo chmod -R a+w build/reports/;
sudo docker run -v $(pwd)/src/test/resources:/testdata -v $(pwd)/build/reports/:/gatk/build/reports/ --rm -e "TEST_VERBOSITY=minimal" -e "TEST_TYPE=${TEST_TYPE}" -t broadinstitute/gatk:${DOCKER_TAG} bash --init-file /gatk/gatkenv.rc /root/run_unit_tests.sh;
sudo docker run -v $(pwd):/gatksrc -v $(pwd)/testJars:/jars --rm -e "TEST_VERBOSITY=minimal" -e "TEST_TYPE=${TEST_TYPE}" -t broadinstitute/gatk:${DOCKER_TAG} bash --init-file /gatk/gatkenv.rc /root/run_unit_tests.sh;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be better named gatk_root? its not just the source is it?


plugins {
id "java" // set up default java compile and test tasks
id 'maven'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt we need maven here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we need it to download jacoco? we point to maven central still.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maven plugin is for releasing to maven, not downloading from it. That's a core gradle thing I think.

buildscript {
repositories {
mavenCentral()
jcenter() // for shadow plugin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop jcenter from this one

}

repositories {
mavenCentral()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this whole block I think.


task unpackTestJar(type: Copy){
String testClassesJar = "$System.env.TEST_JAR"
String testClassesDir = "$System.env.CP_DIR"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment explaining why this is necessary for future generations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and a description line

build.gradle Outdated
}
}

tasks.withType(Test) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can put this back as just a modification of the test task since there's no second one here anymore.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd actually rather leave it, because we are using warnings i would rather the two test methods that need to be updated in lock step are byte identical so its as easy as possible for someone to update.

@jamesemery jamesemery force-pushed the je_newShrinkDockerSize branch from 0f0958d to 63ec7e4 Compare July 6, 2018 19:52
@droazen droazen merged commit d6a7fcd into master Jul 6, 2018
@droazen droazen deleted the je_newShrinkDockerSize branch July 6, 2018 21:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor the docker image to correspond to the minimum necessary for user execution
4 participants