-
Notifications
You must be signed in to change notification settings - Fork 167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GitHub actions #1690
GitHub actions #1690
Conversation
This speeds up the build A LOT when none of the dependencies have been cached yet (e.g. when performing a docker build). Reason: without putting this repository first, all dependencies are first searched in all of the specified repositories and only afterwards the central repository is tried. Because most dependencies are not present in the specified repositories, still central needs to be queried afterwards. When putting the central repository first, this reduces time to retrieve dependencies in two ways: 1. The central repository has a very low latency, much lower than the other repositories. If a dependency can be fetched from here, it will be faster than when it is fetched from another one. 2. For most dependencies, it reduces the number of repositories queried from 7 to 1.
f306dad
to
5f23920
Compare
The result of a Github Actions PR can be found here: The result of a release action can be found here: And the release itself has the WebAPI.war asset added to it Tagging @leeevans and @anthonysena for their earlier review of the Dockerfile integration. |
@blootsvoets I've setup the GitHub repo secrets for this WebAPI & the Atlas repo and created the Docker Hub access token (thanks @msuchard for your help) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blootsvoets is there an easy way for folks to distinguish the Docker Hub CI build docker image from the Release build docker image? If not, should we use 'ohdsi/webapi-snapshot' for the CI build docker image name?
Thanks @leeevans. I've merged the master branch. Note that I did add back the maven central repo to the Snapshots are differentiated from releases by their tag. Releases have a tag that matches the Github Release, for example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blootsvoets thanks for updating with latest pom.xml file and answering my question about snapshot tracking.
pom.xml
Outdated
@@ -466,6 +467,10 @@ | |||
</build> | |||
|
|||
<repositories> | |||
<repository> | |||
<id>central</id> | |||
<url>https://repo.maven.apache.org/maven2</url> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tagging @chrisknoll @leeevans - I was going to merge this set of changes in and noticed the addition of this repository. Isn't it now the case that any/all requests for dependencies will go to repo.ohdsi.org
's nexus and when not found there, it will source them (presumably through maven central) and therefore this <repository>
reference is no longer required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I should have re-read the PR. @blootsvoets mentioned:
Note that I did add back the maven central repo to the pom.xml, because this decreases the war build time from 3:46 to 1:26 minutes, making it 2.6 times faster to build this way.
Let me know if you have any concerns w/ this otherwise I'm fine to merge in these changes. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No we should use the ohdsdi.org repo since that bridges all the repos to a single endpoint, and will be even faster than just putting central first (since it will go through central connection first then to ohdsi), but with the updated repositories element, it only needs to go to ohdsi maven and it will have all the central maven artifacts cached.
For refrence, this is the only thing we need in pom:
<repositories>
<repository>
<id>ohdsi</id>
<name>repo.ohdsi.org</name>
<url>http://repo.ohdsi.org:8085/nexus/content/groups/public</url>
</repository>
</repositories>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maven central uses a large CDN, so data is always stored relatively close to the person doing the build, and they have invested in fast networks and hardware to give the fastest possible response. This can empirically be verified by doing the docker build with and without central as first element, since this build cannot use the maven cache. It will also be the case for new contributors that do not yet have the dependencies cached yet. In terms of cost. So while indeed you don’t need central as the first repository, it does save time, costs and energy.
If you still prefer it to be removed, let me know and I’ll do it ASAP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an interesting point about the CDN, but I'm not sure it will make a big difference in practice. According to this reference from Apache, it is a best practice to use a repository manager with a proxy to the public maven repositories, and they call out the build performance when dealing with snapshots. But, I think you can also set up your own local settings.xml file to reference your own preferred maven repositories, described in this stack overflow post.
I want to be accomodating, but we've gone through a few iterations on configuring repositories in the pom.xml and using central. The biggest impact I've seen is that when doing local builds during development (and we're referencing snapshots) each build wants to make sure latest snapshot is available, and it would go out to a number of other repositories before finally getting to the ohdsi.org's repo and finding out that nothing has changed. This added at least 30-60 seconds to the build, which when you're doing multiple builds per hour, it really adds up. I haven't checked if we put central + ohdsi nexus into the pom if it leads to a slower build time...but using the single ohdsi nexus seems to be working ok.
So, did you want to try to see if using a pom.xml with central vs. without using it to determine if it has any impact on your docker build times? Also see if you can configure your own settings.xml to add central to it to see if you get the desired result of accessing central first before ohdsi nexus? If you are still experiencing problems even after using the ohdsi group http://repo.ohdsi.org:8085/nexus/content/groups/public
, I'll check to see if adding the central to the pom.xml impacts build times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, yes I did try the difference. I've now committed a change where the central repo is only used in the docker build. Would that alleviate your concerns? As mentioned in the first post, using central as a first repository reduces the WAR build time from 3:46 to 1:26.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that works for me. Thanks.
An unrelated question about repo.ohdsi.org: would it be possible to use HTTPS there, for example using letsencrypt? Gradle has already phased out support for non-HTTPS maven repositories mainly to guard against man in the middle attacks: https://blog.gradle.org/decommissioning-http |
@blootsvoets yes, I will be moving repo.ohdsi.org to HTTPS in the next couple of weeks. I'll coordinate with @anthonysena and @chrisknoll so the repo reference in the pom.xml can be updated at the appropriate time |
The POM has been modified as requested. @anthonysena or @chrisknoll would you please take another look? |
Looks good. @anthonysena : if your 'change request' is satisfied, go ahead and merge at your convenience. |
* Puts Maven Central as the first repository This speeds up the build A LOT when none of the dependencies have been cached yet (e.g. when performing a docker build). Reason: without putting this repository first, all dependencies are first searched in all of the specified repositories and only afterwards the central repository is tried. Because most dependencies are not present in the specified repositories, still central needs to be queried afterwards. When putting the central repository first, this reduces time to retrieve dependencies in two ways: 1. The central repository has a very low latency, much lower than the other repositories. If a dependency can be fetched from here, it will be faster than when it is fetched from another one. 2. For most dependencies, it reduces the number of repositories queried from 7 to 1. * Enabled github actions * Use dedicated docker profile to avoid long command line * Backport github action fixes * Fix typo * Remove unsupported platform * Only use Maven central in the webapi-docker profile
* prepare for release 2.8.0 * 2.8.1-SNAPSHOT * Update CIRCE to 1.9.1 (cherry picked from commit 9fac161) * Remove trailing ; from DELETE (#1752) Fixes #1730. (cherry picked from commit 66f9100) * PLP/PLE Analysis result file missed analysis executable code #1729 (#1751) * removed tomcat dependency from hive profile (#1748) changed default runtime delegate (#1720) Co-authored-by: Sergey Suvorov <[email protected]> * Issue 1733 build error without git dir (#1757) * git-commit-id-plugin does not throw exception in case of absent .git folder if git_commit-id-plugin can not get git info parameters then gmaven-plugin creates these parameters with default value ("*") (#1733) * removed temporary dependencies (#1733) Co-authored-by: Sergey Suvorov <[email protected]> * Update org.apache.httpcomponents:httpclient version from 4.5.10 to 4.5.13 (#1747) * eager loading is now used during conversion of entities to dto (#1763) fixes #1758 Co-authored-by: Sergey Suvorov <[email protected]> # Conflicts: # pom.xml * GitHub actions (#1690) * Puts Maven Central as the first repository This speeds up the build A LOT when none of the dependencies have been cached yet (e.g. when performing a docker build). Reason: without putting this repository first, all dependencies are first searched in all of the specified repositories and only afterwards the central repository is tried. Because most dependencies are not present in the specified repositories, still central needs to be queried afterwards. When putting the central repository first, this reduces time to retrieve dependencies in two ways: 1. The central repository has a very low latency, much lower than the other repositories. If a dependency can be fetched from here, it will be faster than when it is fetched from another one. 2. For most dependencies, it reduces the number of repositories queried from 7 to 1. * Enabled github actions * Use dedicated docker profile to avoid long command line * Backport github action fixes * Fix typo * Remove unsupported platform * Only use Maven central in the webapi-docker profile * ancestorAndDescendant:get permission is now added to new sources (#1766) * ancestorAndDescendant:get permission is now added to new sources fixes #1759 * ancestorAndDescendant:get permission is now added to new sources fixes #1759 * fixed identation fixes #1759 * ancestorAndDescendant:get permission is not added to new sources #1759 - fix permission name Co-authored-by: Sergey Suvorov <[email protected]> Co-authored-by: Anton Abushkevich <[email protected]> * ir generation results are now exported correctly (#1767) queryForRowSet function calls isSigned function which is not implemented for hive fixes #1765 Co-authored-by: Sergey Suvorov <[email protected]> * Fix sql server migration. Revert a721214. Remove IF EXISTS from sql server. * Update dependency to arachne-commons release 1.17.1 #1774 (#1775) * Eager loading without redundant calls for fetching collecions works only in 5.4.2.Final version of hibernate(#1758) # Conflicts: # pom.xml * Reset Impala settings to default(#1758) * Updated org.ohdsi from 1.9.1 to 1.9.2 to to fix error in script(#1755) (#1778) Co-authored-by: Sergey Suvorov <[email protected]> * Updated org.apache.shiro from 1.6.0 to 1.7.1 to eliminate identified vulnerabilities (#1759) (#1776) Co-authored-by: Sergey Suvorov <[email protected]> * Version 2.8.1 Co-authored-by: anton.abushkevich <anton.abushkevich> Co-authored-by: Chris Knoll <[email protected]> Co-authored-by: anton-abushkevich <[email protected]> Co-authored-by: Sergey Suvorov <[email protected]> Co-authored-by: Joris Borgdorff <[email protected]>
Enabled GitHub actions. It requires repository secrets
to be set.
Changes to the POM file: