repeating downloads of scala library #147

Open
ittaiz opened this issue Feb 21, 2017 · 6 comments


@ittaiz
Member

ittaiz commented Feb 21, 2017

Hi,
In some cases, when I build certain projects, Bazel downloads the Scala library all over again.
This description is vague because I'm not entirely sure of the circumstances, and it may well be an edge case of my working in many WORKSPACEs with different Bazel versions (the release from brew and a local dev build).
Has anyone encountered this? Is there anything that can be done in this repo?
It sounds related to bazelbuild/bazel#1752, but I'm not sure.

@johnynek
Member

I do see it too. It is pretty frustrating.

It may also be related to: bazelbuild/bazel#2490

I have not tried the repository cache, but you can see reports of success here:
bazelbuild/bazel#1752 (comment)

@ittaiz
Member Author

ittaiz commented Mar 8, 2017

I think, though I'm not sure, that this is related to test_run.sh.
It seems that if I run bazel test after running it, Bazel needs to kill the unresponsive server and then do some expensive work. I might be wrong, and I don't know why yet.

@softprops
Contributor

Is anyone looking at this? We're running into the same issue. --experimental_repository_cache does look promising. I noticed the appengine rules recommended it.
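
For anyone wanting to try the flag, here is a minimal sketch of passing it on every invocation. It assumes a Bazel version that accepts `--experimental_repository_cache` (later releases expose it as `--repository_cache`). The cache path and the `bazel_cached` wrapper name are made up for illustration, and this has not been verified against the re-download problem described in this thread.

```shell
# Hypothetical cache location; override REPO_CACHE to point elsewhere.
REPO_CACHE="${REPO_CACHE:-${HOME}/.cache/bazel_repository_cache}"

# Wrapper (the name is an assumption) that adds the cache flag to any bazel command.
bazel_cached() {
  cmd="$1"
  shift
  # Make sure the cache directory exists before bazel is asked to use it.
  mkdir -p "${REPO_CACHE}"
  bazel "${cmd}" "--experimental_repository_cache=${REPO_CACHE}" "$@"
}

# Usage, from inside a workspace:
#   bazel_cached build //...
```

Keeping the directory outside the per-workspace output base is the point: it can then be shared across workspaces, or restored by CI between runs.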

@ittaiz
Member Author

ittaiz commented Apr 14, 2017

@softprops What is your use case? For me this happened a lot when I was switching between running test_run.sh and bazel test for rules_scala itself. I figured it was related to how test_run.sh is built, but maybe this is a bigger issue.

@softprops
Contributor

My use case is running builds in CI. I'm caching the initial Bazel installation in a container, which is then used to run build scripts that call bazel. We're now caching build results with Bazel's remote cache support, but the last piece is to cache the loading of Skylark rule defs and the artifacts they depend on. For each of these CI builds, the Skylark rules are redownloaded along with the Scala jar deps. I recently started investigating that and bumped into this thread.

@softprops
Contributor

Wanted to share a technique I've found helpful for reducing this download time and the chance of network errors connecting to GitHub (that has happened multiple times) in our CI env.

We run bazel in a docker container whose image has already built a dummy workspace in a well known dir with the scala rules and a few other deps.

When we run the container, we mount a checkout, for reasons, in a slightly different dir that's symlinked to the well known dir. Before we build, we run the following, which symlinks the cache dir for this checkout to the one that's baked into the image. The net result is that we no longer have to download rule defs or any of their dependencies on any of our CI builds. This has improved the reliability of our builds.

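cd /well/known/dir

# Bazel names each output base after the MD5 of the workspace path.
CI_WORKSPACE_HASH=$(printf '%s' "/well/known/dir" | md5sum | cut -d ' ' -f 1)
# pwd -P resolves the symlink to the dir where the checkout is really mounted.
THIS_WORKSPACE_HASH=$(printf '%s' "$(pwd -P)" | md5sum | cut -d ' ' -f 1)
CACHE_DIR="${HOME}/.cache/bazel/_bazel_$(whoami)"
ln -s "${CACHE_DIR}/${CI_WORKSPACE_HASH}" "${CACHE_DIR}/${THIS_WORKSPACE_HASH}"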

There may be a more straightforward way to do this, but this has worked very well for us.
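
The same idea can be wrapped in a function that is safe to rerun. This is only a sketch: it assumes Bazel's default output base layout (a directory named after the MD5 of the workspace path, under the output user root), and the function names and argument order are hypothetical.

```shell
# Print the MD5 of a path string, which is how Bazel names its default output base.
workspace_hash() {
  printf '%s' "$1" | md5sum | cut -d ' ' -f 1
}

# link_output_base BAKED_DIR CURRENT_DIR CACHE_DIR
# Point CURRENT_DIR's output base at the one already populated for BAKED_DIR.
link_output_base() {
  baked_hash=$(workspace_hash "$1")
  current_hash=$(workspace_hash "$2")
  # Nothing to do when both paths map to the same output base.
  if [ "${baked_hash}" = "${current_hash}" ]; then
    return 0
  fi
  # Refuse to link to an output base that was never baked into the image.
  if [ ! -d "$3/${baked_hash}" ]; then
    return 1
  fi
  # -f and -n replace an existing link instead of creating a new one inside it.
  ln -sfn "$3/${baked_hash}" "$3/${current_hash}"
}
```

From the mounted checkout it would be called as `link_output_base /well/known/dir "$(pwd -P)" "${HOME}/.cache/bazel/_bazel_$(whoami)"`. Unlike a bare `ln -s`, it does not fail when the link already exists from a previous run.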
