Share cache entry between matching Jobs in different workflows #1017
Comments
Thanks for the report. Yes, now that the workflow name is part of the cache key, you will no longer get a match for entries written by Jobs with the same Job id in different workflows. This was intentional, to avoid name collisions when Jobs in different workflows are coincidentally named the same. The standard mechanism will work well for sharing cache entries if you:
But I presume there are good reasons that this won't work for you.
This issue raises an interesting point: I've generally thought of each Workflow as something quite standalone, but it makes sense that users could be composing different workflows in different ways from a common set of Job actions. If …
Thank you for the feedback and a thoughtful response!
💯 That's exactly our use case. On top of explicitly defining multiple jobs with the same id, jobs can also be easily reused across multiple workflows by means of reusable workflows, which, according to the docs, are executed in the context of the caller workflow.
This would probably solve the problem of reading cache entries by jobs with matching ids (both for jobs in different workflows and for jobs defined in a reusable workflow) 👍 As a side note, however, I would like to point out the possible ramifications of adding the workflow name to the cache entry hash. Taking the above into consideration, would it be a good idea to initialize the cache key with sensible, sufficiently unique defaults that suit the majority of use cases and developers (i.e. job id + workflow), while still allowing some form of control in specific (i.e. reusable) jobs, the same way it was previously possible by manipulating the job id or …
Thanks for your feedback.
You're correct, and this can be problematic. The fundamental limitation is that the GitHub Actions cache is write-once: once an entry is written for a key, that entry cannot be modified or overwritten. Imagine the following scenario:
If these 2 'build' jobs share a cache key, then whichever workflow runs first will "win" and write the entry. If Workflow A runs first, then the cache entry won't contain any of the required dependencies for the 'assemble' task. This means that Workflow B will need to download all of its dependencies on every execution.
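For illustration, here is a minimal sketch of the kind of collision described above. The file names, triggers, and the assumption that Workflow A's 'build' job runs 'check' (i.e. some task that does not resolve the dependencies 'assemble' needs) are all hypothetical, added only to make the write-once behaviour concrete.

```yaml
# Hypothetical .github/workflows/workflow-a.yml
name: Workflow A
on: push
jobs:
  build:                          # same job id as in Workflow B
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: gradle/gradle-build-action@v2
      - run: ./gradlew check      # assumption: does not resolve assemble's dependencies
```

If both 'build' jobs resolve to the same cache key and Workflow A runs first, the job below can only restore the entry Workflow A wrote, so the dependencies needed by 'assemble' are downloaded again on every run.

```yaml
# Hypothetical .github/workflows/workflow-b.yml
name: Workflow B
on: push
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: gradle/gradle-build-action@v2
      - run: ./gradlew assemble   # needs dependencies missing from Workflow A's entry
```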
Yes, this is exactly the intent. The idea is that the cache keys are sufficiently unique, but that the entries are de-duplicated to avoid redundancy.
Yes, I think it makes sense to allow this sort of control. For now the only mechanism is undocumented and crude (…).
I've just pushed v3.0.0-beta.3, which allows entries to be restored by Job ID without matching the workflow name. You can try this out with …
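Presumably this means pointing the gradle-build-action step at the pre-release tag; a minimal sketch, assuming the beta is referenced like any other release tag (the surrounding steps are illustrative):

```yaml
steps:
  - uses: actions/checkout@v4
  # Assumption: the beta is consumed by referencing its tag instead of the stable v2 line.
  - uses: gradle/gradle-build-action@v3.0.0-beta.3
  - run: ./gradlew build
```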
Hi @bigdaz, I was testing v3.0.0-beta.3 and I was expecting the following cache restore process:
Before:
After (v3 beta):
From my testing it seems that the workflow part (…)
@jaloszek During development, I opted against adding a new cache-restore-key. The updated cache-restore process is documented here: https://github.com/gradle/gradle-build-action?tab=readme-ov-file#finding-a-matching-cache-entry The workflow name is now encoded with the matrix values, which is why you no longer see it explicitly in the cache key.
@bigdaz Thanks for the explanation, it makes sense. I believe it would be worth explaining somewhere that the hash includes the workflow name and that it is not part of the cache key in an explicit way; rather, it is hidden in the hash code.
… available here: https://github.com/gradle/gradle-build-action?tab=readme-ov-file#cache-keys It would be nice to at least remove the ${workflow-name} part from the example.
@jaloszek Thanks for pointing that out. I've updated the docs to reflect the new cache key format.
The new release is working as expected 👍 Great job, thank you!
Context:
We have multiple workflows:
- ci:develop, runs on the develop branch, writes multiple caches (one cache per job)
- ci:feature, runs on feature/ branches, reads multiple (read-only) caches (one cache per job)
- ci:firebase, runs on any branch, reads a single cache

All workflows perform very similar operations (build, test). However, they do differ slightly: extra jobs/steps, i.e. uploading artifacts, comparing code coverage vs. develop, etc.
All of these workflows define matching job ids (which is what allowed cache entry matching):
- ci-build
- ci-test

This allowed ci:feature / ci:firebase jobs to use matching cache entries from ci:develop jobs (i.e. ci:feature/ci-test was using the ci:develop/ci-test cache, ci:firebase/ci-build was using the ci:develop/ci-build cache, etc).
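For illustration, a minimal sketch of what this setup might look like. Only the workflow names and job ids come from the description above; the triggers, runners, and Gradle invocations are assumptions added for clarity.

```yaml
# Illustrative .github/workflows/ci-develop.yml; ci:feature and ci:firebase define
# jobs with the same ids (ci-build, ci-test) and read the entries written here.
name: "ci:develop"
on:
  push:
    branches: [develop]
jobs:
  ci-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: gradle/gradle-build-action@v2
      - run: ./gradlew build      # assumption: writes the cache entry for this job id
  ci-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: gradle/gradle-build-action@v2
      - run: ./gradlew test
```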
Problem:
We've noticed that we are getting cache misses, and our ci:feature workflow is now using random/latest caches from the develop branch instead of the most appropriate / "matching" entries.
This is probably related to changes introduced in #699, which started taking the workflow name into consideration (as per the official docs).
It seems that the cache (or multiple caches) can now only be easily re-used within a single workflow.
Question:
We were previously able to reuse cache entries between different workflows by naming jobs appropriately.
Is similar functionality still possible?
It seems that passing restore-keys manually is not possible. If I am not mistaken, using the undocumented GRADLE_BUILD_ACTION_CACHE_KEY_JOB env variable might be a possible workaround (using a hardcoded cache entry name), but I am not sure if this is the best solution (it's undocumented and would completely disable the default cache entry matching mechanism).
Are there any general recommendations on how the Gradle cache should be handled across multiple GH workflows now?
I.e. not sharing the cache between different workflows at all? Populating a single Gradle cache (by running all Gradle tasks in a single job) that all other workflows will fall back to?
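For reference, a hedged sketch of the GRADLE_BUILD_ACTION_CACHE_KEY_JOB workaround mentioned above. The variable is undocumented, so its placement as a job-level env var and its effect are assumptions based on this discussion rather than on the action's documentation.

```yaml
jobs:
  ci-test:
    runs-on: ubuntu-latest
    env:
      # Assumption: hardcoding the job portion of the cache key so that matching
      # jobs in different workflows resolve to the same entry (undocumented,
      # and it bypasses the default cache entry matching mechanism).
      GRADLE_BUILD_ACTION_CACHE_KEY_JOB: ci-test
    steps:
      - uses: actions/checkout@v4
      - uses: gradle/gradle-build-action@v2
      - run: ./gradlew test
```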