Cache: Add immutable input to allow writing to cache #3090

Closed
wants to merge 1 commit into from
@JohnHolmesII JohnHolmesII commented Aug 24, 2020

I've been needing this for a long time, really since the Cache task was still in beta. I've been looking through the issues and the docs, and I've concluded that this kind of functionality really needs to be here, in a dedicated form.

The docs compare and contrast Artifacts with Caching, saying Artifacts are for necessary outputs from one stage/pipeline to another, and Caching is for dependencies which might be acquirable elsewhere. It seems to me there is a third type, and that is actual file caching of compiled output.

In looking around, I saw this comment, which mentions a "rolling" cache idea, but the poster seems to be overthinking the keys. Most build systems (e.g. make, msbuild) already check for outdated output and recompile incrementally. I see no need for any kind of commit SHA checking. Simply allow users to save the data, and let the build systems handle the rest.

I also saw an aside about security concerns. I can't really imagine how or why that's an issue, but at least by adding an input that is defaulted to read-only, there shouldn't be any worries about issues popping up for various pipelines.

Let me do my best to emphasize that this is desperately needed. Build times are absurd without caching. I wish msbuild had a more robust solution than what it has now, but we can't even use that without a mutable cache. The fallback key system is too clumsy, and it can lead to odd behavior: if we write to a specific key and read from a general key, then we have to import all previous caches. If those caches contain the same files, which files wind up being used?

Please give this PR some consideration; I would really appreciate it.

P.S. I was unable to build either side of this, so it is untested. If it is wildly wrong, I will gladly work to get it right.

Cross PR: microsoft/azure-pipelines-tasks#13459


ghost commented Aug 24, 2020

CLA assistant check
All CLA requirements met.

@JohnHolmesII
Author

The agent guidelines don't mention anything, but the tasks contributing guidelines say to tag whoever has commits on the files, so here goes: @fadnavistanmay

(I'm sorry if I'm not supposed to ping, hope it's not a bother 😄)

@johnterickson
Contributor

@JohnHolmesII Unless I am missing something, the task already has the functionality that you need with restoreKeys:

inputs:
    key: 'NumberOfTimesIHadToResetTheCache=0 | KeepRolling | $(Build.SourceVersion)'
    restoreKeys: |
       NumberOfTimesIHadToResetTheCache=0 | KeepRolling

When there is no exact match for the key, the cache will restore the most recent entry matching NumberOfTimesIHadToResetTheCache=0 | KeepRolling | *
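
For context, the inputs above might sit in a full pipeline step roughly as follows. This is only a sketch: the cached path and the build command are placeholders that do not come from this thread, and `cacheHitVar` is included only to show where the restore result could be read.

```yaml
# Sketch of a complete Cache@2 step using the rolling-key pattern above.
steps:
- task: Cache@2
  displayName: Restore rolling build cache
  inputs:
    # The full key changes with every commit, so a new commit rarely hits exactly.
    key: 'NumberOfTimesIHadToResetTheCache=0 | KeepRolling | $(Build.SourceVersion)'
    # Fallback prefix: used to restore an earlier entry when the full key misses.
    restoreKeys: |
      NumberOfTimesIHadToResetTheCache=0 | KeepRolling
    # Assumption: hypothetical output folder; point this at your build output.
    path: $(Build.SourcesDirectory)/out
    # Variable the task sets to report whether anything was restored.
    cacheHitVar: CACHE_RESTORED

# Assumption: any incremental build tool can stand in for make here.
- script: make
  displayName: Incremental build
```

Bumping the number in `NumberOfTimesIHadToResetTheCache=0` changes both the key and the restore prefix, which should leave older entries unmatched and start a fresh cache.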

@mgrilec

mgrilec commented Apr 25, 2023

@JohnHolmesII Unless I am missing something, the task already has the functionality that you need with restoreKeys:

inputs:
    key: 'NumberOfTimesIHadToResetTheCache=0 | KeepRolling | $(Build.SourceVersion)'
    restoreKeys: |
       NumberOfTimesIHadToResetTheCache=0 | KeepRolling

When there is no exact match for the key, the cache will restore the most recent entry matching NumberOfTimesIHadToResetTheCache=0 | KeepRolling | *

This is not the same as having a mutable cache. With the current implementation, when there's a hit, the cache won't get updated.

A use case for a mutable cache is a build system caching its own results, i.e. I'm using nx with its output cache. Whenever I run a command (lint or build), nx stores the result in its cache. The cache fills up over successive pipeline runs as various parts of the code get modified, making each subsequent run faster until it's fully built up. When specific files are changed, the cache would get cleared.

I'm not sure about the implementation, but the motivation and described features would fit us perfectly.
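
Until something like a mutable cache exists, one possible approximation of the nx use case above is to make the full key unique per run, so that an exact hit never occurs and the task saves a fresh entry at the end of every job. This is a sketch under assumptions, not a confirmed recommendation from the maintainers: the cache directory and the build script are hypothetical and depend on the nx version and workspace setup.

```yaml
# Sketch: approximate a mutable cache by never hitting the full key exactly.
steps:
- task: Cache@2
  displayName: Restore nx output cache
  inputs:
    # $(Build.BuildId) differs on every run, so the full key always misses
    # and a new entry is expected to be saved when the job finishes.
    key: 'nx | "$(Agent.OS)" | $(Build.BuildId)'
    # The prefix restores an entry saved by an earlier run.
    restoreKeys: |
      nx | "$(Agent.OS)"
    # Assumption: location of the nx cache; adjust for your nx version.
    path: $(Build.SourcesDirectory)/.nx/cache

# Assumption: a package.json script that invokes nx lint/build targets.
- script: npm run build
  displayName: Build with nx
```

The trade-off is that every run uploads a new cache entry, which costs upload time and storage even when little has changed.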

3 participants