Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hashFiles of a non-wildcard reads the whole tree #319

Closed
michaelfig opened this issue Feb 5, 2020 · 5 comments
Closed

hashFiles of a non-wildcard reads the whole tree #319

michaelfig opened this issue Feb 5, 2020 · 5 comments
Labels
bug Something isn't working

Comments

@michaelfig
Copy link

michaelfig commented Feb 5, 2020

Refs Agoric/agoric-sdk#502
Refs actions/cache#177

Describe the bug
I expect hashFiles('yarn.lock') only to read $workspace/yarn.lock.

Instead, it tries to read the entire working directory subtree. That's unacceptable when a build may create millions of files.

To Reproduce
Steps to reproduce the behavior:

  1. Have a cache like:
    - name: cache node modules
      uses: actions/cache@v1
      with:
        path: ~/.cache/yarn
        key: ${{ runner.os }}-yarn-${{ hashFiles('yarn.lock') }}
        restore-keys: |
          ${{ runner.os }}-yarn-
  1. Create millions of files during a later step
  2. Run the action at least twice to prime the cache
  3. See that in the post job cleanup all those directory entries were scanned, causing the step to waste time (in my case, minutes)

Expected behavior

No time-consuming traversal of the entire directory tree just to read one file. That single hash takes 00:01:45 in a big tree when it should be milliseconds.

Runner Version and Platform

Version of your runner? Github.com current

OS of the machine running the runner? Ubuntu

What's not working?

Cache plugin cleanup takes ages even though there are no changes (cache hit).

Job Log Output

From https://github.com/Agoric/agoric-sdk/actions/runs/34932285

Wed, 05 Feb 2020 06:17:00 GMT
##[debug]....Evaluating runner:
Wed, 05 Feb 2020 06:17:00 GMT
##[debug]....=> Object
Wed, 05 Feb 2020 06:17:00 GMT
##[debug]....Evaluating String:
Wed, 05 Feb 2020 06:17:00 GMT
##[debug]....=> 'os'
Wed, 05 Feb 2020 06:17:00 GMT
##[debug]..=> 'Linux'
Wed, 05 Feb 2020 06:17:00 GMT
##[debug]..Evaluating hashFiles:
Wed, 05 Feb 2020 06:17:00 GMT
##[debug]....Evaluating String:
Wed, 05 Feb 2020 06:17:00 GMT
##[debug]....=> 'yarn.lock'
Wed, 05 Feb 2020 06:17:00 GMT
##[debug]Search root directory: '/home/runner/work/agoric-sdk/agoric-sdk'
Wed, 05 Feb 2020 06:17:18 GMT
##[debug]Search pattern: '/home/runner/work/agoric-sdk/agoric-sdk/yarn.lock'
Wed, 05 Feb 2020 06:18:16 GMT
##[debug]Found 2525470 files

Note how the above line took almost a full minute to execute.

Here's the rest of the step:

Wed, 05 Feb 2020 06:18:16 GMT
##[debug]1 matches to hash
Wed, 05 Feb 2020 06:18:16 GMT
##[debug]Hash /home/runner/work/agoric-sdk/agoric-sdk/yarn.lock
Wed, 05 Feb 2020 06:18:16 GMT
##[debug]Final hash result: 'c9a52e06293c448028063bd3e12465978bf2a97846ee469acd64cf676678609e'
Wed, 05 Feb 2020 06:18:16 GMT
##[debug]..=> 'c9a52e06293c448028063bd3e12465978bf2a97846ee469acd64cf676678609e'
Wed, 05 Feb 2020 06:18:16 GMT
##[debug]=> 'Linux-yarn-c9a52e06293c448028063bd3e12465978bf2a97846ee469acd64cf676678609e'
  Complete job

Runner and Worker's Diagnostic Logs

N/a

@michaelfig michaelfig added the bug Something isn't working label Feb 5, 2020
michaelfig added a commit to Agoric/agoric-sdk that referenced this issue Feb 5, 2020
We only need `packages/cosmic-swingset/go.sum` for Golang and
`yarn.lock` for Yarn.

Note that this is still expensive until actions/runner#319 is
fixed.
@michaelfig
Copy link
Author

michaelfig commented Feb 5, 2020

I would suggest that the hashFiles implementation could find the shortest common non-wildcard prefix of the supplied patterns (stopping at a segment containing wildcards), and only read the actual filesystem starting at that prefix.

So:

  • /home/runner/work/agoric-sdk/agoric-sdk/mydir/** would read the tree starting at /home/runner/work/agoric-sdk/agoric-sdk/mydir
  • /home/runner/work/agoric-sdk/agoric-sdk/yarn.lock would read the tree starting at /home/runner/work/agoric-sdk/agoric-sdk/yarn.lock
  • /home/runner/work/agoric-sdk/agoric-sdk/**/yarn.lock would read the tree starting at /home/runner/work/agoric-sdk/agoric-sdk
  • /home/runner/work/agoric-sdk/agoric-sdk/mydir/**\n/home/runner/work/agoric-sdk/agoric-sdk/yarn.lock would read the tree starting at /home/runner/work/agoric-sdk/agoric-sdk

@michaelfig
Copy link
Author

I see that this was fixed in #268

Where would I find out the ETA for Github Actions to be upgraded to this version?

@ericsciple
Copy link
Collaborator

@TingluoHuang do you know whats the state of the runner release that contains glob fixes?

@TingluoHuang
Copy link
Member

The runner is in ring 0, but we need service change to deploy in order to control the rollout

@TingluoHuang
Copy link
Member

this should be fixed now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants