maintenance: add new cache-local-objects maintenance task #720

Merged
merged 4 commits into from
Jan 31, 2025

Conversation


@mjcheetham mjcheetham commented Jan 21, 2025

Introduce a new maintenance task, cache-local-objects, that operates on Scalar or VFS for Git repositories with a per-volume, shared object cache (specified by gvfs.sharedCache) to migrate packfiles and loose objects from the repository object directory to the shared cache.

Older versions of microsoft/git incorrectly placed packfiles in the repository object directory instead of the shared cache; this task will help clean up existing clones impacted by that issue.

Fixes #716

@mjcheetham mjcheetham force-pushed the scalar-cache-task branch 2 times, most recently from 1c29b72 to 4a5255d Compare January 21, 2025 09:41
@mjcheetham mjcheetham force-pushed the scalar-cache-task branch 2 times, most recently from cb383ec to 904f61a Compare January 21, 2025 13:47
@mjcheetham mjcheetham changed the base branch from vfs-2.47.1 to vfs-2.47.2 January 22, 2025 14:14
@mjcheetham mjcheetham force-pushed the scalar-cache-task branch 2 times, most recently from 9c30bd3 to e3d64ab Compare January 22, 2025 15:07
@mjcheetham mjcheetham marked this pull request as ready for review January 22, 2025 15:08
@mjcheetham mjcheetham force-pushed the scalar-cache-task branch 2 times, most recently from 66a83df to 6981c37 Compare January 23, 2025 11:06
@mjcheetham

@derrickstolee @dscho I've addressed the issues raised in the comments so far. Please could I have another look over? Thanks! :)

builtin/gc.c Outdated
Comment on lines 1434 to 1435
static void move_pack_to_vfs_cache(const char *full_path, size_t full_path_len,
const char *file_name, UNUSED void *data)

It might make sense to merge migrate_pack() directly into this function, after the .pack suffix check.

@mjcheetham mjcheetham force-pushed the scalar-cache-task branch 2 times, most recently from 097250d to dd3cb57 Compare January 28, 2025 16:46
dscho
dscho previously approved these changes Jan 30, 2025

@dscho dscho left a comment

Looks good to me!

Tests in t7900 assume the state of the `maintenance.strategy`
config setting, which is set or unset by previous tests. Correct
this by explicitly unsetting and re-setting the config at the start
of the tests.

Signed-off-by: Matthew John Cheetham <[email protected]>
Introduce a new maintenance task, `cache-local-objects`, that operates
on Scalar or VFS for Git repositories with a per-volume, shared object
cache (specified by `gvfs.sharedCache`) to migrate packfiles and loose
objects from the repository object directory to the shared cache.

Older versions of `microsoft/git` incorrectly placed packfiles in the
repository object directory instead of the shared cache; this task will
help clean up existing clones impacted by that issue.

Migration of packfiles involves the following steps for each pack:

1. Hardlink (or copy):
   a. the .pack file
   b. the .keep file
   c. the .rev file
2. Move (or copy + delete) the .idx file
3. Delete/unlink:
   a. the .pack file
   b. the .keep file
   c. the .rev file

Moving the index file after the others ensures the pack is not read
from the new cache directory until all associated files (rev, keep)
exist in the cache directory also.
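
The ordering above can be sketched in plain POSIX C. This is a minimal illustration and not the code from builtin/gc.c: the names `copy_file`, `link_or_copy` and `migrate_pack_sketch` are hypothetical, paths are assumed to fit in fixed-size buffers, and only the .pack file is treated as required.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Copy src to dst byte-for-byte; returns 0 on success, -1 on failure. */
static int copy_file(const char *src, const char *dst)
{
	char buf[8192];
	size_t n;
	int ok = 1;
	FILE *in = fopen(src, "rb");
	FILE *out;

	if (!in)
		return -1;
	out = fopen(dst, "wb");
	if (!out) {
		fclose(in);
		return -1;
	}
	while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
		if (fwrite(buf, 1, n, out) != n) {
			ok = 0;
			break;
		}
	}
	if (ferror(in))
		ok = 0;
	fclose(in);
	if (fclose(out) != 0)
		ok = 0;
	if (!ok)
		remove(dst); /* do not leave a partial file behind */
	return ok ? 0 : -1;
}

/* Hardlink src to dst, falling back to a copy if linking fails. */
static int link_or_copy(const char *src, const char *dst, int required)
{
	if (access(src, F_OK) != 0)
		return required ? -1 : 0; /* .keep and .rev may be absent */
	if (link(src, dst) == 0)
		return 0;
	return copy_file(src, dst);
}

/* Hypothetical helper: migrate <src_dir>/<base>.* into dst_dir. */
int migrate_pack_sketch(const char *src_dir, const char *dst_dir,
			const char *base)
{
	static const char *const exts[] = { "pack", "keep", "rev" };
	char src[4096], dst[4096];
	size_t i;

	/* Step 1: hardlink (or copy) .pack, .keep and .rev. */
	for (i = 0; i < 3; i++) {
		snprintf(src, sizeof(src), "%s/%s.%s", src_dir, base, exts[i]);
		snprintf(dst, sizeof(dst), "%s/%s.%s", dst_dir, base, exts[i]);
		if (link_or_copy(src, dst, i == 0) < 0)
			return -1;
	}

	/* Step 2: move the .idx last, so the pack only becomes visible
	 * in dst_dir once its companion files are already there. */
	snprintf(src, sizeof(src), "%s/%s.idx", src_dir, base);
	snprintf(dst, sizeof(dst), "%s/%s.idx", dst_dir, base);
	if (rename(src, dst) != 0) {
		if (copy_file(src, dst) < 0 || unlink(src) != 0)
			return -1;
	}

	/* Step 3: unlink the originals from the source directory. */
	for (i = 0; i < 3; i++) {
		snprintf(src, sizeof(src), "%s/%s.%s", src_dir, base, exts[i]);
		if (unlink(src) != 0 && errno != ENOENT)
			return -1;
	}
	return 0;
}
```

Note that the copy fallback for the .idx file in this sketch is not atomic; a production version would likely write to a temporary name and rename it into place.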

Moving loose objects operates as a move, or copy + delete.
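
For loose objects, a rough sketch of "move, or copy + delete" might look like the following. It relies on the standard loose object layout (the first two hex characters of the object ID name a fan-out directory) and creates that directory in the destination if needed. The function names `copy_then_delete` and `move_loose_object_sketch` are hypothetical and do not come from the patch.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback path: copy src to dst, then delete src. */
static int copy_then_delete(const char *src, const char *dst)
{
	char buf[8192];
	size_t n;
	int ok = 1;
	FILE *in = fopen(src, "rb");
	FILE *out;

	if (!in)
		return -1;
	out = fopen(dst, "wb");
	if (!out) {
		fclose(in);
		return -1;
	}
	while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
		if (fwrite(buf, 1, n, out) != n) {
			ok = 0;
			break;
		}
	}
	if (ferror(in))
		ok = 0;
	fclose(in);
	if (fclose(out) != 0)
		ok = 0;
	if (!ok) {
		remove(dst); /* keep the source if the copy failed */
		return -1;
	}
	return unlink(src);
}

/* Move <src_objdir>/xx/yyyy... to <dst_objdir>/xx/yyyy... */
int move_loose_object_sketch(const char *src_objdir,
			     const char *dst_objdir, const char *hex_oid)
{
	char src[4096], dst_dir[4096], dst[4200];

	if (strlen(hex_oid) < 3)
		return -1;
	snprintf(src, sizeof(src), "%s/%.2s/%s",
		 src_objdir, hex_oid, hex_oid + 2);
	snprintf(dst_dir, sizeof(dst_dir), "%s/%.2s", dst_objdir, hex_oid);
	snprintf(dst, sizeof(dst), "%s/%s", dst_dir, hex_oid + 2);

	/* Make sure the fan-out directory exists in the destination. */
	if (mkdir(dst_dir, 0777) != 0 && errno != EEXIST)
		return -1;

	/* Prefer a plain move; fall back to copy + delete. */
	if (rename(src, dst) == 0)
		return 0;
	return copy_then_delete(src, dst);
}
```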

Signed-off-by: Matthew John Cheetham <[email protected]>
Add the `cache-local-objects` maintenance task to the list of tasks run
by the `scalar run` command. It's often easier for users to run the
shorter `scalar run` command than the equivalent `git maintenance`
command.

Signed-off-by: Matthew John Cheetham <[email protected]>
@mjcheetham mjcheetham changed the title from "maintenance: add new vfs-cache-move maintenance task" to "maintenance: add new cache-local-objects maintenance task" Jan 30, 2025

@derrickstolee derrickstolee left a comment

Thanks, @mjcheetham! I tested this with my local Office enlistment and it worked great to move both loose objects and packfiles. I'm excited for this to help our customers!

@mjcheetham mjcheetham merged commit b5b340d into microsoft:vfs-2.47.2 Jan 31, 2025
49 checks passed
@mjcheetham mjcheetham deleted the scalar-cache-task branch January 31, 2025 14:05
dscho pushed a commit that referenced this pull request Feb 5, 2025
dscho pushed a commit that referenced this pull request Feb 10, 2025
@dscho dscho mentioned this pull request Feb 10, 2025
dscho pushed a commit that referenced this pull request Feb 27, 2025
Linked issue: Scalar: Move local .git objects to scalar cache for efficiency and behavior breaks
3 participants