Ns/batched fsync fix long path #3494

neerajsi-msft · 2021-10-27T21:43:43Z

Thanks for taking the time to contribute to Git!

Those seeking to contribute to the Git for Windows fork should see
http://gitforwindows.org/#contribute on how to contribute Windows specific
enhancements.

If your contribution is for the core Git functions and documentation
please be aware that the Git community does not use the github.com issues
or pull request mechanism for their contributions.

Instead, we use the Git mailing list ([email protected]) for code and
documentation submissions, code reviews, and bug reports. The
mailing list is plain text only (anything with HTML is sent directly
to the spam folder).

Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/)
to conveniently send your Pull Requests commits to our mailing list.

Please read the "guidelines for contributing" linked above!

The tmp_objdir API provides the ability to create temporary object directories, but was designed with the goal of having subprocesses access these object stores, followed by the main process migrating objects from it to the main object store or just deleting it. The subprocesses would view it as their primary datastore and write to it. Here we add the tmp_objdir_replace_primary_odb function that replaces the current process's writable "main" object directory with the specified one. The previous main object directory is restored in either tmp_objdir_migrate or tmp_objdir_destroy. For the --remerge-diff usecase, add a new `will_destroy` flag in `struct object_database` to mark ephemeral object databases that do not require fsync durability. Add 'git prune' support for removing temporary object databases, and make sure that they have a name starting with tmp_ and containing an operation-specific name. Based-on-patch-by: Elijah Newren <[email protected]> Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

When creating a subprocess with a temporary ODB, we set the GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not to update refs, since the tmp-objdir may go away. Introduce a similar mechanism for in-process temporary ODBs when we call tmp_objdir_replace_primary_odb. Now both mechanisms set the disable_ref_updates flag on the odb, which is queried by the ref_transaction_prepare function. Note: This change adds an assumption that the state of the_repository is relevant for any ref transaction that might be initiated. Unwinding this assumption should be straightforward by saving the relevant repository to query in the transaction or the ref_store. Peff's test case was invoking ref updates via the cachetextconv setting. That particular code silently does nothing when a ref update is forbidden. See the call to notes_cache_put in fill_textconv where errors are ignored. Reported-by: Jeff King <[email protected]> Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

* ns/tmp-objdir: tmp-objdir: disable ref updates when replacing the primary odb tmp-objdir: new API for creating temporary writable databases

Preparation for adding bulk-fsync to the bulk-checkin.c infrastructure. * Rename 'state' variable to 'bulk_checkin_state', since we will later be adding 'bulk_fsync_state'. This also makes the variable easier to find in the debugger, since the name is more unique. * Move the 'plugged' data member of 'bulk_checkin_state' into a separate static variable. Doing this avoids resetting the variable in finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we seem to unintentionally disable the plugging functionality the first time a new packfile must be created due to packfile size limits. While disabling the plugging state only results in suboptimal behavior for the current code, it would be fatal for the bulk-fsync functionality later in this patch series. Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

When adding many objects to a repo with core.fsyncObjectFiles set to true, the cost of fsync'ing each object file can become prohibitive. One major source of the cost of fsync is the implied flush of the hardware writeback cache within the disk drive. Fortunately, Windows, and macOS offer mechanisms to write data from the filesystem page cache without initiating a hardware flush. Linux has the sync_file_range API, which issues a pagecache writeback request reliably after version 5.2. This patch introduces a new 'core.fsyncObjectFiles = batch' option that batches up hardware flushes. It hooks into the bulk-checkin plugging and unplugging functionality and takes advantage of tmp-objdir. When the new mode is enabled do the following for each new object: 1. Create the object in a tmp-objdir. 2. Issue a pagecache writeback request and wait for it to complete. At the end of the entire transaction when unplugging bulk checkin: 1. Issue an fsync against a dummy file to flush the hardware writeback cache, which should by now have processed the tmp-objdir writes. 2. Rename all of the tmp-objdir files to their final names. 3. When updating the index and/or refs, we assume that Git will issue another fsync internal to that operation. This is not the case today, but may be a good extension to those components. On a filesystem with a singular journal that is updated during name operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS we would expect the fsync to trigger a journal writeout so that this sequence is enough to ensure that the user's data is durable by the time the git command returns. This change also updates the macOS code to trigger a real hardware flush via fnctl(fd, F_FULLFSYNC) when fsync_or_die is called. Previously, on macOS there was no guarantee of durability since a simple fsync(2) call does not flush any hardware caches. _Performance numbers_: Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD. Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD. Windows - Same host as Linux, a preview version of Windows 11. This number is from a patch later in the series. Adding 500 files to the repo with 'git add' Times reported in seconds. core.fsyncObjectFiles | Linux | Mac | Windows ----------------------|-------|-------|-------- false | 0.06 | 0.35 | 0.61 true | 1.88 | 11.18 | 2.47 batch | 0.15 | 0.41 | 1.53 Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

This commit adds a win32 implementation for fsync_no_flush that is called git_fsync. The 'NtFlushBuffersFileEx' function being called is available since Windows 8. If the function is not available, we return -1 and Git falls back to doing a full fsync. The operating system is told to flush data only without a hardware flush primitive. A later full fsync will cause the metadata log to be flushed and then the disk cache to be flushed on NTFS and ReFS. Other filesystems will treat this as a full flush operation. I added a new file here for this system call so as not to conflict with downstream changes in the git-for-windows repository related to fscache. Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

The update-index functionality is used internally by 'git stash push' to setup the internal stashed commit. This change enables bulk-checkin for update-index infrastructure to speed up adding new objects to the object database by leveraging the pack functionality and the new bulk-fsync functionality. There is some risk with this change, since under batch fsync, the object files will not be available until the update-index is entirely complete. This usage is unlikely, since any tool invoking update-index and expecting to see objects would have to synchronize with the update-index process after passing it a file path. Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

The unpack-objects functionality is used by fetch, push, and fast-import to turn the transfered data into object database entries when there are fewer objects than the 'unpacklimit' setting. By enabling bulk-checkin when unpacking objects, we can take advantage of batched fsyncs. Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

Add test cases to exercise batch mode for: * 'git add' * 'git stash' * 'git update-index' * 'git unpack-objects' These tests ensure that the added data winds up in the object database. In this change we introduce a new test helper lib-unique-files.sh. The goal of this library is to create a tree of files that have different oids from any other files that may have been created in the current test repo. This helps us avoid missing validation of an object being added due to it already being in the repo. Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

Add a basic performance test for "git add" and "git stash" of a lot of new objects with various fsync settings. Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

In preparation for integrating the new `core.fsyncObjectFiles = batch` mode and using it by default, let's revert this commit, then redo it in a slightly different way. Signed-off-by: Johannes Schindelin <[email protected]>

Let's revert the commit that turned on that mode by default, in preparation for merging the code supporting a batch mode instead, and then making _that_ the new default. Signed-off-by: Johannes Schindelin <[email protected]>

This merges the topic branch (specifically backported onto v2.33.1 to allow for integrating into Git for Windows' `main` branch) that strikes a better balance between safety and speed: rather than `fsync()`ing each and every loose object file, we now offer to do it in a batch. This will become the new default in Git for Windows. Signed-off-by: Johannes Schindelin <[email protected]>

For several years, Git for Windows ran with the `core.fsyncObjectFiles` mode enabled. This forced the disk to be synchronized after each and every loose object file was written, adding a substantial layer of safety (otherwise, a crash of the Operating System might have left files in place of the correct size, but filled with NUL bytes). In contrast to what this maintainers' assessment of the performance implications was (namely, that they should be negligible), it turns out that in particular when working on large monorepos, it _does_ have a high impact. In late 2021, Git learned a batched mode in the meantime, a mode that would batch those synchronizations, which strikes a better balance between safety and performance. Let's switch to that mode as the new default. Signed-off-by: Johannes Schindelin <[email protected]>

Fix prune code to be able to delete multiple object directories. I wasn't properly resetting the strbuf with the path. Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Johannes Schindelin <[email protected]>

When setup_work_tree executes, it redoes setup of the object database path and various other aspects of the_repository. This destroys the temporary object database state. This commit removes the temporary object database and reapplies it around the operations in the chdir_notify callback. Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Johannes Schindelin <[email protected]>

The `xutftowcs_path` function canonicalizes absolute paths using GetFullPathNameW. This canonicalization may change the length of the string (e.g. getting rid of \.\), which breaks callers that pass the template string in a strbuf and expect the length of the string to remain the same. In my particular case, the tmp-objdir code is passing a strbuf to mkdtemp and is breaking since the strbuf.len is no longer synchronized with strlen(strbuf.buf). Signed-off-by: Neeraj K. Singh <[email protected]>

dscho · 2021-10-28T08:44:52Z

Continuing in #3492.

neerajsi-msft and others added 17 commits October 26, 2021 17:34

Merge branch 'ns/tmp-objdir' into ns/batched-fsync

b183f4d

* ns/tmp-objdir: tmp-objdir: disable ref updates when replacing the primary odb tmp-objdir: new API for creating temporary writable databases

core.fsyncobjectfiles: performance tests for add and stash

9e8a8f0

Add a basic performance test for "git add" and "git stash" of a lot of new objects with various fsync settings. Signed-off-by: Neeraj Singh <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

fixup! mingw: change core.fsyncObjectFiles = 1 by default

878669f

In preparation for integrating the new `core.fsyncObjectFiles = batch` mode and using it by default, let's revert this commit, then redo it in a slightly different way. Signed-off-by: Johannes Schindelin <[email protected]>

Merge branch 'revert-fsyncObjectFiles-on-by-default'

009844a

Let's revert the commit that turned on that mode by default, in preparation for merging the code supporting a batch mode instead, and then making _that_ the new default. Signed-off-by: Johannes Schindelin <[email protected]>

neerajsi-msft mentioned this pull request Oct 27, 2021

Switch to batched fsync by default #3492

Merged

dscho closed this Oct 28, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ns/batched fsync fix long path #3494

Ns/batched fsync fix long path #3494

neerajsi-msft commented Oct 27, 2021

dscho commented Oct 28, 2021

Ns/batched fsync fix long path #3494

Ns/batched fsync fix long path #3494

Conversation

neerajsi-msft commented Oct 27, 2021

dscho commented Oct 28, 2021