Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ns/batched fsync fix long path #3494

Conversation

neerajsi-msft
Copy link

Thanks for taking the time to contribute to Git!

Those seeking to contribute to the Git for Windows fork should see
http://gitforwindows.org/#contribute on how to contribute Windows specific
enhancements.

If your contribution is for the core Git functions and documentation
please be aware that the Git community does not use the github.com issues
or pull request mechanism for their contributions.

Instead, we use the Git mailing list ([email protected]) for code and
documentation submissions, code reviews, and bug reports. The
mailing list is plain text only (anything with HTML is sent directly
to the spam folder).

Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/)
to conveniently send your Pull Requests commits to our mailing list.

Please read the "guidelines for contributing" linked above!

neerajsi-msft and others added 17 commits October 26, 2021 17:34
The tmp_objdir API provides the ability to create temporary object
directories, but was designed with the goal of having subprocesses
access these object stores, followed by the main process migrating
objects from it to the main object store or just deleting it.  The
subprocesses would view it as their primary datastore and write to it.

Here we add the tmp_objdir_replace_primary_odb function that replaces
the current process's writable "main" object directory with the
specified one. The previous main object directory is restored in either
tmp_objdir_migrate or tmp_objdir_destroy.

For the --remerge-diff usecase, add a new `will_destroy` flag in `struct
object_database` to mark ephemeral object databases that do not require
fsync durability.

Add 'git prune' support for removing temporary object databases, and
make sure that they have a name starting with tmp_ and containing an
operation-specific name.

Based-on-patch-by: Elijah Newren <[email protected]>

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
When creating a subprocess with a temporary ODB, we set the
GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not
to update refs, since the tmp-objdir may go away.

Introduce a similar mechanism for in-process temporary ODBs when
we call tmp_objdir_replace_primary_odb. Now both mechanisms set
the disable_ref_updates flag on the odb, which is queried by
the ref_transaction_prepare function.

Note: This change adds an assumption that the state of
the_repository is relevant for any ref transaction that might
be initiated. Unwinding this assumption should be straightforward
by saving the relevant repository to query in the transaction or
the ref_store.

Peff's test case was invoking ref updates via the cachetextconv
setting. That particular code silently does nothing when a ref
update is forbidden. See the call to notes_cache_put in
fill_textconv where errors are ignored.

Reported-by: Jeff King <[email protected]>

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
* ns/tmp-objdir:
  tmp-objdir: disable ref updates when replacing the primary odb
  tmp-objdir: new API for creating temporary writable databases
Preparation for adding bulk-fsync to the bulk-checkin.c infrastructure.

* Rename 'state' variable to 'bulk_checkin_state', since we will later
  be adding 'bulk_fsync_state'.  This also makes the variable easier to
  find in the debugger, since the name is more unique.

* Move the 'plugged' data member of 'bulk_checkin_state' into a separate
  static variable. Doing this avoids resetting the variable in
  finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we
  seem to unintentionally disable the plugging functionality the first
  time a new packfile must be created due to packfile size limits. While
  disabling the plugging state only results in suboptimal behavior for
  the current code, it would be fatal for the bulk-fsync functionality
  later in this patch series.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
When adding many objects to a repo with core.fsyncObjectFiles set to
true, the cost of fsync'ing each object file can become prohibitive.

One major source of the cost of fsync is the implied flush of the
hardware writeback cache within the disk drive. Fortunately, Windows,
and macOS offer mechanisms to write data from the filesystem page cache
without initiating a hardware flush. Linux has the sync_file_range API,
which issues a pagecache writeback request reliably after version 5.2.

This patch introduces a new 'core.fsyncObjectFiles = batch' option that
batches up hardware flushes. It hooks into the bulk-checkin plugging and
unplugging functionality and takes advantage of tmp-objdir.

When the new mode is enabled do the following for each new object:
1. Create the object in a tmp-objdir.
2. Issue a pagecache writeback request and wait for it to complete.

At the end of the entire transaction when unplugging bulk checkin:
1. Issue an fsync against a dummy file to flush the hardware writeback
   cache, which should by now have processed the tmp-objdir writes.
2. Rename all of the tmp-objdir files to their final names.
3. When updating the index and/or refs, we assume that Git will issue
   another fsync internal to that operation. This is not the case today,
   but may be a good extension to those components.

On a filesystem with a singular journal that is updated during name
operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS
we would expect the fsync to trigger a journal writeout so that this
sequence is enough to ensure that the user's data is durable by the time
the git command returns.

This change also updates the macOS code to trigger a real hardware flush
via fnctl(fd, F_FULLFSYNC) when fsync_or_die is called. Previously, on
macOS there was no guarantee of durability since a simple fsync(2) call
does not flush any hardware caches.

_Performance numbers_:

Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11.
	  This number is from a patch later in the series.

Adding 500 files to the repo with 'git add' Times reported in seconds.

core.fsyncObjectFiles | Linux | Mac   | Windows
----------------------|-------|-------|--------
                false | 0.06  |  0.35 | 0.61
                true  | 1.88  | 11.18 | 2.47
                batch | 0.15  |  0.41 | 1.53

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
This commit adds a win32 implementation for fsync_no_flush that is
called git_fsync. The 'NtFlushBuffersFileEx' function being called is
available since Windows 8. If the function is not available, we
return -1 and Git falls back to doing a full fsync.

The operating system is told to flush data only without a hardware
flush primitive. A later full fsync will cause the metadata log
to be flushed and then the disk cache to be flushed on NTFS and
ReFS. Other filesystems will treat this as a full flush operation.

I added a new file here for this system call so as not to conflict with
downstream changes in the git-for-windows repository related to fscache.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
The update-index functionality is used internally by 'git stash push' to
setup the internal stashed commit.

This change enables bulk-checkin for update-index infrastructure to
speed up adding new objects to the object database by leveraging the
pack functionality and the new bulk-fsync functionality.

There is some risk with this change, since under batch fsync, the object
files will not be available until the update-index is entirely complete.
This usage is unlikely, since any tool invoking update-index and
expecting to see objects would have to synchronize with the update-index
process after passing it a file path.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
The unpack-objects functionality is used by fetch, push, and fast-import
to turn the transfered data into object database entries when there are
fewer objects than the 'unpacklimit' setting.

By enabling bulk-checkin when unpacking objects, we can take advantage
of batched fsyncs.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Add test cases to exercise batch mode for:
 * 'git add'
 * 'git stash'
 * 'git update-index'
 * 'git unpack-objects'

These tests ensure that the added data winds up in the object database.

In this change we introduce a new test helper lib-unique-files.sh. The
goal of this library is to create a tree of files that have different
oids from any other files that may have been created in the current test
repo. This helps us avoid missing validation of an object being added due
to it already being in the repo.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Add a basic performance test for "git add" and "git stash" of a lot of
new objects with various fsync settings.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
In preparation for integrating the new `core.fsyncObjectFiles = batch`
mode and using it by default, let's revert this commit, then redo it in
a slightly different way.

Signed-off-by: Johannes Schindelin <[email protected]>
Let's revert the commit that turned on that mode by default, in
preparation for merging the code supporting a batch mode instead, and
then making _that_ the new default.

Signed-off-by: Johannes Schindelin <[email protected]>
This merges the topic branch (specifically backported onto v2.33.1 to
allow for integrating into Git for Windows' `main` branch) that strikes
a better balance between safety and speed: rather than `fsync()`ing each
and every loose object file, we now offer to do it in a batch.

This will become the new default in Git for Windows.

Signed-off-by: Johannes Schindelin <[email protected]>
For several years, Git for Windows ran with the `core.fsyncObjectFiles`
mode enabled. This forced the disk to be synchronized after each and
every loose object file was written, adding a substantial layer of
safety (otherwise, a crash of the Operating System might have left files
in place of the correct size, but filled with NUL bytes).

In contrast to what this maintainers' assessment of the performance
implications was (namely, that they should be negligible), it turns out
that in particular when working on large monorepos, it _does_ have a
high impact.

In late 2021, Git learned a batched mode in the meantime, a mode that
would batch those synchronizations, which strikes a better balance
between safety and performance. Let's switch to that mode as the new
default.

Signed-off-by: Johannes Schindelin <[email protected]>
Fix prune code to be able to delete multiple object directories. I
wasn't properly resetting the strbuf with the path.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
When setup_work_tree executes, it redoes setup of the object database
path and various other aspects of the_repository.  This destroys the
temporary object database state.

This commit removes the temporary object database and reapplies it
around the operations in the chdir_notify callback.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
The `xutftowcs_path` function canonicalizes absolute paths using GetFullPathNameW.
This canonicalization may change the length of the string (e.g. getting rid of \.\),
which breaks callers that pass the template string in a strbuf and expect the
length of the string to remain the same.

In my particular case, the tmp-objdir code is passing a strbuf to mkdtemp and is
breaking since the strbuf.len is no longer synchronized with strlen(strbuf.buf).

Signed-off-by: Neeraj K. Singh <[email protected]>
@dscho
Copy link
Member

dscho commented Oct 28, 2021

Continuing in #3492.

@dscho dscho closed this Oct 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants