Rebase to v2.48.1 #5411

Merged
merged 409 commits into from
Feb 13, 2025


dscho
Member

@dscho dscho commented Feb 10, 2025

I finally have enough confidence in the current state (especially after #5376).

dscho and others added 30 commits February 6, 2025 19:32
A long time ago, we decided to run tests in Git for Windows' SDK with
the default `winsymlinks` mode: copying instead of linking. This is
still the default mode of MSYS2 to this day.

However, this is not how most users run Git for Windows: As the majority
of Git for Windows' users seem to be on Windows 10 and newer, likely
having enabled Developer Mode (which allows creating symbolic links
without administrator privileges), they will run with symlink support
enabled.

This is the reason why it is crucial to get the fixes for CVE-2024-? to
the users, and also why it is crucial to ensure that the test suite
exercises the related test cases. This commit ensures the latter.

Signed-off-by: Johannes Schindelin <[email protected]>
The pack_name_hash() method has not been materially changed since it was
introduced in ce0bd64 (pack-objects: improve path grouping
heuristics., 2006-06-05). The intention here is to group objects by path
name, but also attempt to group similar file types together by making
the most-significant digits of the hash be focused on the final
characters.

Here's the crux of the implementation:

	/*
	 * This effectively just creates a sortable number from the
	 * last sixteen non-whitespace characters. Last characters
	 * count "most", so things that end in ".c" sort together.
	 */
	while ((c = *name++) != 0) {
		if (isspace(c))
			continue;
		hash = (hash >> 2) + (c << 24);
	}

As the comment mentions, this only cares about the last sixteen
non-whitespace characters. This causes some filenames to collide more
than others. Here are some examples that I've seen while investigating
repositories that are growing more than they should be:

 * "/CHANGELOG.json" is 15 characters, and is created by the beachball
   [1] tool. Only the final character of the parent directory can
   differentiate different versions of this file, but also only the two
   most-significant digits. If that character is a letter, then this is
   always a collision. Similar issues occur with the similar
   "/CHANGELOG.md" path, though there is more opportunity for
   differences in the parent directory.

 * Localization files frequently have common filenames but differentiate
   via parent directories. In C#, the name "/strings.resx.lcl" is used
   for these localization files and they will all collide in name-hash.

[1] https://github.com/microsoft/beachball

I've come across many other examples where some internal tool uses a
common name across multiple directories and is causing Git to repack
poorly due to name-hash collisions.
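
To make the collision concrete, here is a minimal standalone sketch that
wraps the loop quoted above in a function. The name legacy_name_hash() is
made up for this illustration. The final directory character sits 15
characters before the end of "x/CHANGELOG.json", so by the time the loop
finishes it has been shifted right by 30 bits and only its top bits can
still influence the result.

```c
#include <ctype.h>
#include <stdint.h>

/*
 * Illustrative wrapper (hypothetical name) around the loop quoted in
 * the commit message: each new character lands in the top byte while
 * everything seen earlier is shifted right by two bits.
 */
uint32_t legacy_name_hash(const char *name)
{
	uint32_t c, hash = 0;

	if (!name)
		return 0;
	while ((c = (unsigned char)*name++) != 0) {
		if (isspace(c))
			continue; /* whitespace never contributes */
		hash = (hash >> 2) + (c << 24);
	}
	return hash;
}
```

With this sketch, "a/CHANGELOG.json" and "b/CHANGELOG.json" produce the
same value, while "a.c" and "a.h" differ because the last character
occupies the most significant bits.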

It is clear that the existing name-hash algorithm is optimized for
repositories with short path names, but also is optimized for packing a
single snapshot of a repository, not a repository with many versions of
the same file. In my testing, this has proven out where the name-hash
algorithm does a good job of finding peer files as delta bases when
unable to use a historical version of that exact file.

However, for repositories that have many versions of most files and
directories, it is more important that the objects that appear at the
same path are grouped together.

Create a new pack_full_name_hash() method and a new --full-name-hash
option for 'git pack-objects' to call that method instead. Add a simple
pass-through for 'git repack --full-name-hash' for additional testing in
the context of a full repack, where I expect this will be most
effective.

The hash algorithm is as simple as possible to be reasonably effective:
for each character of the path string, add a multiple of that character
and a large prime number (chosen arbitrarily, but intended to be large
relative to the size of a uint32_t). Then, shift the current hash value
to the right by 5, with overlap. The addition and shift parameters are
standard mechanisms for creating hard-to-predict behaviors in the bits
of the resulting hash.

This is not meant to be cryptographic at all, but uniformly distributed
across the possible hash values. This creates a hash that appears
pseudorandom. There is no ability to consider similar file types as
being close to each other.
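
As a rough sketch of the description above (not necessarily the code in
this commit), the add-then-rotate step could look as follows. The function
name and the multiplier value are assumptions made for illustration; the
constant merely stands in for the "large prime", and "shift right by 5,
with overlap" is read here as a 32-bit rotation.

```c
#include <stdint.h>

/*
 * Hypothetical sketch of a full-path hash. For every character, add a
 * multiple of a large odd constant, then rotate right by 5 bits so that
 * no bits are discarded.
 */
uint32_t full_name_hash_sketch(const char *name)
{
	/* Assumed value: an arbitrary large odd multiplier. */
	const uint32_t multiplier = 1234572167U;
	uint32_t c, hash = multiplier;

	if (!name)
		return 0;
	while ((c = (unsigned char)*name++) != 0) {
		hash += c * multiplier;            /* wraps modulo 2^32 */
		hash = (hash >> 5) | (hash << 27); /* rotate right by 5 */
	}
	return hash;
}
```

Because every step is a bijection on 32-bit values and the multiplier is
odd, two paths of equal length that differ in exactly one character can
never collide in this sketch, no matter how far from the end of the path
that character is.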

In a later change, a test-tool will be added so the effectiveness of
this hash can be demonstrated directly.

For now, let's consider how effective this mechanism is when repacking a
repository with and without the --full-name-hash option. Specifically,
let's use 'git repack -adf [--full-name-hash]' as our test.

On the Git repository, we do not expect much difference. All path names
are short. This is backed by our results:

| Stage                 | Pack Size | Repack Time |
|-----------------------|-----------|-------------|
| After clone           | 260 MB    | N/A         |
| Standard Repack       | 127 MB    | 106s        |
| With --full-name-hash | 126 MB    | 99s         |

This example demonstrates how there is some natural overhead coming from
the cloned copy because the server is hosting many forks and has not
optimized for exactly this set of reachable objects. But the full repack
has similar characteristics with and without --full-name-hash.

However, we can test this in a repository that uses one of the
problematic naming conventions above. The fluentui [2] repo uses
beachball to generate CHANGELOG.json and CHANGELOG.md files, and these
files have very poor delta characteristics when comparing against
versions across parent directories.

| Stage                 | Pack Size | Repack Time |
|-----------------------|-----------|-------------|
| After clone           | 694 MB    | N/A         |
| Standard Repack       | 438 MB    | 728s        |
| With --full-name-hash | 168 MB    | 142s        |

[2] https://github.com/microsoft/fluentui

In this example, we see significant gains in the compressed packfile
size as well as the time taken to compute the packfile.

Using a collection of repositories that use the beachball tool, I was
able to make similar comparisons with dramatic results. While the
fluentui repo is public, the others are private so cannot be shared for
reproduction. The results are so significant that I find it important to
share here:

| Repo     | Standard Repack | With --full-name-hash |
|----------|-----------------|-----------------------|
| fluentui |         438 MB  |               168 MB  |
| Repo B   |       6,255 MB  |               829 MB  |
| Repo C   |      37,737 MB  |             7,125 MB  |
| Repo D   |     130,049 MB  |             6,190 MB  |

Future changes could include making --full-name-hash implied by a config
value or even implied by default during a full repack.

Signed-off-by: Derrick Stolee <[email protected]>
The new '--full-name-hash' option for 'git repack' is a simple
pass-through to the underlying 'git pack-objects' subcommand. However,
this subcommand may have other options and a temporary filename as part
of the subcommand execution that may not be predictable or could change
over time.

The existing test_subcommand method requires an exact list of arguments
for the subcommand. This is too rigid for our needs here, so create a
new method, test_subcommand_flex. Use it to check that the
--full-name-hash option is passing through.

Signed-off-by: Derrick Stolee <[email protected]>
Add a new environment variable to opt-in to the --full-name-hash option
in 'git pack-objects'. This allows for extra testing of the feature
without repeating all of the test scenarios.

But this option isn't free. There are a few tests that change behavior
with the variable enabled.

First, there are a few tests that are very sensitive to certain delta
bases being picked. These are both involving the generation of thin
bundles and then counting their objects via 'git index-pack --fix-thin'
which pulls the delta base into the new packfile. For these tests,
disable the option as a decent long-term solution.

Second, there are two tests in t5616-partial-clone.sh that I believe are
actually broken scenarios. While the client is set up to clone the
'promisor-server' repo via a treeless partial clone filter (tree:0),
that filter does not translate to the 'server' repo. Thus, fetching from
these repos causes the server to think that the client has all reachable
trees and blobs from the commits advertised as 'haves'. This leads the
server to providing a thin pack assuming those objects as delta bases.
Changing the name-hash algorithm presents new delta bases and thus
breaks the expectations of these tests. An alternative could be to set
up 'server' as a promisor server with the correct filter enabled. This
may also point out more issues with partial clone being set up as a
remote-based filtering mechanism and not a repository-wide setting. For
now, do the minimal change to make the test work by disabling the test
variable.

Signed-off-by: Derrick Stolee <[email protected]>
The `__MINGW64__` constant is defined, surprise, surprise, only when
building for a 64-bit CPU architecture.

Therefore, using it as a guard to define `_POSIX_C_SOURCE` (so that
`localtime_r()` is declared, among other functions) is not enough; we
also need to check `__MINGW32__`.

Technically, the latter constant is defined even for 64-bit builds. But
let's make things a bit easier to understand by testing for both
constants.

Making it so fixes this compile warning (turned error in GCC v14.1):

  archive-zip.c: In function 'dos_time':
  archive-zip.c:612:9: error: implicit declaration of function 'localtime_r';
  did you mean 'localtime_s'? [-Wimplicit-function-declaration]
    612 |         localtime_r(&time, &tm);
        |         ^~~~~~~~~~~
        |         localtime_s
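
For illustration only, a guard that tests both constants might look like
the sketch below. The value 1 and the helper local_year() are assumptions
made for this example, not a copy of the actual change. The feature-test
macro has to be defined before the first system header is included.

```c
/* Define the feature-test macro for both 32-bit and 64-bit MinGW. */
#if (defined(__MINGW32__) || defined(__MINGW64__)) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 1 /* assumed value, for illustration */
#endif

#include <time.h>

/*
 * Hypothetical helper: return the calendar year of `t` in local time,
 * or -1 on failure. It only compiles cleanly when localtime_r() has
 * been declared by <time.h>.
 */
int local_year(time_t t)
{
	struct tm tm;

	if (!localtime_r(&t, &tm))
		return -1;
	return tm.tm_year + 1900;
}
```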

Signed-off-by: Johannes Schindelin <[email protected]>
In order to be a better Windows citizen, Git should
save its configuration files in the AppData folder. This
enables Git configuration files to be replicated between machines
using the same Microsoft account logon, which would reduce the
friction of setting up Git on new systems. Therefore, if
%APPDATA%\Git\config exists, we use it; otherwise
$HOME/.config/git/config is used.
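
The lookup rule can be sketched as a small pure function. This is a
hypothetical helper, not Git's actual code: the caller passes in the two
directory values and whether the AppData candidate exists, so the rule
can be exercised without touching the file system.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write the preferred user-level config path into
 * `out` and return it. `appdata_config_exists` is the result of a
 * file-existence check that the caller performs on
 * <appdata>\Git\config.
 */
const char *pick_user_config(const char *appdata, const char *home,
			     int appdata_config_exists,
			     char *out, size_t outlen)
{
	if (appdata && *appdata && appdata_config_exists)
		snprintf(out, outlen, "%s\\Git\\config", appdata);
	else
		snprintf(out, outlen, "%s/.config/git/config",
			 home ? home : "");
	return out;
}
```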

Signed-off-by: Ariel Lourenco <[email protected]>
Git LFS is now built with Go 1.21 which no longer supports Windows 7.
However, Git for Windows still wants to support Windows 7.

Ideally, Git LFS would re-introduce Windows 7 support until Git for
Windows drops support for Windows 7, but that's not going to happen:
git-for-windows#4996 (comment)

The next best thing we can do is to let the users know what is
happening, and how to get out of their fix, at least.

This is not quite as easy as it would first seem because programs
compiled with Go 1.21 or newer will simply throw an exception and fail
with an Access Violation on Windows 7.

The only way I found to address this is to replicate the logic from Go's
very own `version` command (which can determine the Go version with
which a given executable was built) to detect the situation, and in that
case offer a helpful error message.
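
As a rough idea of what such a detection involves, the sketch below scans
a buffer for the build-info marker that Go embeds in its executables and
extracts the minor version from an inline "go1.N" string. It assumes the
header layout used by Go 1.18 and newer (14 magic bytes, a pointer-size
byte, a flags byte whose 0x2 bit marks inline strings, padding up to 32
bytes, then a length-prefixed version string). The function name is made
up, and the code in this commit may well differ.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Sketch only: return N for a binary built with "go1.N...", or -1 if
 * no inline Go version string is found in `buf`.
 */
int go_minor_version(const unsigned char *buf, size_t len)
{
	static const char magic[] = "\xff Go buildinf:";
	const size_t magic_len = sizeof(magic) - 1, header_len = 32;
	size_t i, j, n;

	for (i = 0; i + header_len + 1 <= len; i++) {
		const unsigned char *p = buf + i;
		const unsigned char *version = p + header_len + 1;
		int minor = 0;

		if (memcmp(p, magic, magic_len))
			continue;
		if (!(p[15] & 0x2))
			return -1; /* pointer-based layout: not handled here */
		n = p[header_len]; /* one-byte length is enough for "go1.x.y" */
		if (n >= 0x80 || n < 5 || n > len - i - header_len - 1)
			return -1;
		if (memcmp(version, "go1.", 4) || !isdigit(version[4]))
			return -1;
		for (j = 4; j < n && isdigit(version[j]); j++)
			minor = minor * 10 + (version[j] - '0');
		return minor;
	}
	return -1;
}
```

A caller running on Windows 7 could then print a hint instead of letting
the program crash whenever the returned minor version is 21 or higher.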

This addresses git-for-windows#4996.

Signed-off-by: Johannes Schindelin <[email protected]>
As reported in https://lore.kernel.org/git/[email protected]/,
libcurl v8.10.0 had a regression that was picked up by Git's t5559.30
"large fetch-pack requests can be sent using chunked encoding".

This bug was fixed in libcurl v8.10.1.

Sadly, the macos-13 runner image was updated in the brief window between
these two libcurl versions, breaking each and every CI build, as
reported at git-for-windows#5159.

This would usually not matter, we would just ignore the failing CI
builds until the macos-13 runner image is rebuilt in a couple of days,
and then the CI builds would succeed again.

However.

As has become the custom, a surprise Git version was released, and Git
for Windows wants to follow suit. Since Git for Windows has this custom
of trying to never release a version with a failing CI build, we _must_
work around it.

This patch implements this work-around, basically for the sake of Git
for Windows v2.46.2's CI build.

Signed-off-by: Johannes Schindelin <[email protected]>
In anticipation of a few planned applications, introduce the most basic form
of a path-walk API. It currently assumes that there are no UNINTERESTING
objects, and does not include any complicated filters. It calls a function
pointer on groups of tree and blob objects as grouped by path. This only
includes objects the first time they are discovered, so an object that
appears at multiple paths will not be included in two batches.
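
The batching behavior can be modeled with a toy example. Everything below
(the struct, the function names, the fixed-size arrays) is invented for
illustration and is far simpler than a real walk over commits and trees:
entries arrive as (path, object ID) pairs in discovery order, the callback
is invoked once per path, and an object that was already reported at an
earlier path is skipped.

```c
#include <stddef.h>
#include <string.h>

#define MAX_OBJECTS 64 /* toy limit for this sketch */

struct entry { const char *path; const char *oid; };
typedef void (*path_fn)(const char *path, const char **oids, size_t nr,
			void *data);

static int contains(const char **list, size_t nr, const char *item)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(list[i], item))
			return 1;
	return 0;
}

/* Call `fn` once per path with the objects first discovered there. */
void walk_by_path(const struct entry *entries, size_t nr, path_fn fn,
		  void *data)
{
	const char *seen[MAX_OBJECTS], *done[MAX_OBJECTS];
	const char *batch[MAX_OBJECTS];
	size_t nr_seen = 0, nr_done = 0, i, j;

	for (i = 0; i < nr && i < MAX_OBJECTS; i++) {
		size_t nr_batch = 0;

		if (contains(done, nr_done, entries[i].path))
			continue; /* this path was already reported */
		for (j = i; j < nr && j < MAX_OBJECTS; j++) {
			if (strcmp(entries[j].path, entries[i].path) ||
			    contains(seen, nr_seen, entries[j].oid))
				continue;
			seen[nr_seen++] = batch[nr_batch++] = entries[j].oid;
		}
		done[nr_done++] = entries[i].path;
		if (nr_batch)
			fn(entries[i].path, batch, nr_batch, data);
	}
}

/* Example callback: count batches and objects. */
struct walk_stats { size_t batches, objects; };

void count_batch(const char *path, const char **oids, size_t nr, void *data)
{
	struct walk_stats *stats = data;

	(void)path;
	(void)oids;
	stats->batches++;
	stats->objects += nr;
}
```

In this model, an object that shows up again under a second path never
produces a second report, which mirrors the "first time they are
discovered" rule described above.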

There are many future adaptations that could be made, but they are left for
future updates when consumers are ready to take advantage of those features.

Signed-off-by: Derrick Stolee <[email protected]>
This also adds the '--full-name-hash' option introduced in the previous
change and adds newlines to the synopsis.

Signed-off-by: Derrick Stolee <[email protected]>
Add some tests based on the current behavior, doing interesting checks
for different sets of branches, ranges, and the --boundary option. This
sets a baseline for the behavior and we can extend it as new options are
introduced.

Signed-off-by: Derrick Stolee <[email protected]>
As custom options are added to 'git pack-objects' and 'git repack' to
adjust how compression is done, use this new performance test script to
demonstrate their effectiveness in performance and size.

The recently-added --full-name-hash option swaps the default name-hash
algorithm with one that attempts to uniformly distribute the hashes
based on the full path name instead of the last 16 characters.

This has a dramatic effect on full repacks for repositories with many
versions of most paths. It can have a negative impact on cases such as
pushing a single change.

This can be seen by running p5313 on the open source fluentui
repository [1]. Most commits will have this kind of output for the thin
and big pack cases, though certain commits (such as [2]) will have
problematic thin pack size for other reasons.

[1] https://github.com/microsoft/fluentui
[2] a637a06df05360ce5ff21420803f64608226a875

Checked out at the parent of [2], I see the following statistics:

Test                                           this tree
------------------------------------------------------------------
5313.2: thin pack                              0.02(0.01+0.01)
5313.3: thin pack size                                    1.1K
5313.4: thin pack with --full-name-hash        0.02(0.01+0.00)
5313.5: thin pack size with --full-name-hash              3.0K
5313.6: big pack                               1.65(3.35+0.24)
5313.7: big pack size                                    58.0M
5313.8: big pack with --full-name-hash         1.53(2.52+0.18)
5313.9: big pack size with --full-name-hash              57.6M
5313.10: repack                                176.52(706.60+3.53)
5313.11: repack size                                    446.7K
5313.12: repack with --full-name-hash          37.47(134.18+3.06)
5313.13: repack size with --full-name-hash              183.1K

Note that this demonstrates a 3x size _increase_ in the case that
simulates a small "git push". The size change is neutral on the case of
pushing the difference between HEAD and HEAD~1000.

However, the full repack case is both faster and more efficient.

Signed-off-by: Derrick Stolee <[email protected]>
We add the ability to filter the object types in the path-walk API so
the callback function is called fewer times.

This adds the ability to ask for the commits in a list, as well. Future
changes will add the ability to visit annotated tags.

Signed-off-by: Derrick Stolee <[email protected]>
Add a new test-tool helper, name-hash, to output the value of the
name-hash algorithms for the input list of strings, one per line.

Since the name-hash values can be stored in the .bitmap files, it is
important that these hash functions do not change across Git versions.
Add a simple test to t5310-pack-bitmaps.sh to provide some testing of
the current values. Due to how these functions are implemented, it would
be difficult to change them without disturbing these values.

Create a performance test that uses test_size to demonstrate how
collisions occur for these hash algorithms. This test helps inform
someone as to the behavior of the name-hash algorithms for their repo
based on the paths at HEAD.
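
The two statistics reported by such a test, the number of distinct hash
values and the largest number of paths sharing one value, can be computed
as in the sketch below. It is only a model of the measurement under
assumed names (struct hash_stats, summarize_hashes()); the performance
test itself may gather these numbers differently.

```c
#include <stdint.h>
#include <stdlib.h>

struct hash_stats {
	size_t distinct;         /* number of different hash values */
	size_t max_multiplicity; /* most paths sharing a single value */
};

static int cmp_u32(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sorts `hashes` in place, then measures runs of equal values. */
struct hash_stats summarize_hashes(uint32_t *hashes, size_t nr)
{
	struct hash_stats stats = { 0, 0 };
	size_t i, run = 0;

	qsort(hashes, nr, sizeof(*hashes), cmp_u32);
	for (i = 0; i < nr; i++) {
		if (i && hashes[i] == hashes[i - 1]) {
			run++;
		} else {
			stats.distinct++;
			run = 1;
		}
		if (run > stats.max_multiplicity)
			stats.max_multiplicity = run;
	}
	return stats;
}
```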

My copy of the Git repository shows modest statistics around the
collisions of the default name-hash algorithm:

Test                                              this tree
-----------------------------------------------------------------
5314.1: paths at head                                        4.5K
5314.2: number of distinct name-hashes                       4.1K
5314.3: number of distinct full-name-hashes                  4.5K
5314.4: maximum multiplicity of name-hashes                    13
5314.5: maximum multiplicity of fullname-hashes                 1

Here, the maximum collision multiplicity is 13, but around 10% of paths
have a collision with another path.

In a more interesting example, the microsoft/fluentui [1] repo had these
statistics at time of committing:

Test                                              this tree
-----------------------------------------------------------------
5314.1: paths at head                                       19.6K
5314.2: number of distinct name-hashes                       8.2K
5314.3: number of distinct full-name-hashes                 19.6K
5314.4: maximum multiplicity of name-hashes                   279
5314.5: maximum multiplicity of fullname-hashes                 1

[1] https://github.com/microsoft/fluentui

That demonstrates that of the nearly twenty thousand path names, they
are assigned around eight thousand distinct values. 279 paths are
assigned to a single value, leading the packing algorithm to sort
objects from those paths together, by size.

In this repository, no collisions occur for the full-name-hash
algorithm.

In a more extreme example, an internal monorepo had a much worse
collision rate:

Test                                              this tree
-----------------------------------------------------------------
5314.1: paths at head                                      221.6K
5314.2: number of distinct name-hashes                      72.0K
5314.3: number of distinct full-name-hashes                221.6K
5314.4: maximum multiplicity of name-hashes                 14.4K
5314.5: maximum multiplicity of fullname-hashes                 2

Even in this repository with many more paths at HEAD, the collision rate
was low and the maximum number of paths being grouped into a single
bucket by the full-name-hash algorithm was two.

Signed-off-by: Derrick Stolee <[email protected]>
In anticipation of using the path-walk API to analyze tags or include
them in a pack-file, add the ability to walk the tags that were included
in the revision walk.

Signed-off-by: Derrick Stolee <[email protected]>
This option is still under discussion on the Git mailing list.

We still would like to have some real-world data, and the best way to
get it is to get a Git for Windows release into users' hands so that
they can test it.

Nevertheless, without the official blessing of the Git maintainer, this
option is experimental, and we need to be clear about that.

Signed-off-by: Johannes Schindelin <[email protected]>
The sparse tree walk algorithm was created in d5d2e93 (revision:
implement sparse algorithm, 2019-01-16) and involves using the
mark_trees_uninteresting_sparse() method. This method takes a repository
and an oidset of tree IDs, some of which have the UNINTERESTING flag and
some of which do not.

Create a method that has an equivalent set of preconditions but uses a
"dense" walk (recursively visits all reachable trees, as long as they
have not previously been marked UNINTERESTING). This is an important
difference from mark_tree_uninteresting(), which short-circuits if the
given tree has the UNINTERESTING flag.

A use of this method will be added in a later change, with a condition
set whether the sparse or dense approach should be used.

Signed-off-by: Derrick Stolee <[email protected]>
This option causes the path-walk API to act like the sparse tree-walk
algorithm implemented by mark_trees_uninteresting_sparse() in
list-objects.c.

Starting from the commits marked as UNINTERESTING, their root trees and
all objects reachable from those trees are UNINTERESTING, at least as we
walk path-by-path. When we reach a path where all objects associated
with that path are marked UNINTERESTING, then do not continue walking the
children of that path.

We need to be careful to pass the UNINTERESTING flag in a deep way on
the UNINTERESTING objects before we start the path-walk, or else the
depth-first search for the path-walk API may accidentally report some
objects as interesting.

Signed-off-by: Derrick Stolee <[email protected]>
This will be helpful in a future change.

Signed-off-by: Derrick Stolee <[email protected]>
In order to more easily compute delta bases among objects that appear at the
exact same path, add a --path-walk option to 'git pack-objects'.

This option will use the path-walk API instead of the object walk given by
the revision machinery. Since objects will be provided in batches
representing a common path, those objects can be tested for delta bases
immediately instead of waiting for a sort of the full object list by
name-hash. This has multiple benefits, including avoiding collisions by
name-hash.

The objects marked as UNINTERESTING are included in these batches, so we
are guaranteeing some locality to find good delta bases.

After the individual passes are done on a per-path basis, the default
name-hash is used to find other opportunistic delta bases that did not
match exactly by the full path name.

RFC TODO: It is important to note that this option is inherently
incompatible with using a bitmap index. This walk probably also does not
work with other advanced features, such as delta islands.

Getting ahead of myself, this option compares well with --full-name-hash
when the packfile is large enough, but also performs at least as well as
the default in all cases that I've seen.

RFC TODO: this should probably be recording the batch locations to another
list so they could be processed in a second phase using threads.

RFC TODO: list some examples of how this outperforms previous pack-objects
strategies. (This is coming in later commits that include performance
test changes.)

Signed-off-by: Derrick Stolee <[email protected]>
There are many tests that validate whether 'git pack-objects' works as
expected. Instead of duplicating these tests, add a new test environment
variable, GIT_TEST_PACK_PATH_WALK, that implies --path-walk by default
when specified.

This was useful in testing the --path-walk implementation, especially
in conjunction with tests such as:

 - t0411-clone-from-partial.sh : One test fetches from a repo that does
   not have the boundary objects. This causes the path-based walk to
   fail. Disable the variable for this test.

 - t5306-pack-nobase.sh : Similar to t0411, one test fetches from a repo
   without a boundary object.

 - t5310-pack-bitmaps.sh : One test compares the case when packing with
   bitmaps to the case when packing without them. Since we disable the
   test variable when writing bitmaps, this causes a difference in the
   object list (the --path-walk option adds an extra object). Specify
   --no-path-walk in both processes for the comparison. Another test
   checks for a specific delta base, but when computing dynamically
   without using bitmaps, the base object is too small to be considered
   in the delta calculations so no base is used.

 - t5316-pack-delta-depth.sh : This script cares about certain delta
   choices and their chain lengths. The --path-walk option changes how
   these chains are selected, and thus changes the results of this test.

 - t5322-pack-objects-sparse.sh : This demonstrates the effectiveness of
   the --sparse option and how it combines with --path-walk.

 - t5332-multi-pack-reuse.sh : This test verifies that the preferred
   pack is used for delta reuse when possible. The --path-walk option is
   not currently aware of the preferred pack at all, so finds a
   different delta base.

 - t7406-submodule-update.sh : When using the variable, the --depth
   option collides with the --path-walk feature, resulting in a warning
   message. Disable the variable so this warning does not appear.

I want to call out one specific test change that is only temporary:

 - t5530-upload-pack-error.sh : One test cares specifically about an
   "unable to read" error message. Since the current implementation
   performs delta calculations within the path-walk API callback, a
   different "unable to get size" error message appears. When this
   is changed in a future refactoring, this test change can be reverted.

Signed-off-by: Derrick Stolee <[email protected]>
Since 'git pack-objects' supports a --path-walk option, allow passing it
through in 'git repack'. This presents interesting testing opportunities for
comparing the different repacking strategies against each other.

Add the --path-walk option to the performance tests in p5313.

For the microsoft/fluentui repo [1] checked out at a specific commit [2],
the results are very interesting:

Test                                           this tree
------------------------------------------------------------------
5313.2: thin pack                              0.40(0.47+0.04)
5313.3: thin pack size                                    1.2M
5313.4: thin pack with --full-name-hash        0.09(0.10+0.04)
5313.5: thin pack size with --full-name-hash             22.8K
5313.6: thin pack with --path-walk             0.08(0.06+0.02)
5313.7: thin pack size with --path-walk                  20.8K
5313.8: big pack                               2.16(8.43+0.23)
5313.9: big pack size                                    17.7M
5313.10: big pack with --full-name-hash        1.42(3.06+0.21)
5313.11: big pack size with --full-name-hash             18.0M
5313.12: big pack with --path-walk             2.21(8.39+0.24)
5313.13: big pack size with --path-walk                  17.8M
5313.14: repack                                98.05(662.37+2.64)
5313.15: repack size                                    449.1K
5313.16: repack with --full-name-hash          33.95(129.44+2.63)
5313.17: repack size with --full-name-hash              182.9K
5313.18: repack with --path-walk               106.21(121.58+0.82)
5313.19: repack size with --path-walk                   159.6K

[1] https://github.com/microsoft/fluentui
[2] e70848ebac1cd720875bccaa3026f4a9ed700e08

This repo suffers from having a lot of paths that collide in the name
hash, so examining them in groups by path leads to better deltas. Also,
in this case, the single-threaded implementation is competitive with the
full repack. This is saving time diffing files that have significant
differences from each other.

A similar, but private, repo has even more extremes in the thin packs:

Test                                           this tree
--------------------------------------------------------------
5313.2: thin pack                              2.39(2.91+0.10)
5313.3: thin pack size                                    4.5M
5313.4: thin pack with --full-name-hash        0.29(0.47+0.12)
5313.5: thin pack size with --full-name-hash             15.5K
5313.6: thin pack with --path-walk             0.35(0.31+0.04)
5313.7: thin pack size with --path-walk                  14.2K

Notice, however, that while the --full-name-hash version is working
quite well in these cases for the thin pack, it does poorly for some
other standard cases, such as this test on the Linux kernel repository:

Test                                           this tree
--------------------------------------------------------------
5313.2: thin pack                              0.01(0.00+0.00)
5313.3: thin pack size                                     310
5313.4: thin pack with --full-name-hash        0.00(0.00+0.00)
5313.5: thin pack size with --full-name-hash              1.4K
5313.6: thin pack with --path-walk             0.00(0.00+0.00)
5313.7: thin pack size with --path-walk                    310

Here, the --full-name-hash option does much worse than the default name
hash, but the path-walk option does exactly as well.

Signed-off-by: Derrick Stolee <[email protected]>
Users may want to enable the --path-walk option for 'git pack-objects' by
default, especially underneath commands like 'git push' or 'git repack'.

This should be limited to client repositories, since the --path-walk option
disables bitmap walks, so would be bad to include in Git servers when
serving fetches and clones. There is potential that it may be helpful to
consider when repacking the repository, to take advantage of improved deltas
across historical versions of the same files.

Much like how "pack.useSparse" was introduced and included in
"feature.experimental" before being enabled by default, use the repository
settings infrastructure to make the new "pack.usePathWalk" config enabled by
"feature.experimental" and "feature.manyFiles".

Signed-off-by: Derrick Stolee <[email protected]>
Repositories registered with Scalar are expected to be client-only
repositories that are rather large. This means that they are more likely to
be good candidates for using the --path-walk option when running 'git
pack-objects', especially under the hood of 'git push'. Enable this config
in Scalar repositories.

Signed-off-by: Derrick Stolee <[email protected]>
Previously, the --path-walk option to 'git pack-objects' would compute
deltas inline with the path-walk logic. This would make the progress
indicator look like it was taking a long time to enumerate objects and
then very quickly computing deltas.

Instead of computing deltas on each region of objects organized by tree,
store a list of regions corresponding to these groups. These can later
be pulled from the list for delta compression before doing the "global"
delta search.

This presents a new progress indicator that can be used in tests to
verify that this stage is happening.

The current implementation is not integrated with threads, but that
could be done in a future update.

Since we do not attempt to sort objects by size until after exploring
all trees, we can remove the previous change to t5530 due to a different
error message appearing first.

Signed-off-by: Derrick Stolee <[email protected]>
Adapting the implementation of ll_find_deltas(), create a threaded
version of the --path-walk compression step in 'git pack-objects'.

This involves adding a 'regions' member to the thread_params struct,
allowing each thread to own a section of paths. We can simplify the way
jobs are split because there is no value in extending the batch based on
name-hash the way sections of the object entry array are attempted to be
grouped. We re-use the 'list_size' and 'remaining' items for the purpose
of borrowing work in progress from other "victim" threads when a thread
has finished its batch of work more quickly.

Using the Git repository as a test repo, the p5313 performance test
shows that the resulting size of the repo is the same, but the threaded
implementation gives gains of varying degrees depending on the number of
objects being packed. (This was tested on a 16-core machine.)

Test                                    HEAD~1    HEAD
-------------------------------------------------------------
5313.6: thin pack with --path-walk        0.01    0.01  +0.0%
5313.7: thin pack size with --path-walk    475     475  +0.0%
5313.12: big pack with --path-walk        1.99    1.87  -6.0%
5313.13: big pack size with --path-walk  14.4M   14.3M  -0.4%
5313.18: repack with --path-walk         98.14   41.46 -57.8%
5313.19: repack size with --path-walk   197.2M  197.3M  +0.0%

Signed-off-by: Derrick Stolee <[email protected]>
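The work-borrowing scheme described in the commit message above can be sketched roughly as follows. This is a single-threaded model under stated assumptions, not the code in 'git pack-objects': the `struct worker` type and the `split_regions()` and `steal_work()` helpers are hypothetical names, and only the `list_size` and `remaining` members mirror names mentioned in the commit message. Locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for per-thread bookkeeping.
 * Only 'list_size' and 'remaining' are named in the commit message.
 */
struct worker {
	size_t start;     /* index of the first region this worker owns */
	size_t list_size; /* number of regions assigned to this worker */
	size_t remaining; /* assigned regions not yet processed */
};

/*
 * Split nr_regions evenly; the first (nr_regions % nr_workers)
 * workers receive one extra region. No name-hash based batch
 * extension is attempted.
 */
static void split_regions(struct worker *w, size_t nr_workers,
			  size_t nr_regions)
{
	size_t start = 0;

	assert(nr_workers > 0);
	for (size_t i = 0; i < nr_workers; i++) {
		size_t n = nr_regions / nr_workers +
			   (i < nr_regions % nr_workers ? 1 : 0);
		w[i].start = start;
		w[i].list_size = n;
		w[i].remaining = n;
		start += n;
	}
}

/*
 * Called on behalf of an idle worker: pick the "victim" with the most
 * remaining regions and take the trailing half of them. Returns the
 * number of regions taken, or 0 if no worker has more than one left.
 */
static size_t steal_work(struct worker *w, size_t nr_workers, size_t idle)
{
	size_t victim = idle, best = 1, taken;

	for (size_t i = 0; i < nr_workers; i++) {
		if (i != idle && w[i].remaining > best) {
			best = w[i].remaining;
			victim = i;
		}
	}
	if (victim == idle)
		return 0;

	taken = w[victim].remaining / 2;
	w[victim].list_size -= taken;
	w[victim].remaining -= taken;
	/* The idle worker now owns the tail of the victim's range. */
	w[idle].start = w[victim].start + w[victim].list_size;
	w[idle].list_size = taken;
	w[idle].remaining = taken;
	return taken;
}
```

In a real threaded implementation the victim's counters would need to be read and updated under a mutex; the sketch only shows the arithmetic of handing over half of the unfinished regions.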
In anticipation of implementing 'git backfill', populate the necessary files
with the boilerplate of a new builtin.

RFC TODO: When preparing this for a full implementation, make sure it is
based on the newest standards introduced by [1].

[1] https://lore.kernel.org/git/[email protected]/T/#m606036ea2e75a6d6819d6b5c90e729643b0ff7f7
    [PATCH 1/3] builtin: add a repository parameter for builtin functions

Signed-off-by: Derrick Stolee <[email protected]>
The default behavior of 'git backfill' is to fetch all missing blobs that
are reachable from HEAD. Document and test this behavior.

The implementation is a very simple use of the path-walk API, initializing
the revision walk at HEAD to start the path-walk from all commits reachable
from HEAD. Ignore the object arrays that correspond to tree entries,
assuming that they are all present already.

Signed-off-by: Derrick Stolee <[email protected]>
Users may want to specify a minimum batch size for their needs. This is only
a minimum: the path-walk API provides a list of OIDs that correspond to the
same path, and thus it is optimal to allow delta compression across those
objects in a single server request.

We could consider limiting the request to have a maximum batch size in the
future.

Signed-off-by: Derrick Stolee <[email protected]>
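To illustrate why the batch size is only a minimum, here is a small sketch of the accumulation logic as described above. It is an assumption-laden model, not the 'git backfill' implementation: `struct batch_state`, `add_path()` and `flush_batch()` are hypothetical names, and a "request" is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulator; the names are illustrative, not Git's. */
struct batch_state {
	size_t min_batch_size; /* threshold, as given by --batch-size */
	size_t pending;        /* OIDs queued but not yet requested */
	size_t requests;       /* number of requests issued so far */
	size_t last_request;   /* number of OIDs in the latest request */
};

/* Issue one request for everything queued so far, if anything. */
static void flush_batch(struct batch_state *s)
{
	assert(s);
	if (!s->pending)
		return;
	s->requests++;
	s->last_request = s->pending;
	s->pending = 0;
}

/*
 * Queue all missing blobs found at one path. The OIDs of a path are
 * never split across requests, so a request can exceed the minimum.
 */
static void add_path(struct batch_state *s, size_t nr_missing)
{
	assert(s);
	s->pending += nr_missing;
	if (s->pending >= s->min_batch_size)
		flush_batch(s);
}
```

With a minimum of 10, paths contributing 4, 4 and 7 missing blobs would produce a single request of 15 objects in this model; a final `flush_batch()` call after the walk would send whatever is left over.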
One way to significantly reduce the cost of a Git clone and later fetches is
to use a blobless partial clone and combine that with a sparse-checkout that
reduces the paths that need to be populated in the working directory. Not
only does this reduce the cost of clones and fetches, but the sparse-checkout
also reduces the number of objects that must be downloaded from a promisor
remote.

However, history investigations can be expensive, as computing blob diffs will
trigger promisor remote requests for one object at a time. This can be
avoided by downloading the blobs needed for the given sparse-checkout using
'git backfill' and its new '--sparse' mode, at a time that the user is
willing to pay that extra cost.

Note that this is distinctly different from the '--filter=sparse:<oid>'
option, as this assumes that the partial clone has all reachable trees and
we are using client-side logic to avoid downloading blobs outside of the
sparse-checkout cone. This avoids the server-side cost of walking trees
while also achieving a similar goal. It also downloads in batches based on
similar path names, presenting a resumable download if things are
interrupted.

This augments the path-walk API to have a possibly-NULL 'pl' member that may
point to a 'struct pattern_list'. This could be more general than the
sparse-checkout definition at HEAD, but 'git backfill --sparse' is currently
the only consumer.

Be sure to test this in both cone mode and non-cone mode. Cone mode has the
benefit that the path-walk can skip certain paths once they would expand
beyond the sparse-checkout.

Signed-off-by: Derrick Stolee <[email protected]>
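The cone-mode shortcut mentioned above (skipping paths once they would expand beyond the sparse-checkout) can be pictured with a simple prefix test. The sketch below is a deliberate simplification and an assumption: it does not use 'struct pattern_list' or Git's pattern matching, `should_walk_dir()` and `is_same_or_under()` are hypothetical helpers, and it only considers directories (files at the root and in parent directories are not modeled).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if 'dir' equals 'base' or lies underneath it ("a/b" under "a"). */
static bool is_same_or_under(const char *dir, const char *base)
{
	size_t n = strlen(base);

	return !strncmp(dir, base, n) && (dir[n] == '\0' || dir[n] == '/');
}

/*
 * Decide whether a walk needs to descend into 'dir', given the
 * recursively included cone directories. A directory is interesting
 * if it is inside a cone, or if it is an ancestor leading to one.
 */
static bool should_walk_dir(const char *dir, const char *const *cones,
			    size_t nr)
{
	assert(dir);
	for (size_t i = 0; i < nr; i++) {
		if (is_same_or_under(dir, cones[i]))
			return true; /* inside the cone */
		if (is_same_or_under(cones[i], dir))
			return true; /* ancestor of the cone */
	}
	return false;
}
```

With cones "src/lib" and "docs", this model would descend into "src" and "src/lib/util" but prune "tests" and "src/libfoo" without looking at their contents. Non-cone patterns do not allow this kind of early pruning in general, which is why both modes deserve tests.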
dscho and others added 13 commits February 6, 2025 19:33
Signed-off-by: Johannes Schindelin <[email protected]>
This was pull request git-for-windows#1645 from ZCube/master

Support windows container.

Signed-off-by: Johannes Schindelin <[email protected]>
…ws#4527)

With this patch, Git for Windows works as intended on mounted APFS
volumes (where renaming read-only files would fail).

Signed-off-by: Johannes Schindelin <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
This patch introduces support to set special NTFS attributes that are
interpreted by the Windows Subsystem for Linux as file mode bits, UID
and GID.

Signed-off-by: Johannes Schindelin <[email protected]>
Handle Ctrl+C in Git Bash nicely

Signed-off-by: Johannes Schindelin <[email protected]>
A fix for calling `vim` in Windows Terminal caused a regression and was
reverted. We partially un-revert this, to get the fix again.

Signed-off-by: Johannes Schindelin <[email protected]>
This topic branch re-adds the deprecated --stdin/-z options to `git
reset`. Those patches were overridden by a different set of options in
the upstream Git project before we could propose `--stdin`.

We offered this in MinGit to applications that wanted a safer way to
pass lots of pathspecs to Git, and these applications will need to be
adjusted.

Instead of `--stdin`, `--pathspec-from-file=-` should be used, and
instead of `-z`, `--pathspec-file-nul`.

Signed-off-by: Johannes Schindelin <[email protected]>
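For applications migrating from `--stdin -z` to `--pathspec-from-file=- --pathspec-file-nul`, the input format is a sequence of pathspecs, each terminated by a NUL byte. A minimal sketch of producing that input is shown below; `write_nul_pathspecs()` is a hypothetical helper, and how the buffer is piped into `git reset` is left to the calling application.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Serialize pathspecs for
 *   git reset --pathspec-from-file=- --pathspec-file-nul
 * Each pathspec is followed by a NUL byte, so spaces and newlines in
 * file names need no quoting. Returns the number of bytes written, or
 * 0 if 'buf' is too small (or if there was nothing to write).
 */
static size_t write_nul_pathspecs(char *buf, size_t cap,
				  const char *const *paths, size_t nr)
{
	size_t used = 0;

	assert(buf);
	for (size_t i = 0; i < nr; i++) {
		size_t len = strlen(paths[i]) + 1; /* include the NUL */

		if (len > cap - used)
			return 0;
		memcpy(buf + used, paths[i], len);
		used += len;
	}
	return used;
}
```

The resulting bytes would be written to the standard input of the `git reset` process. Returning 0 for both "empty input" and "buffer too small" keeps the sketch short; a caller that needs to tell the two apart would want a richer return value.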
Originally introduced as `core.useBuiltinFSMonitor` in Git for Windows
and developed, improved and stabilized there, the built-in FSMonitor
only made it into upstream Git (after unnecessarily long hemming and
hawing and throwing overly perfectionist style review sticks into the
spokes) as `core.fsmonitor = true`.

In Git for Windows, with this topic branch, we re-introduce the
now-obsolete config setting, with warnings suggesting to existing users
how to switch to the new config setting, with the intention to
ultimately drop the patch at some stage.

Signed-off-by: Johannes Schindelin <[email protected]>
…updates

Start monitoring updates of Git for Windows' component in the open
Add a README.md for GitHub goodness.

Signed-off-by: Johannes Schindelin <[email protected]>
@dscho dscho added this to the Next release milestone Feb 10, 2025
@dscho dscho requested review from mjcheetham and rimrul February 10, 2025 08:10
@dscho dscho self-assigned this Feb 10, 2025
@dscho
Member Author

dscho commented Feb 10, 2025

Range-diff relative to main
  • 1: 501d8da < -: ----------- credential_format(): also encode [:]
  • 2: db58126 < -: ----------- credential: sanitize the user prompt
  • 3: 429023c < -: ----------- credential: disallow Carriage Returns in the protocol by default
  • 5: 20dfd7e = 1: 17ae787 sideband: mask control characters
  • 6: e6a6b9d = 2: 976299b sideband: introduce an "escape hatch" to allow control characters
  • 7: 656fe4e = 3: 65896a3 sideband: do allow ANSI color sequences by default
  • 4: ee1479b = 4: 63cab6a unix-socket: avoid leak when initialization fails
  • 8: acbcc27 = 5: 495ab70 test-lib: invert return value of check_test_results_san_file_empty
  • 9: 5996c9f = 6: 75dde95 test-lib: simplify lsan results check
  • 19: 9804f3a = 7: ec65acf test-lib: add a few comments to LSan log checking
  • 10: 6b442e6 = 8: ab2a205 bswap.h: squelch potential sparse -Wcast-truncate warnings
  • 11: 77568b2 = 9: 6abb17c object-file: fix race in object collision check
  • 12: 0e3ae59 = 10: 2dbdf2c packfile: factor out --pack_header argument parsing
  • 13: db47f3b = 11: 5e1d9aa object-file: rename variables in check_collision()
  • 14: 9992be6 = 12: 5285cdc parse_pack_header_option(): avoid unaligned memory writes
  • 15: c712ec9 = 13: cec3ebb object-file: don't special-case missing source file in collision check
  • 16: 2fe489d = 14: 16be54a index-pack, unpack-objects: use get_be32() for reading pack header
  • 20: c7eb1e3 = 15: a6aca32 object-file: retry linking file into place when occluding file vanishes
  • 21: 15af087 = 16: dae2c5e index-pack, unpack-objects: use skip_prefix to avoid magic number
  • 18: 5e1c7b1 = 17: 8ba42f1 reftable: write correct max_update_index to header
  • 23: f8daf7c = 18: 62a5707 refs: mark ref_transaction_update_reflog() as static
  • 17: c9bd22e = 19: 5a88867 object-name: fix resolution of object names containing curly braces
  • 25: f1ae60f = 20: 80d74ed refs: use 'uint64_t' for 'ref_update.index'
  • 22: 1b7dea1 = 21: 5aeb1e4 object-name: be more strict in parsing describe-like output
  • 28: 8a95fbc = 22: 2b08618 reftable: prevent 'update_index' changes after adding records
  • 24: ffcfbe7 = 23: 8849ac4 ref-filter: move ahead-behind bases into used_atom
  • 27: 64055b3 = 24: 2fbabb9 ref-filter: move is-base tip to used_atom
  • 30: 3e1a53d = 25: 135a432 ref-filter: remove ref_format_clear()
  • 31: 355849e = 26: c4c6f21 fetch set_head: fix non-mirror remotes in bare repositories
  • 32: 2bb96e4 = 27: 3e20693 show-index: the short help should say the command reads from its input
  • 33: 34a7cf4 = 28: f40199f credential-cache: respect authtype capability
  • 26: b710fe8 = 29: 299435c trace2: prevent segfault on config collection where no value specified
  • 29: 5b3ed30 = 30: a2e7060 grep: prevent ^$ false match at end of file
  • 34: 4573f9d = 31: 51354d1 update-ref: do set reflog's old_oid
  • 35: ca94787 = 32: df6fb8e ci: stop linking the prove cache
  • 36: 909223b = 33: a8339b0 contrib/buildsystems: drop support for building .vcproj/.vcxproj files
  • 37: caad529 = 34: e74c948 config.mak.uname: drop the vcxproj target
  • 38: d4b07ef = 35: f9b4cc2 ci: adjust Azure Pipeline for runs_on_pool
  • 43: ae895f5 = 36: 5bedadd gitk(Windows): avoid inadvertently calling executables in the worktree
  • 39: 591376e = 37: aa7251e t9350: point out that refs are not updated correctly
  • 40: 90e30d0 = 38: 479f8b6 transport-helper: add trailing --
  • 41: a338859 = 39: d0dea2b remote-helper: check helper status after import/export
  • 42: 9cbc716 = 40: 40d86da mingw: demonstrate a problem with certain absolute paths
  • 52: 99ffa42 = 41: 63af78b clean: do not traverse mount points
  • 44: 81406ce = 42: e693db1 Always auto-gc after calling a fast-import transport
  • 45: 5e12a91 = 43: b39b0a5 mingw: allow absolute paths without drive prefix
  • 46: 5f71962 = 44: 8526281 mingw: include the Python parts in the build
  • 47: 5c6c3a3 = 45: df7888f win32/pthread: avoid name clashes with winpthread
  • 48: 928fa43 = 46: eea7e46 git-compat-util: avoid redeclaring _DEFAULT_SOURCE
  • 49: bf284f1 = 47: eb1cd4d Import the source code of mimalloc v2.1.2
  • 54: b86e095 = 48: 1b1d970 clean: remove mount points when possible
  • 50: 3d296b0 = 49: 3952c00 mimalloc: adjust for building inside Git
  • 51: b5335a0 = 50: 27712a7 mimalloc: offer a build-time option to enable it
  • 53: 7402d65 = 51: 3bce96c mimalloc: use "weak" random seed when statically linked
  • 55: 1245671 = 52: fc7f8f8 mingw: use mimalloc
  • 56: 4007ccb = 53: d59c78d transport: optionally disable side-band-64k
  • 63: 02e80d9 = 54: 87aca61 mingw: do resolve symlinks in getcwd()
  • 57: dd5817d = 55: 317e4b4 mingw: ensure valid CTYPE
  • 58: 3111521 = 56: 78d0bec mingw: demonstrate a git add issue with NTFS junctions
  • 59: 59dbe7e = 57: 2b2dd03 mingw: allow git.exe to be used instead of the "Git wrapper"
  • 60: 318939f = 58: 84c2c79 strbuf_realpath(): use platform-dependent API if available
  • 61: da2a9df = 59: d704de7 mingw: ignore HOMEDRIVE/HOMEPATH if it points to Windows' system directory
  • 64: 5b56c83 = 60: ca6841c mingw: fix fatal error working on mapped network drives on Windows
  • 65: 54449ed = 61: 2e17892 clink.pl: fix MSVC compile script to handle libcurl-d.lib
  • 66: d1e51aa = 62: da40ce6 mingw: implement a platform-specific strbuf_realpath()
  • 168: 6d862c5 = 63: 9d1485e t5505/t5516: allow running without .git/branches/ in the templates
  • 169: 56b43f0 = 64: 49b452c t5505/t5516: fix white-space around redirectors
  • 62: 60064d9 = 65: 8cc8048 http: use new "best effort" strategy for Secure Channel revoke checking
  • 67: 7abe8d8 = 66: fd3c3fa t3701: verify that we can add lots of files interactively
  • 68: 024f162 = 67: 28d17c2 git add -i: handle CR/LF line endings in the interactive input
  • 69: 28cc597 = 68: 8c1e12c commit: accept "scissors" with CR/LF line endings
  • 70: 73bab9f = 69: 222b837 t0014: fix indentation
  • 71: b96f8cb = 70: 95c5172 git-gui: accommodate for intent-to-add files
  • 170: bab125e = 71: 77f2c6b clink.pl: fix libexpatd.lib link error when using MSVC
  • 171: e4f2a37 = 72: c1ef925 Makefile: clean up .ilk files when MSVC=1
  • 172: 369ff79 = 73: 4d15032 vcbuild: add support for compiling Windows resource files
  • 173: a7d9802 = 74: bbc45a0 config.mak.uname: add git.rc to MSVC builds
  • 174: a2e5383 = 75: f8e8a92 clink.pl: ignore no-stack-protector arg on MSVC=1 builds
  • 175: 4f8ede9 = 76: 0c71605 clink.pl: move default linker options for MSVC=1 builds
  • 176: f750c94 = 77: efcac16 cmake: install headless-git.
  • 72: 07946b3 = 78: 6213778 vcpkg_install: detect lack of Git
  • 73: 08ba0df = 79: 8809b24 vcpkg_install: add comment regarding slow network connections
  • 74: f17f4ef = 80: 460c3f4 vcbuild: install ARM64 dependencies when building ARM64 binaries
  • 75: 691f907 = 81: 3a442eb vcbuild: add an option to install individual 'features'
  • 76: e93200a = 82: 2dfcd30 cmake: allow building for Windows/ARM64
  • 77: 08e1c07 = 83: 1f64d1f ci(vs-build) also build Windows/ARM64 artifacts
  • 78: 8aa481c = 84: d95a8ea Add schannel to curl installation
  • 79: 9e21199 = 85: ccaf175 cmake(): allow setting HOST_CPU for cross-compilation
  • 84: e5f9ad2 = 86: 7358760 mingw: allow for longer paths in parse_interpreter()
  • 85: b4189eb = 87: bdd45c4 compat/vcbuild: document preferred way to build in Visual Studio
  • 86: 7c5bb0f = 88: 3d2196d http: optionally send SSL client certificate
  • 80: a8d96a3 = 89: af3af55 CMake: default Visual Studio generator has changed
  • 81: a472e30 = 90: ee51d5f .gitignore: add Visual Studio CMakeSetting.json file
  • 82: d5ffb93 = 91: 627aa8a subtree: update contrib/subtree test target
  • 83: d24fdbf = 92: cded29a CMakeLists: add default "x64-windows" arch for Visual Studio
  • 87: 3feff71 = 93: acaa42e ci: run contrib/subtree tests in CI builds
  • 88: 00932cf = 94: 75dc17d CMake: show Win32 and Generator_platform build-option values
  • 90: 6a91eff = 95: 93cb699 hash-object: demonstrate a >4GB/LLP64 problem
  • 91: 5bcb106 = 96: 375bfa7 write_object_file_literally(): use size_t
  • 92: c060c2d = 97: afc175f object-file.c: use size_t for header lengths
  • 93: a8879af = 98: 5ce17ed hash algorithms: use size_t for section lengths
  • 94: ccdf745 = 99: 347983c hash-object --stdin: verify that it works with >4GB/LLP64
  • 95: 6707108 = 100: 1c6a355 hash-object: add another >4GB/LLP64 test case
  • 89: 17c4a8b = 101: f036563 init: do parse all core.* settings early
  • 98: 4425c87 = 102: 72c5430 hash-object: add a >4GB/LLP64 test case using filtered input
  • 96: 1377a2c = 103: b1d2fea setup: properly use "%(prefix)/" when in WSL
  • 97: d62d906 = 104: 1864fbc Add config option windows.appendAtomically
  • 99: 2e1d946 = 105: fa5ad7e compat/mingw.c: do not warn when failing to get owner
  • 100: 70c3929 = 106: 144cf12 mingw: $env:TERM="xterm-256color" for newer OSes
  • 101: 6f4bfa5 = 107: 6a242b2 winansi: check result and Buffer before using Name
  • 107: 5a1c50c = 108: 671491d mingw: change core.fsyncObjectFiles = 1 by default
  • 105: b1777bc = 109: 754b5f5 bswap.h: add support for built-in bswap functions
  • 102: 528f46c = 110: 4e87e48 MinGW: link as terminal server aware
  • 108: b5e754e = 111: 01595ac Fix Windows version resources
  • 109: f7887d0 = 112: c611af2 config.mak.uname: add support for clangarm64
  • 110: 002d5a3 = 113: 02c73f4 status: fix for old-style submodules with commondir
  • 111: 8f8f5fd = 114: 3c2cb2f windows: skip linking git-<command> for built-ins
  • 103: 08bd02f = 115: 622a259 http: optionally load libcurl lazily
  • 104: 3336704 = 116: 6ffa94a http: support lazy-loading libcurl also on Windows
  • 106: 0a6fa31 = 117: 7047ac4 http: when loading libcurl lazily, allow for multiple SSL backends
  • 112: 86e62a2 = 118: fd3a261 windows: fix Repository>Explore Working Copy
  • 113: 62ca977 = 119: e40203b mingw: do load libcurl dynamically by default
  • 114: 6792709 = 120: bcdcf4d Add a GitHub workflow to verify that Git/Scalar work in Nano Server
  • 115: e21e749 = 121: 85f3301 mingw: suggest windows.appendAtomically in more cases
  • 116: c2f3767 = 122: 22b0373 win32: use native ANSI sequence processing, if possible
  • 177: 151973c = 123: 39af65a git.rc: include winuser.h
  • 117: f929b83 = 124: 12bc2c6 common-main.c: fflush stdout buffer upon exit
  • 118: 63f9ec0 = 125: 0829b69 t5601/t7406(mingw): do run tests with symlink support
  • 119: 0a0e906 = 126: f5776d0 ci: work around a problem with HTTP/2 vs libcurl v8.10.0
  • 120: 791493f = 127: f2b2cf1 pack-objects: add --full-name-hash option
  • 121: c07ed6f = 128: 3c6b870 repack: test --full-name-hash option
  • 122: ce4b382 = 129: f2f2ce2 pack-objects: add GIT_TEST_FULL_NAME_HASH
  • 129: 246d787 = 130: a87eeab win32: ensure that localtime_r() is declared even in i686 builds
  • 130: 5f3f500 = 131: c9730ea Fallback to AppData if XDG_CONFIG_HOME is unset
  • 131: 6a83fdb = 132: 251ca71 run-command: be helpful with Git LFS fails on Windows 7
  • 123: c15ff02 = 133: 728cf13 git-repack: update usage to match docs
  • 124: 1b2b309 = 134: 48557e9 p5313: add size comparison test
  • 125: da926aa = 135: 6eff58d test-tool: add helper for name-hash values
  • 126: 1e525a9 = 136: a7d1bbd repack/pack-objects: mark --full-name-hash as experimental
  • 127: a7818ea = 137: b8aac65 path-walk: introduce an object walk by path
  • 128: 8bbf649 = 138: 2f90b90 t6601: add helper for testing path-walk API
  • 132: 54f5c77 = 139: 4e3019e path-walk: allow consumer to specify object types
  • 133: fe02e1c = 140: 13d31b4 path-walk: allow visiting tags
  • 134: a472302 = 141: a3e36a1 revision: create mark_trees_uninteresting_dense()
  • 135: 230f831 = 142: 86dbff7 path-walk: add prune_all_uninteresting option
  • 136: af14811 = 143: 812b057 pack-objects: extract should_attempt_deltas()
  • 137: c896542 = 144: 6b5c2b8 pack-objects: add --path-walk option
  • 138: 090ce15 = 145: 4a512b0 pack-objects: introduce GIT_TEST_PACK_PATH_WALK
  • 139: f9db153 = 146: 4332687 repack: add --path-walk option
  • 140: 40d30d4 = 147: 9350e54 pack-objects: enable --path-walk via config
  • 141: 872da37 = 148: a96c73c scalar: enable path-walk during push via config
  • 142: d61dc09 = 149: 3626d3e pack-objects: refactor path-walk delta phase
  • 143: dfe1c28 = 150: 41019de pack-objects: thread the path-based compression
  • 144: f2c9df5 = 151: 250b5d4 path-walk API: avoid adding a root tree more than once
  • 145: a20f34f = 152: 4497a07 backfill: add builtin boilerplate
  • 146: fd8c081 = 153: 70d026e backfill: basic functionality and tests
  • 147: ff48017 = 154: cebd23f backfill: add --batch-size= option
  • 148: 043bddc = 155: 8a25c02 backfill: add --sparse option
  • 149: f03bb33 = 156: d2e24b2 backfill: assume --sparse when sparse-checkout is enabled
  • 150: fcf53bc = 157: 2390b61 backfill: mark it as experimental
  • 151: e6469c2 = 158: 9744533 survey: stub in new experimental 'git-survey' command
  • 152: 51e1146 = 159: bf2a46e survey: add command line opts to select references
  • 153: 2385069 = 160: e35bec6 survey: start pretty printing data in table form
  • 154: f4f9119 = 161: 26894cb survey: add object count summary
  • 155: 55889b1 = 162: 8051e6d survey: summarize total sizes by object type
  • 156: 0e29509 = 163: c44834d survey: show progress during object walk
  • 157: d4521ef = 164: 6da4046 survey: add ability to track prioritized lists
  • 158: c37830b = 165: db1579e survey: add report of "largest" paths
  • 159: 315d7eb = 166: aebcfa8 survey: add --top= option and config
  • 160: 2a8b254 = 167: 7c054b0 mingw: make sure errno is set correctly when socket operations fail
  • 161: f64b9ec = 168: 52f76dc compat/mingw: handle WSA errors in strerror
  • 162: c41c120 = 169: 0d035a7 compat/mingw: drop outdated comment
  • 163: 010312b = 170: 6717843 survey: clearly note the experimental nature in the output
  • 164: ba08b68 = 171: eff2587 t0301: actually test credential-cache on Windows
  • 165: 4c0af9a = 172: cd5965a path-walk: improve path-walk speed with many tags
  • 166: eced1ad = 173: 8ee08bf credential-cache: handle ECONNREFUSED gracefully
  • 167: e35b86e = 174: c4df497 mingw_open_existing: handle directories better
  • 178: fb8be89 = 175: 12580ca git-gui: provide question helper for retry fallback on Windows
  • 179: ba06011 = 176: 68306f7 git gui: set GIT_ASKPASS=git-gui--askpass if not set yet
  • 182: 64c9ec5 = 177: a4fadd2 gitk: Unicode file name support
  • 180: edea57c = 178: 0762dff git-gui--askyesno: fix funny text wrapping
  • 183: c56c670 = 179: 81d53b3 gitk: Use an external icon file on Windows
  • 181: 81b289c = 180: 950b7cf git-gui--askyesno: allow overriding the window title
  • 184: cf75911 = 181: f94f9c8 gitk: fix arrow keys in input fields with Tcl/Tk >= 8.6
  • 185: 2dbda5e = 182: 148264a git-gui--askyesno (mingw): use Git for Windows' icon, if available
  • 186: 25691cf = 183: d23b686 gitk: make the "list references" default window width wider
  • 187: 1af9931 = 184: cc76fdd Win32: make FILETIME conversion functions public
  • 188: 05a1e42 = 185: 24c8f28 Win32: dirent.c: Move opendir down
  • 189: c3fb594 = 186: 4449938 mingw: make the dirent implementation pluggable
  • 190: 5228a5a = 187: 4f2ebf7 Win32: make the lstat implementation pluggable
  • 191: 2d40e26 = 188: 8c66243 mingw: add infrastructure for read-only file system level caches
  • 192: f5bdcd6 = 189: 30cac28 mingw: add a cache below mingw's lstat and dirent implementations
  • 193: 896234e = 190: 3a776d0 fscache: load directories only once
  • 194: fb6acb5 = 191: d5161fe fscache: add key for GIT_TRACE_FSCACHE
  • 195: cead8ed = 192: 4260904 fscache: remember not-found directories
  • 196: 82d49ea = 193: bea2594 fscache: add a test for the dir-not-found optimization
  • 197: d1e5640 = 194: 63d1d53 add: use preload-index and fscache for performance
  • 198: 6ff1f4a = 195: aa0805a dir.c: make add_excludes aware of fscache during status
  • 199: ea168fa = 196: 73cdb83 fscache: make fscache_enabled() public
  • 200: 40a2c9c = 197: 625bdd5 dir.c: regression fix for add_excludes with fscache
  • 201: 8cb06a0 = 198: e19c972 fetch-pack.c: enable fscache for stats under .git/objects
  • 202: a3e60fc = 199: bdc5178 checkout.c: enable fscache for checkout again
  • 203: 7fec933 = 200: bf647ba Enable the filesystem cache (fscache) in refresh_index().
  • 204: 7e2ba4c = 201: 0b2caca fscache: use FindFirstFileExW to avoid retrieving the short name
  • 205: e25cb52 = 202: 6c48958 fscache: add GIT_TEST_FSCACHE support
  • 206: 2d5e882 = 203: 3487aef fscache: add fscache hit statistics
  • 207: 407ff9c = 204: 03eb3a2 unpack-trees: enable fscache for sparse-checkout
  • 208: 69869e4 = 205: 4c2a010 status: disable and free fscache at the end of the status command
  • 209: 72e48c0 = 206: 6f28abe mem_pool: add GIT_TRACE_MEMPOOL support
  • 210: 4b0d5ac = 207: a755ea6 fscache: fscache takes an initial size
  • 211: 5dd9d75 = 208: dba12a8 fscache: update fscache to be thread specific instead of global
  • 212: 1796c56 = 209: f2863c0 fscache: teach fscache to use mempool
  • 213: f6c33bb = 210: b67da53 fscache: make fscache_enable() thread safe
  • 214: e7c4a5e = 211: 5eb43ec fscache: teach fscache to use NtQueryDirectoryFile
  • 215: eaf317c = 212: 62e0392 fscache: remember the reparse tag for each entry
  • 216: 835c030 = 213: 2fed2bd fscache: implement an FSCache-aware is_mount_point()
  • 217: 8549752 = 214: e94f1e0 clean: make use of FSCache
  • 218: 75370fe = 215: f02bd51 pack-objects (mingw): demonstrate a segmentation fault with large deltas
  • 219: abeb314 = 216: 1bd9bd5 mingw: support long paths
  • 220: c630875 = 217: 3b1a192 Win32: fix 'lstat("dir/")' with long paths
  • 221: a77aff7 = 218: ef27b39 win32(long path support): leave drive-less absolute paths intact
  • 222: 9662251 = 219: ff2c288 compat/fsmonitor/fsm-*-win32: support long paths
  • 223: 5f4ad29 = 220: f5ddc17 clean: suggest using core.longPaths if paths are too long to remove
  • 224: f10eef6 = 221: b57894a mingw: Support git_terminal_prompt with more terminals
  • 225: dd24530 = 222: 64f6c72 compat/terminal.c: only use the Windows console if bash 'read -r' fails
  • 226: 3d70745 = 223: 245d20a mingw (git_terminal_prompt): do fall back to CONIN$/CONOUT$ method
  • 227: 2f57147 = 224: 4eff09c strbuf_readlink: don't call readlink twice if hint is the exact link size
  • 228: e7dc2a9 = 225: 1e3c782 strbuf_readlink: support link targets that exceed PATH_MAX
  • 229: 12f5b49 = 226: 664cdf0 lockfile.c: use is_dir_sep() instead of hardcoded '/' checks
  • 230: 6a58f12 = 227: c552ccd Win32: don't call GetFileAttributes twice in mingw_lstat()
  • 231: 14010b3 = 228: afe2828 Win32: implement stat() with symlink support
  • 232: 9c44fef = 229: 6129d20 Win32: remove separate do_lstat() function
  • 233: 20280fc = 230: ac4d9d0 Win32: let mingw_lstat() error early upon problems with reparse points
  • 234: 6c68e87 = 231: fce9988 mingw: teach fscache and dirent about symlinks
  • 235: 451ce8b = 232: f322681 Win32: lstat(): return adequate stat.st_size for symlinks
  • 236: a0fc905 = 233: 7af4c9f Win32: factor out retry logic
  • 237: 7425615 = 234: d1ff8c1 Win32: change default of 'core.symlinks' to false
  • 238: aa45a28 = 235: 8acb180 Win32: add symlink-specific error codes
  • 239: 261adda = 236: e6cff8b Win32: mingw_unlink: support symlinks to directories
  • 240: 9271297 = 237: ea84594 Win32: mingw_rename: support renaming symlinks
  • 241: b6ed5ef = 238: e23c903 Win32: mingw_chdir: change to symlink-resolved directory
  • 242: 4c62c07 = 239: 9029be3 Win32: implement readlink()
  • 243: ed85d85 = 240: 3b2f9e7 mingw: lstat: compute correct size for symlinks
  • 244: d08749f = 241: 5e8cc14 Win32: implement basic symlink() functionality (file symlinks only)
  • 245: 105c8bd = 242: 514775f Win32: symlink: add support for symlinks to directories
  • 246: 8178825 = 243: 6c085c3 mingw: try to create symlinks without elevated permissions
  • 247: ba8af20 = 244: b40add0 mingw: emulate stat() a little more faithfully
  • 248: 3a956d4 = 245: 30bf673 mingw: special-case index entries for symlinks with buggy size
  • 249: 5cc5918 = 246: 179e314 mingw: introduce code to detect whether we're inside a Windows container
  • 250: 38a0688 = 247: 72bc74a mingw: when running in a Windows container, try to rename() harder
  • 251: d365524 = 248: b49ffe2 mingw: move the file_attr_to_st_mode() function definition
  • 252: 7108f3d = 249: 569793d mingw: Windows Docker volumes are not symbolic links
  • 253: dacdc89 = 250: 0e3fcf1 Win32: symlink: move phantom symlink creation to a separate function
  • 254: d54a5e5 = 251: 252c80e mingw: work around rename() failing on a read-only file
  • 255: ce11b80 = 252: c5910b3 Introduce helper to create symlinks that knows about index_state
  • 256: 51df38b = 253: 89091af mingw: allow to specify the symlink type in .gitattributes
  • 257: 3bc5590 = 254: a4beccd Win32: symlink: add test for symlink attribute
  • 258: 72be863 = 255: 8d73280 mingw: explicitly specify with which cmd to prefix the cmdline
  • 259: ea5db30 = 256: dcc37df mingw: when path_lookup() failed, try BusyBox
  • 260: 85f59a8 = 257: 8ed82c2 test-lib: avoid unnecessary Perl invocation
  • 261: b4de07f = 258: 6d0fcf9 test-tool: learn to act as a drop-in replacement for iconv
  • 262: 8dc54c4 = 259: 27a8014 tests(mingw): if iconv is unavailable, use test-helper --iconv
  • 263: 3bc06dd = 260: e7ffe9d gitattributes: mark .png files as binary
  • 264: 90a1d88 = 261: e947f64 tests: move test PNGs into t/lib-diff/
  • 265: 0f23a77 = 262: 8bf35a1 tests: only override sort & find if there are usable ones in /usr/bin/
  • 266: e05a988 = 263: 854ecfa tests: use the correct path separator with BusyBox
  • 267: e692539 = 264: e1c2b75 mingw: only use Bash-ism builtin pwd -W when available
  • 268: 8cced00 = 265: fc344a5 tests (mingw): remove Bash-specific pwd option
  • 269: 3f278c9 = 266: 45b0a20 test-lib: add BUSYBOX prerequisite
  • 270: f215ec6 = 267: 46632be t5003: use binary file from t/lib-diff/
  • 271: 4dcd754 = 268: 9916ddb t5532: workaround for BusyBox on Windows
  • 272: bb9fed5 = 269: 8572e61 t5605: special-case hardlink test for BusyBox-w32
  • 273: 8373342 = 270: 6776e2b t5813: allow for $PWD to be a Windows path
  • 274: 170bd46 = 271: 400718b t9200: skip tests when $PWD contains a colon
  • 275: 446eaca = 272: 27a08a9 mingw: add a Makefile target to copy test artifacts
  • 277: 05a213a = 273: 104929a mingw: optionally enable wsl compability file mode bits
  • 276: f135bb7 = 274: 8a1c124 mingw: kill child processes in a gentler way
  • 279: 316d0e3 = 275: 27e7db4 mingw: do not call xutftowcs_path in mingw_mktemp
  • 278: 61710c8 = 276: 0a3b2f8 mingw: really handle SIGINT
  • 280: 6b42f97 = 277: dfea28f Partially un-revert "editor: save and reset terminal after calling EDITOR"
  • 281: f9ed972 = 278: b0303bc reset: reinstate support for the deprecated --stdin option
  • 283: a8f263c = 279: eca0f8c Describe Git for Windows' architecture [no ci]
  • 284: 28bc9ec = 280: 1cf1df4 Modify the Code of Conduct for Git for Windows
  • 285: 6da2bef = 281: 0d30207 CONTRIBUTING.md: add guide for first-time contributors
  • 288: c78bc7d = 282: b7e1428 Add a GitHub workflow to monitor component updates
  • 286: c716e76 = 283: d8f3a6a README.md: Add a Windows-specific preamble
  • 282: 68a7fda = 284: 1e8d48d fsmonitor: reintroduce core.useBuiltinFSMonitor
  • 290: ff3e380 = 285: a556355 dependabot: help keeping GitHub Actions versions up to date
  • 287: 7ee0b52 = 286: 5f99cfa Add an issue template
  • 289: cdeff1c = 287: 4b262bd Modify the GitHub Pull Request template (to reflect Git for Windows)
  • 291: f9e4f82 = 288: 2363214 SECURITY.md: document Git for Windows' policies

This part is expected:

  • 1: 501d8da < -: ----------- credential_format(): also encode [:]
  • 2: db58126 < -: ----------- credential: sanitize the user prompt
  • 3: 429023c < -: ----------- credential: disallow Carriage Returns in the protocol by default

It is expected because Git for Windows had to tag v2.47.1.windows.2 about four weeks before the Git project was ready to tag v2.47.2 (using identical fixes, though, apart from the fix for CVE-2024-52005).

@dscho dscho linked an issue Feb 10, 2025 that may be closed by this pull request
@dscho
Member Author

dscho commented Feb 10, 2025

Heads-up: on hold because of an impending cURL bug-fix release. New tentative release date: Feb 13, 2025.

@dscho
Member Author

dscho commented Feb 13, 2025

/git-artifacts

The tag-git workflow run was started

The git-artifacts-x86_64 workflow run was started.
The git-artifacts-i686 workflow run was started.
The git-artifacts-aarch64 workflow run was started.

@dscho
Member Author

dscho commented Feb 13, 2025

I validated the x86_64 installer manually. Now we're only waiting for the Windows/ARM64 artifacts.

@dscho
Member Author

dscho commented Feb 13, 2025

/release

The release-git workflow run was started

@gitforwindowshelper gitforwindowshelper bot merged commit 2bd190b into git-for-windows:main Feb 13, 2025
71 of 72 checks passed
@dscho dscho deleted the rebase-to-v2.48.1 branch February 13, 2025 12:51
dscho added a commit to git-for-windows/gfw-helper-github-app that referenced this pull request Feb 13, 2025
The logic was _exactly_ inverted. What it _did_ want to verify was that
the commit for which the snapshot was built is reachable from `main`.
What it verified instead was that `main`'s tip commit is reachable from
said commit. 🤦

I noticed this only today, when a successful `git-artifacts` run in the
v2.48.1 PR at git-for-windows/git#5411 tried to
upload a snapshot, and the (correct) ahead/behind by logic in the
`upload-artifacts` workflow failed (but then succeeded when I re-ran the
workflow after releasing Git for Windows v2.48.1).

Let's invert the logic so that it does what it is supposed to do.

Signed-off-by: Johannes Schindelin <[email protected]>
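The inverted check described in the commit message above is easy to get wrong because both directions look plausible. The toy model below shows the difference under loud assumptions: history is linear (one parent per commit, no merges), commits are small integers, and `is_reachable()`, `snapshot_is_on_main()` and `inverted_check()` are hypothetical names that do not appear in gfw-helper-github-app, which relies on ahead/behind information instead.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_PARENT (-1)

/*
 * Toy history: parent[i] is the single parent of commit i, or
 * NO_PARENT for the root. Returns true if 'target' is 'from' itself
 * or one of its ancestors.
 */
static bool is_reachable(const int *parent, int from, int target)
{
	assert(parent);
	for (int c = from; c != NO_PARENT; c = parent[c])
		if (c == target)
			return true;
	return false;
}

/* Intended check: the snapshot's commit is part of main's history. */
static bool snapshot_is_on_main(const int *parent, int main_tip, int commit)
{
	return is_reachable(parent, main_tip, commit);
}

/* Inverted check: asks whether main's tip is in the commit's history. */
static bool inverted_check(const int *parent, int main_tip, int commit)
{
	return is_reachable(parent, commit, main_tip);
}
```

For a history 0 <- 1 <- 2 with `main` at commit 2 and an unmerged PR head 3 on top of it, the intended check rejects commit 3 while the inverted check accepts it; for the older commit 1 the results are the other way around.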
Development

Successfully merging this pull request may close these issues.

[New git version] v2.48.1