Delayed process filters suppress +x in gix clone
#1783
Labels
acknowledged
an issue is accepted as shortcoming to be fixed
gix clone
#1783
Current behavior 😯
gix clone
checks out a file that should have executable permissions but fails to set those permissions, when all of the following are true:.gitattributes
file that applies an attribute to the file that should be executable.gix clone
, such as the command scope with-c
, that enables a smudge filter for that attribute.can-delay=1
"packet" to the calling process.Each of those elements is required, to trigger the bug. If the attribute is not applied to a particular file (even if it is applied to other files and the smudge filter is configured), or if the smudge filter is not configured (regardless of attributes), or if the smudge filter is not a process filter, or if the smudge filter is a process filter but it does not support delays, then the file that should be executable is still checked out as executable.
Relationship to CVE-2025-22620
This is conceptually related to GHSA-fqmf-w4xh-33rh (RUSTSEC-2025-0001), and I found it while researching that. But this is a separate bug. I do not believe this bug is a security vulnerability. Quite the opposite: this bug is the main reason
gix clone
itself appears not to be affected by GHSA-fqmf-w4xh-33rh even whengitoxide
is built against a vulnerable version ofgix-worktree-state
.I've confirmed that this bug occurs both before and after the fix for GHSA-fqmf-w4xh-33rh in #1764, by testing with both gitoxide 0.40.0 (using a build from before gix-worktree-state 0.17.0 was released) and gitoxide 0.41.0. I would also not expect the changes in #1764 to have fixed or otherwise affected this, because that causes
finalize_entry
to increase permissions less than they were increased before (i.e., chmod is called equally or, in the TOCTOU case, slightly less often, and the bits that are flipped from 0 to 1 are a subset of those that were before).What seems to be happening
When a long running smudge filter is used for a file during checkout, and it allows delays,
gix-worktree-state
defers setting+x
permissions, planning to set them after the file has been rewritten by the filter, rather than before the filter operates. Butfinalize_entry
does not always set+x
.Specifically, it looks like, if a file is checked out with
destination_is_initially_empty: true
in thecheckout::Options
but a process smudge filter with delays applies to that file, setting+x
is deferred because a filter is to be used, but then not applied, due todestination_is_initially_empty
. However, I am not certain that it always prevents that from happening, in part because all of my testing on thisgix-worktree-state
bug has been withgix clone
(other than withgit clone
to compare).Concrete example
If a repository tracks files
a
,b
, andc
, wherea
andb
have an attribute that subjects them to a long-running smudge filter with delays enabled, and whereb
andc
are tracked as executable, then the only filegix clone
checks out as executable isc
. After checkout:See the "Expected behavior" section for a comparison, and the "Steps to reproduce" section for the detailed testing methodology.
Expected behavior 🤔
All files, even those to which a long running smudge filter with delays is applied, should be checked out with executable permissions corresponding to what the repository tracks (provided that the operating system and filesystem support Unix-style executable permissions, and no other configuration option prevents them).
Of the concrete example
In the above example, both
b
andc
should end up executable so the file modes on disk are consistent with respect the presence or absence of+x
with their modes in the tree object, and we should see:Design considerations
I think it is not immediately clear what the best way is to bring about this change.
If
destination_is_initially_empty: true
always produces this effect in a checkout with delayed process filters (including in comparable checkouts performed when callinggix_worktree_state::checkout
from an application other than the gitoxidegix
executable), then one possible view of this bug is that it is a combination of:destination_is_initially_empty
is not made clear.gitoxide-core
, where checkouts usedestination_is_initially_empty
even when long-running smudge filters with delays enabled apply to at least one file.However, I find that view unsatisfying, for five reasons, where each except (1) points the way to another possible approach:
The full performance and semantics (including collision detection) of an exclusive checkout, which is brought about by
destination_is_initially_empty: true
, are valued forgix clone
.An option called
destination_is_initially_empty
should, if possible, do the right thing when the destination is initially empty. The top-levelcheckout
function is the only public one, and it takes abool
fordestination_is_initially_empty
in its options, which it passes down to a helper that checks out an individual file.But if
destination_is_initially_empty
needs to be set differently for different files depending on what filters are set up, then that option should accept more than two values, or no longer be passed directly to the helper, or both. This would go well beyond updating documentation ingix-worktree-state
and modifyinggitoxide-core
. I don't favor this, because it seems like a design change that is broader than (3) below and likely more complicated, while not addressing the considerations of (4) or (5).The current intended behavior of long running smudge filters with delays enabled--as I understand it--is to cause
finalize_entry
to run after the filter is known to be done with the file, even whendestination_is_initially_empty: true
. That is, it looks like long running smudge filters with delays are the one intended exception todestination_is_initially_empty: true
eagerly callingfinalize_entry
.If so, making that happen could be a fairly narrowly scoped fix for this bug. This would require no change to
gitoxide-core
and no major change to the documentation, though minor refinement of the documentation could accompany it. Further improvements related to points (4) and (5) below could come later. This would have exacerbated GHSA-fqmf-w4xh-33rh, but now that it is fixed, I am not aware of any strong reason to avoid this approach.Do long running smudge filters with delays actually require waiting to set executable permissions? There are three reasons I can think of to wait, but none seem very strong:
To avoid marking files executable before they are smudged. For example, to avoid marking stubs for Git LFS executable, when only the file they stand for would actually be reasonable to execute. The problem with this is that creating stubs without executable permissions and adding them later would only address that in the case that the filter is actually enabled. If it is not enabled in configuration variables, then the tracked
+x
is applied eagerly. So this would only be keeping+x
away during checkouts, and not in the also-common case that the filter is not applied at all.In case the effect of the smudge filter is ever to replace the file with a new one rather than writing to the existing file. But if I understand the recommendations in gitattributes(5) for filters, smudge and clean filters are not supposed to do that kind of thing, and it is not guaranteed to work if they do. The filter reads from stdin and writes to stdout:
The filter can decide what to do based on the path but should not read from it (and this seems like it applies to writing as well):
The filter should provide the updated contents on stdout, and this is the case even for process filters, which use the packetline protocol to do so:
With delays, the path is more important, since that is how a specific blob is requested, but the filter still obtains its contents and provides its updated contents in a way that does not involve reading and writing that path:
It also seems to me that, if a filter does replace the file, then it should be responsible for doing so with the correct mode, and that its ability to do so would be enhanced rather than diminished by setting the intended permissions before the filter operates rather than after.
In case the filter changes the permissions to something wrong. But that is just a broken filter.
More broadly, if a filter does change the mode, or even if the filter replaces the file, I am not sure we should try to countermand that even partially, since it could be attempting to smudge in filesystem metadata that Git cannot track but that is supported on the system. We could try to add
+x
on top of that, but it might be hard to do safely, since adding+x
is not always safe.For example, as noted in the "Impact" section of GHSA-fqmf-w4xh-33rh, setuid and setgid bits were among those cleared when setting 777 permissions. Clearing those bits when
+x
is important, especially for setgid bits, which have a different meaning on some systems when no executable bits are set, pertaining to mandatory file locking (so they may be set without intending to cause anyone to be allowed to run the file with a changed effective group ID). To avoid setting 777 permissions except in the very rare case it is warranted, the fix for GHSA-fqmf-w4xh-33rh in #1764 does examine the existing mode and preserve parts of it, but it takes care still to clear those bits.That specific case of concern is covered by the existing logic, but others may not be. If a smudge filter changes file type, mode, or access through a mechanism other than mode-based permissions--such as by a mechanism
setfacl
orchattr
--it seems to me that that's more of a reason to have performed any of our ownchmod
operations earlier, rather than later.(#1764 also adds logic to avoid
+x
in a TOCTOU typechange, which provides some protection against this in the case of replacing the file with an entry of a different type. But this is probably a slight further reason to set+x
earlier so that, if the file is replaced with one of a different type that meaningfully supports+x
permissions, they can be preserved.)Does any scenario ever actually require waiting to set executable permissions? Currently we always do this unless
destination_is_initially_empty: true
. The reason seems to be to avoid changing permissions of a file that, if an error or cancellation occurs, may not have its contents replaced. But:If that is to be avoided, then developers using
gix-worktree-state
should be cautioned in documentation about the current effect of keeping bothdestination_is_initially_empty
andoverwrite_existing
at their default values offalse
in a checkout. This does not write to the file, but does then set its permissions. (This is mentioned in GHSA-fqmf-w4xh-33rh as relevant to impact, but it is not part of that vulnerability and #1764 does not change it.)It seems like this could be avoided by treating a mode change more like a type change with respect to how a file is checked out, and instead of reusing the existing file, deleting (unlinking) it and replacing it with a new one, or doing so unless their contents are the same.
Examining
git checkout --force
withstrace
, it looks like it always does this, even if only metadata and not contents are changed. It looks like Git either primarily or exclusively modifies files in a force checkout by replacing them. It seems like that may solve various problems if done here, too. (Specific known cases where it makes sense to reuse a file could still be admitted, for example if the type and mode are the same and the link count is 1.)If this is done as the primary strategy, with existing files not reused or only reused as an optimization in simple situations with no mode changes, then that would also eliminate the bit-manipulation from #1764 and instead always do something that is automatically correct because permissions are always set through
open
calls that automatically respect the umask. This would also arguably behave more the way a user wants, in the case that the user sets an unusual umask in which execute bits are masked differently from their corresponding read bits.One aspect of the separation of writing to a file and setting its permissions is that, for some reason, we are setting the permissions using the path, even though we have an open file descriptor.
gitoxide/gix-worktree-state/src/checkout/entry.rs
Lines 285 to 292 in 8df0db2
We have a
File
object, from which we can presumably perform any operation that is valid on an open file descriptor. The immediately subsequent code is:gitoxide/gix-worktree-state/src/checkout/entry.rs
Lines 293 to 296 in 8df0db2
File permissions on a Unix-like system can be set through an open file descriptor, without using the path. (If an OS is not Unix-like then we are not setting
+x
anyway.) POSIX standardizesfchmod
for this purpose.I believe
gix_index::fs
, used there, already depends onrustix
wherecfg!(unix)
. One way to callfchmod
is throughrustix::fs::fchmod
. (It does not support file descriptors opened with theO_PATH
flag, but I believe that is inapplicable here, because that's only useful when the file contents are not to be accessed in any way, whereas in contrast the file descriptor associated with ourFile
object represents a file that is open in the full traditional sense and that has, presumably, just been written to.)Besides the possibility that this would mitigate some remaining TOCTOU behavior if changed within the
finalize_entry
function itself, I wonder if this would make it more acceptable to set the executable permissions earlier iffchmod
were used at the time the file is first opened instead. Of course, if we just replace the file in all but the most trivial situations, as described at the end of (ii), then this is inapplicable because we would never need to do anything likechmod
orfchmod
.Git behavior
In the example with
a
,b
, andc
, withgit clone
, executable permissions are set on bothb
andc
, rather than being omitted forb
:This is the expected behavior. It also matches the behavior of
gix clone
when the preconditions for the bug, listed at the top of the "Current behavior" section above, do not apply.As noted above in "Expected behavior" in 5(ii), Git makes greater use of deleting and replacing files.
Steps to reproduce 🕹
I used the
arrow.rs
filter example ingix-filter
, invoked as a long-running filter, as the smudge filter for producing the bug ingix clone
as well as for verifying that it does not occur ingit clone
. The test setup can be obtained from thisfilter-experiment
repository. The readme there includes instructions for running the experiment after cloning it, as well as more details on some aspects of the experiment.However, I think it valuable for this issue to be self-contained. Furthermore, the script itself is descriptively valuable for clarifying the circumstances under which the bug occurs, even if one does not run it.
Setup
If not cloning
filter-experiment
, first create a directory alongside (i.e. as a sibling of) thegitoxide
repository working tree. Enter the new directory and create an executable filerun-experiment
with the contents:Between separate runs of that script,
rm -rf remote-repo local-clone
should be run to remove the repositories it creates and uses for the test. (It will not proceed if they are already present.)Results
Running
./run-experiment git
testsgit clone
, showing that bothb
andc
are marked executable. This demonstrates the Git behavior, and is the expected behavior in the absence of a bug.Running
./run-experiment gix
testsgix clone
, showing that onlyc
and notb
is marked executable. This demonstrates the current behavior.I did these tests on Arch Linux, using QuickInstall builds of:
gitoxide
0.40.0, withgix-worktree-state
0.16.0, from before GHSA-fqmf-w4xh-33rh was fixed, andgitoxide
0.41.0, withgix-worktree-state
0.17.0, which has the #1764 fix for GHSA-fqmf-w4xh-33rh.gix clone
showed the same results with both. (As detailed above in "Current behavior," under "Relationship to CVE-2025-22620," I would not expect them to be different.)It doesn't happen without delays
The
arrow.rs
filter enables delays, and reportscan-delay
to the caller, unlessdisallow-delay
is passed. To confirm that a long running smudge filter without delays will not produce the bug, I redid the experiment with this change to therun-experiment
script:The bug no longer occurred then.
gix clone
produced the same result asgit clone
.It does happen with
required
As described above, the
filter.<driver>.required
was not used, even though, as described in #1781, thegix-filter
tests that usearrow.rs
do setrequired
totrue
. This was originally because I wasn't familiar withrequired
, even though both Git and gitoxide recognize it.To confirm that this does not change the result of the experiment, I also reran the experiment with this change to the
run-experiment
script (though the effect could also be achieved by passing-c filter.arrow-example.required=true
as arguments to./run-experiment
):"$gitlike_cmd" "$@" -c filter.arrow-example.process="$filter_cmd" \ + -c filter.arrow-example.required=true \ clone "file://$(realpath remote-repo)" local-clone
As expected, this does not make a difference. To be clear, that does not prove it never makes a difference, since I don't think the filter ever actually failed. (That is, this does not explore what happens when a smudge filter fails, it only shows that neither
required
nor its absence are preconditions for the bug.)Does the "schedule" matter?
The experiment uses few files: four, including
.gitattributes
; but only two to which the filter is applied. Delay behavior might differ with more files, or other different conditions. I suspect that does not make a difference, but I don't know it doesn't, and the experiment here doesn't speak to that.How much do
gitoxide-core
specifics matter?Because this is only testing gitoxide through
gix clone
commands, it does not distinguish behavior that is due to the details of how code ingitoxide-core
callsgix_worktree_state::checkout
from behavior that is due to code ingix_worktree_state
itself.The text was updated successfully, but these errors were encountered: