An old value of MSYS
with unpaired surrogates is silently replaced for fixtures
#1574
Labels
MSYS
with unpaired surrogates is silently replaced for fixtures
#1574
Current behavior 😯
In
gix-testtools
, when test fixture scripts are run, theconfigure_command
function customizes their environment. Since 0899c2e (#1444), this includes setting theMSYS
variable to make it so that commands in shell scripts such asln -s
will really attempt to create symlinks rather than copies.If
MSYS
was already set, the options held in its old value are kept, with the new option appended at the end so it will only suppress an option that conflicts with it, otherwise leaving the effects of preceding options in place.But this does not happen if the old value of
MSYS
contained an unpaired surrogate code point. This is becauseenv::var
rather thanenv::var_os
is used, so theunwrap_or_default
method returns an empty string both whenMSYS
was absent and when it was present with a value that could not be represented as well-formed UTF-8.https://github.com/Byron/gitoxide/blob/9bddeeb948397df3d1e391ec48597cc34d10acd0/tests/tools/src/lib.rs#L596-L597
https://github.com/Byron/gitoxide/blob/9bddeeb948397df3d1e391ec48597cc34d10acd0/tests/tools/src/lib.rs#L605
There are at least three scenarios where
MSYS
may contain an unpaired surrogate:error_start:
contains one. This seems unlikely, except as a special case of...MSYS
contains an unpaired surrogate.The impact of this bug is therefore small. However, fixing it would help clarify the intended semantics of how
gix-testtools
treats theMSYS
environment variable when running test fixture scripts. (This may potentially be relevant to some changes made for #1578.)Expected behavior 🤔
Due to the nature of
gix-testtools
as a testing facility, both as part ofgitoxide
's test suite and when the published crate is used, I think any of these behaviors are fine and should be considered to fix the bug:Always panic when the previous value of
MSYS
contained an unpaired surrogate, whether or not this is documented.Always accept and append to the previous value of
MSYS
when it contains an unpaired surrogate, such as by usingenv::var_os
and callingpush
(rather thanenv::var
and callingpush_str
), whether or not this is documented.If this is done, then
unwrap_or_default
could still be used, because it would only be falling back to the default empty value when the environment variable was absent.Most other behaviors, including the current behavior, if it is documented.
While any of those would, in my opinion, satisfy users' expectations and eliminate the bug as such, I think the best behavior would be to do (1) or (2). Specifically:
nativesymlinks
to a preexisting value ofMSYS
that ends in a high surrogate is effective at causing symlinks to be created, then (2) is preferred.I plan to investigate this and open a PR that makes one of those changes. (This investigation has been slightly slowed by the problem described in rust-lang/libz-sys#215, but I should be able to complete it.)
Update: I have opened #1580, which changes it to use
OsString
instead ofString
and to append withOsString::push
. As noted there, I verified that a value ofMSYS
produced this way does still have the desired effect on symlink creation.Git behavior
Not directly applicable.
Steps to reproduce 🕹
The nice way to reproduce this would be to make a fixture that displays the value, or to add a tracing statement, or to use a debugger.
Instead, I just added a
panic!
to print out whateverMSYS
value will be used:Then I ran a test that, while conceptually unrelated to this issue, exercises that code, ensuring the
panic!
is hit. The following output was warnings induced by making that change, about unused variables and unreachable code, removed. The full output can be seen in this gist.First, a control, with well-formed Unicode:
A now, when the existing value contains an unpaired high surrogate:
Therefore, normally it is able to keep content from the old value:
But when it cannot be converted to UTF-8, its silently discards it:
(The reason I added the
z
after the unpaired surrogate was to show a situation where it is not at the very end, which is the more common situation. The problem is not limited to when it is at the very end; rather, whether appending ought to be done, rather than panicking, should, I think, be determined by whether program built against the MSYS DLL are able to heed the appendedwinsymlinks
option even when it the space that delimits it follows an unpaired high surrogate.)The text was updated successfully, but these errors were encountered: