Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checking for quotes around non-ascii usernames passed by Windows #208

Merged
merged 2 commits into from
Sep 29, 2020

Conversation

daladim
Copy link
Contributor

@daladim daladim commented Jun 8, 2020

This pull request fixes issue #168

When adding a network drive from the Explorer menu, Windows calls sshfs-win with the username as the 4th argument. When the username has non-ascii characters (e.g. Jérôme), Windows encodes it at UTF-8 and surrounds this username with quotes (thus, the 4th argument becomes "Domain\Jérôme"). When sshfs-win tries to turn this username-and-quotes into a uid/gid, it (expectedly) fails.
In the end, it does not correctly populates the permissions of the files and folders and makes the network disk read-only.

Coming from the Linux world, I am not accustomed enough to Windows yet to decide whether this behaviour is expected, and whether sshfs-win should cope with this, or whether Windows itself (or WinFsp?) should be told not to add quotes around a non-acii parameter.
In case it was sshfs-win's duty to check it, I have done it in this PR, which fixes the issue I had opened a few weeks ago.

@billziss-gh
Copy link
Collaborator

Looking at this now. Sorry for the very long delay.

@billziss-gh
Copy link
Collaborator

@daladim if you are still here, I am trying to understand why you would get double quotes for the 4th argument.

It is true that the WinFsp Launcher, which is the service that actually launches sshfs-win, passes the username in the 4th argument as a quoted string. This happens regardless of whether the username has UTF-8 characters or not. The expectation is that the command line parser that is embedded in sshfs-win.exe will parse the command line (thus stripping any extraneous double quotes) and then presents the correct arguments to main (the argv array).

There is a difference between how command line arguments are passed between Windows and UNIX:

  • On UNIX command line arguments are passed as an array of strings. Any necessary command line parsing is done by the shell (or other parent program), which then passes the string array to exec(3).
  • On Windows command line arguments are passed as a single string. Any necessary command line parsing is done by the receiving executable.

Now different Windows programs may handle command line parsing differently. In practice command line parsing happens in the program startup code and since most Windows programs are linked with the Microsoft runtime, the command line parsing is mostly consistent across Windows programs.

However sshfs-win.c is a Cygwin program, which has its own command line parsing mechanism (I believe). Perhaps this is why you are seeing those double quotes. I will try to dig a bit more into this.

@billziss-gh
Copy link
Collaborator

Ok, I can confirm some definite weirdness in the command line parsing of Cygwin programs. The C program below has been compiled with Cygwin gcc to produce cyg.exe and Microsoft cl to produce win.exe.

#include <stdio.h>

const char *GetCommandLineA(void);

int main(int argc, char *argv[])
{
    const char *s = GetCommandLineA();
    printf("C=%s\n", s);

    for (int i = 0; argc > i; i++)
        printf("%d=%s\n", i, argv[i]);

    return 0;
}

The Cygwin cyg.exe appears to incorrectly include the double quotes when the argument contains non-ASCII characters:

C:\Users\billziss\Projects\t>cyg.exe "foo bar"
C=cyg.exe  "foo bar"
0=cyg
1=foo bar

C:\Users\billziss\Projects\t>cyg.exe "Domain\Jérôme"
C=cyg.exe  "Domain\J□r□me"
0=cyg
1="Domain\Jérôme"

The Windows win.exe appears to correctly remove the double quote in all cases:

C:\Users\billziss\Projects\t>win.exe "foo bar"
C=win.exe  "foo bar"
0=win.exe
1=foo bar

C:\Users\billziss\Projects\t>win.exe "Domain\Jérôme"
C=win.exe  "Domain\JΘr⌠me"
0=win.exe
1=Domain\JΘr⌠me

So your patch is valid and works around a real problem in Cygwin's command line parsing. I will go ahead and submit a review for it next. Thanks!

@billziss-gh
Copy link
Collaborator

billziss-gh commented Sep 24, 2020

I have reviewed your patch. Your patch is correct in the general case, but it does not handle the case where a string starts with a double quote, but does not end with one.

I have slightly modified your patch to handle such cases. See below:

diff --git a/sshfs-win.c b/sshfs-win.c
index 9871e9c..83b5c29 100644
--- a/sshfs-win.c
+++ b/sshfs-win.c
@@ -192,8 +192,18 @@ static int do_svc(int argc, char *argv[])
         }
         else
         {
-            /* translate backslash to '+' */
             locuser = argv[3];
+            /* the Cygwin runtime does not remove double quotes around args with non-ASCII chars */
+            if (locuser[0] == '"')
+            {
+                size_t len = strlen(locuser);
+                if (len >= 2 && locuser[len - 1] == '"')
+                {
+                    locuser[len - 1] = '\0';    /* remove the trailing quote... */
+                    ++locuser;                  /* ...and the heading one       */
+                }
+            }
+            /* translate backslash to '+' */
             for (p = locuser; *p; p++)
                 if ('\\' == *p)
                 {

Please let me know what you think.

EDIT: Although I do not believe it is necessary, an alternative approach might be to remove any double quote characters found inside locuser as they are not valid characters for Windows user names anyway. We could do this in the loop that translates \ to +.

@daladim
Copy link
Contributor Author

daladim commented Sep 29, 2020

Thanks for all this research! That's very interesting to know.

Have you opened an issue (or a mail on whatever mailing list) for Cygwin? Do you want me to do it?

Your version looks good to me. Do you need me to test it tonight (or as soon I have time to?), or were you able to test it on your PC?

Thanks

@daladim
Copy link
Contributor Author

daladim commented Sep 29, 2020

I have just tested your version, which (as expected) works correctly on my machine.
I have just pushed your changes on this merge request, which looks mergeable now.
Thanks!

@billziss-gh billziss-gh merged commit 5bca969 into winfsp:master Sep 29, 2020
@billziss-gh
Copy link
Collaborator

@daladim I have merged this PR in. Thank you!

Have you opened an issue (or a mail on whatever mailing list) for Cygwin? Do you want me to do it?

Feel free to report this to them if you have the time. It is quite likely that the Cygwin folks will have a good rationale for this behavior.

@daladim
Copy link
Contributor Author

daladim commented Oct 2, 2020

I've left a message on their mailing list. In case you'd like to follow it, here it is: https://cygwin.com/pipermail/cygwin/2020-October/246445.html

@billziss-gh
Copy link
Collaborator

Thanks, I will monitor this thread.

@billziss-gh
Copy link
Collaborator

@daladim I am no longer subscribed to the Cygwin mailing list and cannot respond directly. (Is there an interface which I can use to send email to Cygwin without being subscribed?)

It seems to me that the list is missing the important point about the double quote characters that should NOT be there regardless of how the é and ô characters are being interpreted. (As evidence of this: the Cygwin command line parser was able to break the command line into arguments correctly, but chose to retain the double quotes.)

The choice of GetCommandLineA was for illustration purposes; had I used GetCommandLineW I would not be able to printf using %ls under CMD.EXE, because of code page issues. However here is a modified version of the test program that uses GetCommandLineW.

#include <stdio.h>

wchar_t *GetCommandLineW(void);

int main(int argc, char *argv[])
{
    wchar_t *s = GetCommandLineW();

    for (wchar_t *p = s; *p; p++)
        printf("%04x %c%s",
            *p,
            32 <= *p && *p < 127 ? *p : '.',
            (p - s) % 8 + 1 != 8 ? "   " : "\n");
    printf("\n");

    for (int i = 0; argc > i; i++)
        printf("%d=%s\n", i, argv[i]);

    return 0;
}

I compiled this program under Cygwin to produce cyg.exe and ran it under Cygwin and CMD.EXE.

Cygwin run:

billziss@xps:~/Projects/t$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=
billziss@xps:~/Projects/t$ ./cyg.exe "foo bar" "Domain\Jérôme"
0022 "   0043 C   003a :   005c \   0055 U   0073 s   0065 e   0072 r
0073 s   005c \   0062 b   0069 i   006c l   006c l   007a z   0069 i
0073 s   0073 s   005c \   0050 P   0072 r   006f o   006a j   0065 e
0063 c   0074 t   0073 s   005c \   0074 t   005c \   0063 c   0079 y
0067 g   002e .   0065 e   0078 x   0065 e   0022 "
0=./cyg
1=foo bar
2=Domain\Jérôme

CMD.EXE run:

C:\Users\billziss\Projects\t>\Windows\System32\chcp.com
Active code page: 437

C:\Users\billziss\Projects\t>cyg.exe "foo bar" "Domain\Jérôme"
0063 c   0079 y   0067 g   002e .   0065 e   0078 x   0065 e   0020
0020     0022 "   0066 f   006f o   006f o   0020     0062 b   0061 a
0072 r   0022 "   0020     0022 "   0044 D   006f o   006d m   0061 a
0069 i   006e n   005c \   004a J   00e9 .   0072 r   00f4 .   006d m
0065 e   0022 "
0=cyg
1=foo bar
2="Domain\Jérôme"

@daladim
Copy link
Contributor Author

daladim commented Oct 14, 2020

Thanks for your feedback. I also feel puzzled about the replies suggesting that everything works as expected.
I don't have as much time as I would like to be more helpful. For the time being, I have forwarded it into the mailing list :D

I guess you can reply to it without subscribing, but that would require your mail client to be able to reply to arbitrary messages (setting the In-Reply-To header with arbitrary message-ids), and I have no idea how easy it is. I am myself too lazy to do it properly (and to learn to use advanced mail clients), so I try to bend GMail the best I can to somehow suit this mailing-list format...

Thanks for your help

@billziss-gh
Copy link
Collaborator

I think you have gone above and beyond reporting this problem to them. If they choose to justify the behavior and ignore the problem that's on them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants