-
Notifications
You must be signed in to change notification settings - Fork 20.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
geth console special character/encoding problem #16286
Comments
Geth uses https://github.com/peterh/liner under the hood. I'm curious if how the little app here https://github.com/peterh/liner#getting-started would behave on your system. |
Hey @holiman, thanks for your fast reply and the hint about liner. I tested the code from https://github.com/peterh/liner#getting-started now. I compiled it with GOOS=windows GOARCH=amd64 go build -o peterh_liner.exe peterh_liner.go This test shows that liner alone is not affected by this problem. Maybe geth is using some special modes etc, but the little app does allow posting special characters. As you can see the keyboard layout in the screenshot above is my default one which failed with geth. With the little liner app I have no problems pasting special chars |
Thanks for checking! |
I decided that this problem is significant and risky enough (it could mean a lot of trouble for many users, because I guess a lot of users also use geth on windows with non-international keyboard layouts) so that I dedicated quite a lot of time today to troubleshoot this problem. After seeing that also the newly compiled geth binaries fails @ encoding, I inserted the example code from liner: https://github.com/peterh/liner#getting-started into the geth source code at a very early stage where geth is still initializing (I think I inserted it after this line go-ethereum/console/prompter.go Line 82 in 1488fda
The outcome: a total surprise! even if I inserted the code at the beginning of geth it still didn't work. no special characters pasted/allowed. My idea now was to compile the little liner standalone app (https://github.com/peterh/liner#getting-started) without any geth code, just to make sure that it still works (I compiled it within the go-ethereum directory. on windows directly). Again a surprise: now the little liner app also didn't work anymore! I tried to see if the old liner exe file I compiled (see previous posts) still worked, or if it was some terminal/keyboard language problem. I moved to the other geth directory and run it. Surprise again: it worked perfectly! Now I of course realized pretty quickly what was going on, geth ships its own source files of the peterh liner app see: https://github.com/ethereum/go-ethereum/tree/master/vendor/github.com/peterh/liner Funny coincidence that I was going through this changing-directory-decides-if-it-fails-or-succeeds mess, right? Anyway, putting the newer source code of liner (from github.com/peterh/liner) within the vendor/github.com/peterh/liner/ directory of geth made the geth binary working (with special character support) again for me! Success I also took some time, just for fun now, to locate the commit that fixed the problem in liner (this is the first version that works for me): (it's currently the latest commit in liner and the commit message talks about fixing problems with surrogate pairs etc. It's actually funny that this problem was fixed just 21 days ago in the liner repository. I think we should use the latest versions of liner if no problems are detected within geth during testing with the newest liner version) There is one big problem now that you shouldn't ignore. There are potentially many accounts now out there that can only be unlocked with specific versions of geth, i.e. those that ignore the special characters etc. The problem is similar to this one #2371. You can't just fix the problem, but you also need to make sure that the old accounts still can be unlocked. One possibility would be to test unlocking also with the "stripped special character" version of the password if the typed password (with special chars) did not work. I think this needs to be done on each and every supported platform even if the problem seems to be windows-specific, because it could happen that a user created the file with geth on windows and later on tries to unlock it with geth on linux/mac. I would also suggest to show some warnings and hints for instance when geth detects that the password did only work with the modified password and also that geth tells the user (it should explain why) it also tries with a modification because of this bug etc! I hope you appreciate this contribution / troubleshooting. It was actually a lot of messing around and trying to figure out wth is going on with this liner lib. If also just some people have less problems with passwords/encoding in the future because of this "research"/fixing it was definitely worth it. I even tried to test the encryption/decryption with password recovery tools just to make sure that with the newer liner version also the whole set of bytes of the password are used by the encryption algorithm as we would expect (UTF 8 encoded password with all special characters)... cracking the UTC-- files worked with all my tests now if the files were produced with the modified geth that was compiled with the liner fix! BTW: I just found out that this problem was actually already mentioned (and people were struggling with it) in other issues (a little bit hidden therefore I didn't find this post beforehand) and I think the root cause is exactly the same (namely the old liner code!): see ethereum/mist#3176 (comment) What we can learn from this: it often makes sense to update the included/imported libs, because there could be a lot of problems that were fixed with newer versions of them. Can you guys please suggest how you plan to properly fix this? liner update + code which also tries the modified password (at least for several future versions of geth) cc: @karalabe, @holiman |
Thanks for the investigation! We do update the 'vendored' libraries from time to time -- and I guess updating 'liner' would be good. IIUC, the TLDR is "Copy-pasting geth console in windows may have dropped certain special characters". Also, if I'm reading it right, this does not concern accounts that have been created through Mist. |
yeah exactly (but it also happens if you typed it of course, it's not a clipboard problem only) |
are there any updates here ? I find the thought quite disturbing that dozens (maybe much, much more) of users could potentially create accounts with geth each and every day that are generated with a different password compared to the password the user set. This is horrible and this should be addressed pretty quickly. As already mentioned, this problem also leads to other problems like the user can't use this keystore file with 3rd party tools like mycrypto/myetherwallet (because they don't mangle the password input in the same way the geth-liner-combination does) and you also can't use mist UI etc if the key was generated with geth beforehand (because the password, when we look at this specific bug, only gets screwed-up by the geth-liner combo). I think there is absolutely no plausible reason to wait until even more (hundreds/thousands?) of keystore files are generated "incorrectly" (with wrong password)... I reported the problem about 3 months ago and it seems the users noticed some strange behaviour even much, much earlier (though nobody seemed to identify the main cause before I opened this issue). Why risk that even more keystore files are incorrectly generated ? This should be escalated and handled very responsible and by considering all aspects of the changes (like I already mentioned: what if the keystore file was generated with the mangled password, but in upcomming versions geth uses the "correct" password input? both versions of the password - mangled/raw - must be tested, on each and every platform). You might have more numbers/stats available than I have, but my guess is that there are a lot of windows geth downloads and also a lot of users that do not use ASCII characters at all (their keyboard is non-english, uses non-latin characters in general !!!) and/or a lot of users try to set a difficult password with special characters/umlauts etc... all these cases lead to a INCORRECT keystore file, you know about the problems details since about 3 months and did nothing against it! This is a horrible fact and I find it very careless of you to not address problems like this with highest priority and uttermost diligence. I mean ... the exact problem was already carefully analyzed and reported and even the reason/fix was already proposed by me (i.e. update liner + test for mangled password) ... the only thing missing is the code that checks both versions of the password to not risk that all old/affected keystore files are completely useless (and to not risk that not only they don't work with 3th party software, but also do not work anymore with newer versions of geth... if the fix is not handling these special cases/keystore files). I think somebody needs to escalate this problem if nothing happens within the next few days/weeks. Maybe we should at least let the ethereum twitter and reddit community ask what they think of this bug and how this was (not) addressed within a timely manner. Finally, I just want to make sure that nobody gets this/me wrong: I'm not trying to blame somebody for this problem or accuse somebody of something... but thinking about hundreds/thousands of keystore files that are completely wrongly generated and the devs ignoring it for months makes me a little bit alerted and frustrated. I see no reason why devs do this to their users (maybe they just forget/ignored the issue, but I think it's horrible enough that it can't be simply ignored !). |
Gotcha, we're on it. It's a very peculiar circumstance where someone would get bitten by this, and the wallet could always be recovered again using the same setup as where it was first set, so as the path forward was a bit unclear, it slipped under the pile of things to do. We could add a 'canonicalizer', if a password fails and the 'canonical' version of the password is different from the password, we can attempt that too. |
Though, to be clear, I don't think all non-ascii characters are affected, but |
I've asked @kurkomisi try to repro this on a similar system. He will post the specifics, it seems we'll need to investigate further about what exactly is causing this issue, because while UTF-16 surrogate-pairs are dropped, characters such as |
I'm not sure what you mean by it works with Ó . The screencast above clearly shows that there is a problem also with that specific character on the system(s) I tested with. |
Yeah I didn't mean there was no issue, just saying that we failed to reproduce it on a windows 10 with geth 1.7.2 and US (Eng) keyboard. So we'll likely need to dig further. |
So, regarding the case you linked above, where a user thought he had used As I've understood this bug report (and I'm asking for confirmation here because we haven't been able to reproduce it).
Could you please, on your system, with the old |
Hey @holiman, first of all, thanks for your effort in troubleshooting this. I think the FusRoDah vs FußRoDah example is a little bit confusing indeed. As far as I understand, the user said that 0xAccountA with password FusRoDah (which is a complete independent account) does work without problems, while the account 0xAccountB with password FußRoDah does not work (but again account 0xAccountA and 0xAccountB are not related at all, you can not make any conclusion if either of them works, because there is no dependency, not even when we look at encoding or character codes etc). I think that the user just mentioned both cases, because as far as I know within German writings you sometimes (need to) replace the Eszett (sharp s) character, because some documents/forms do not allow special characters (in these specific cases I think ß is replaced by two s or something like this), but this is not really something we need to worry about here. ... but I also need to admit that now reading the post again carefully, the reporter said that it never worked for them, not even without the sharp s character. I didn't yet test the esszett example myself, I will of course do it as soon as possible because this might be somehow related here (or also completely independent and maybe the other post is not related with this issue at all. we will see). I agree that the solution might be easy, but you still need to somehow deal with old keystore files. I'm not sure why you can't reproduce it. I do not think that it has something to do with 0x7f, but with characters that are at least > 0xff or > 0xffff (utf-16 for instance?, I'm not yet sure what the first character is that is ignored by older versions of liner, but I will test this too). Stay tuned... update: update 2: I just noticed that there might be other (very little older) liner fixes too that might be important for this issue: see peterh/liner@e1deee0 o.O btw: I'm still trying to see what ranges of characters work and which do not, but it is kind of non-intuitive to me... it's not really a threshold code value in the extended ASCII/utf-8 character space... it works for some special character, then there are some that do not work and then it start working for some codes again and it fails again etc... I was thinking if there is somehow a different code page or character encoding involved that might explain this behaviour (btw: my cmd says that it is using 437 (OEM - United States), not sure if that matters... but it is still kind of "random" to me which character work and which do not) update 3: while trying to debug this problem, it turned out that this "barrier" is the main trigger of the issue I described above i.e. my particular test cases: https://github.com/peterh/liner/blob/e1deee0acfb34066ae65e7daa79265f8533c5546/input_windows.go#L197-L199 btw: my debuggings show that the section of code that is run for my specific test cases is always this one: peterh/liner@161ea9c#diff-8ea1b116174305efd1aa198c64cb38cfR210 i.e. it is "just" a normal ke.Char code (for instance for Ó the value is 211), but without keyDown set and with ke.VirtualKeyCode == vk_menu (therefore in this particular case it is no surrogate or anything like that and also the case surrogate > 0 code sections etc are not used with the Ó etc characters). therefore my conclusion after this quite intensive troubleshooting and testing session is that depending on the keyboard layout there could be some characters (in my tests always with character codes > 0x7f) that are send via the ALT+unicode method instead of the "normal" keydown event. Depending on the keyboard language selected, there might be some codes that are > 0x7f that do not use the ALT method (common characters for that specific language?). |
This must be a joke. Are you serious ? Why was this issue closed ? We've discussed in depth within the above posts that you can't just update the dependencies and do as if the wrong-password bug with special characters was never a thing and that this bug never existed. This is similar to the bug here #2371 where the fix was also completely wrong, if you look at the user experience... That old fix for instance (also) made it even worse, because now the users can't even test with the old password (with wrongly added newline characters in that specific case). Why are people even wondering why threads like this with user claiming that geth/mist set the "wrong password" ethereum/mist#3513 are that long and very confusing/weird if you are that reluctant to fix the problem properly and address all points that were raised above ? This is not how you fix issues, you should know better. You are risking to lock people out of their wallets and making it even more annoying to understand what the real password (versus the incorrectly used password set by geth) is. cc @holiman , @evertonfraga |
We decided that the right approach would be to update the liner to behave correctly going forward. That said, it's true that there might be corner-cases where people are using a wallet with a 'corrupt' password. However, this bug has nothing whatesoever to do with anyone who has set password via Mist: this concerns only this case:
So it's probably better to use the correct liner, and maybe provide an additional binary, e.g. Suggestion: The
The user could then copy-paste his password, and see what the corresponding hexdata is. He could then store that (binary) data into a password file, and use for unlock. Such a binary could also help us discover more information about this corruption, and eventually add the required rules/manglings to adjust for this quirk internally within geth/clef. How does that approach sound? |
I mean that would be a first start, but it doesn't really solve the problem and is also not really user friendly and very practical. The main problem is that there might be a considerable large amount of windows users affected by this problem that for instance used password managers to generate the password with special characters etc and that won't even think about trying to test their password anytime soon ("hodlers" ?). They might try to unlock their account in a couple of months/years and won't get any hint why the password is not working anymore (yeah, I saw that you are planning to add a line within the changelog about this, but who reads the changelog anyway ?) There must be at least some kind of warning if the password does not work within the console itself and when the raw input/password meets the pre-requisit for this bug: windows user/special ALT characters |
Please reopen this github issue. The problem is not fixed. The patch made the problem even worse for the users affected. |
System information
Geth version:
1.7.2
OS & Version: Windows 10
Commit hash : release version downloaded by Mist (1db4ecd)
Expected behaviour
Copy pasting text into the geth console should leave the copied text unmodified and should not strip any characters.
Actual behaviour
Special characters (non-ascii) will be ignored/stripped by the geth console (it seems that this has to do with certain keyboard languages, but it works outside of the geth console) and this could result that passwords are set incorrectly. Even unlocking accounts with passwords including special characters might fail even if it shouldn't, see also ethereum/mist#3513 (comment) (search for "Let's talk about problem number 2:" within that post to fast forward to problem number 2 within that post)
Steps to reproduce the behaviour
Note: I want to emphasize that the copy pasting only fails to work within geth, other tools and the same terminal (cmd) are able to get the whole input including special characters
Confirmation video (also see screenshots from ethereum/mist#3513 (comment)); this is a screen record to show what is happening:
Note: before I started geth.exe the copy pasting of all characters worked (even with my default English US keyboard layout), but it failed within the geth console... it suddently started working when changing to the English United States International keyboard layout... it seems to be geth related because it works outside geth with each and every keyboard layout.
It is needless to say that this same behaviour can be observed with any other geth commands and also with the password prompts (which is very dangerous! and the main reason that I've discovered this problem, see ethereum/mist#3513) etc. I decided to paste the character as a command (or let's say directly within the geth console, without using the password prompt etc) because it makes it more clear what is going on because this way we can see which characters are actually pasted and which ones are ignored
What is the effect of this problem:
.... etc
CC: @evertonfraga, @karalabe, @holiman
The text was updated successfully, but these errors were encountered: