Properly deal with console codepages on windows #6795
Comments
I don't think we support any encodings other than UTF8. |
However, UTF-8 is NOT a natively supported encoding for the Windows console. You should add a conversion that turns code-page-dependent strings (or WCHARs, if you use the Unicode version of the API to read from the console) into UTF-8. |
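The conversion described above can be sketched as follows. This is a minimal illustration in Python, not what Julia or libuv actually do: the helper names `console_bytes_to_utf8` and `wide_chars_to_utf8` are hypothetical, and it assumes the console's code page is known and has a matching Python codec (for example `cp936`).

```python
def console_bytes_to_utf8(raw: bytes, codepage: str) -> bytes:
    """Re-encode bytes read in a console code page as UTF-8.

    `codepage` is a Python codec name such as "cp936" or "cp1252"
    (hypothetical parameter; a real program would query the console).
    """
    text = raw.decode(codepage)      # code-page bytes -> Unicode string
    return text.encode("utf-8")      # Unicode string -> UTF-8 bytes


def wide_chars_to_utf8(raw: bytes) -> bytes:
    """Re-encode a WCHAR buffer (UTF-16LE on Windows) as UTF-8."""
    return raw.decode("utf-16-le").encode("utf-8")


if __name__ == "__main__":
    # Bytes a CP936 console would deliver for the two-character test string.
    cp936_input = b"\xb2\xe2\xca\xd4"
    print(console_bytes_to_utf8(cp936_input, "cp936"))
```

Either path ends in the same UTF-8 bytes; the difference is only whether the input arrived as code-page bytes or as wide characters from the Unicode API.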
Libuv should do that for us. |
@loladiro I can confirm that pasting the above characters into the repl, they do not survive the round-trip. |
This feels embarrassing: I can no longer reproduce the problem. The only known change was installing the German language. However, it appears that the Windows command prompt only supports the current codepage (which means we can't display these characters anyway). Switching the codepage sometimes seemed to make this work, even showing the right characters. (Edit: apparently switching the codepage to 65001 once permanently fixes the issue for mintty, but only worked the first time for cmd, where all later attempts resulted in crashes, even though this setting is transient for the window.) |
Not sure whether this is the same issue, but I'm seeing only a subset of the new LaTeX-to-Unicode substitutions show up properly in cmd. |
if you enter |
crashes the first time in my case |
@stevengj I am already using Conemu. And Node.js can handle codepages properly EVEN IN DEFAULT CONSOLE. |
We use the same backend as node, so we have the same support for the console code pages. However, the issue is that the console only works in the current code page; it doesn't fully support Unicode. Switching the code page to UTF-8 nearly works, but for some reason it also causes writing Unicode characters to return an error. Conemu is better, but that doesn't say anything. +1 for shipping mintty (which node doesn't support :), as soon as someone figures out why the new REPL waits for another character press, after getting a newline, before processing input. |
Conemu also claims to be Unicode-aware and support UTF-8. Although there are reports that Conemu doesn't properly handle combining characters. Any reasonable free-software console that we can ship would be fine with me. Frankly, even if we get Unicode output working in the default console, we should still ship a better console with Julia. When users double-click on the Julia program in windows, by default it should pop up a window that doesn't suck. Defaulting to the Windows console is shooting ourselves in the foot. |
Mintty has the advantage of being under 100k (compressed), whereas Conemu is around 2MB compressed. Oh, but the Mintty number above does not include the msys library, which is another 800k. |
@be5invis, I'm not suggesting that we prevent people from using other consoles. Just that we ship Mintty (or Conemu) and that it runs by default instead of the standard Windows console when you double-click Julia. You will still be able to type |
Whoops, accidentally deleted @be5invis's comment, sorry. |
@stevengj Sorry as well for clicking the "Close and Comment" button by accident. I tried something different: By pasting |
@be5invis, does Julia with |
@stevengj Under cp65001, pasting still works. I cannot test whether IME input works, because IMEs are disabled under cp65001 in the default console. Under Conemu, it still crashes. |
I can't reproduce the problem. I just tried on a fresh Windows 8 x64 machine with Julia 0.3, and entering Unicode characters via |
I would also be happy to just default to the ijulia repl (also true for Mac). (Note that mintty also requires stty.exe, but we are bundling a 200 MB copy of git, so I don't think an extra meg will matter.) There are actually at least two disjoint issues here: one is that entering characters doesn't always work correctly; the second is that Julia incorrectly reacts to write errors by closing the stream (causing it to crash shortly thereafter in raw!). |
Is it practical to ship IPython? Including Python seems like a can of worms... |
Maybe we should ship the Windows installation with a Linux distribution that is booted when clicking on Julia.exe ;-) (Sorry, I could not resist.) More seriously: does PowerShell suffer from the same issues as cmd.exe (regarding Unicode)? |
+1 for mintty for now. Based on my own experiences, I would be strongly opposed to packaging conemu. Personally I do not have much inclination to be in the business of packaging Python for Windows... But if we are going that way, I would prefer to make Julia conda-installable rather than distributing Python ourselves. |
Yes. It uses the same console. Note that even MS acknowledges the crappiness of the standard console; PowerShell ISE (installed by default on Enterprise, at least) has its own console instead. |
Okay, this is fixed in Julia 0.3 pre. |
@pao The default Windows terminal is designed for DOS compatibility. I think that if Gates had not forced Windows to be DOS-compatible, there would be NO standard CLI available in Windows. P.S. The console APIs DO support Unicode if you use the proper API. |
Bill Gates has had little personal involvement in Microsoft for a while, but anyway. Bundling a better console is a good idea, but only when it's less buggy than the default console. Julia under Mintty is still buggy and not usable enough to be the default yet. IPython is still a few too many installation steps. I personally have no interest in using Python except when forced, so I'd rather have a Conda installation managed by Julia's package manager than the other way around. |
Nice guess, @stevengj. libuv already answered this in 2012 with that same result: nodejs/node-v0.x-archive#4246. Perhaps we could do some voodoo with changing the font along with the codepage: http://msdn.microsoft.com/en-us/library/windows/desktop/ms686200(v=vs.85).aspx. But then there's this gem: https://connect.microsoft.com/VisualStudio/feedback/details/543801/unicode-issues-with-writefile-and-in-the-crt. So basically, the console APIs support Unicode, but the command prompt itself tries (and fails) to do everything in the system code page with a raster font using a broken libc. In summary:
|
Maybe we need to ship a file with a unicode filename (kidding. kind of). |
@vtjnash I think that Julia 0.3 can handle Unicode inputs well. I tried |
Under CP936, CJK characters in the REPL are misencoded. For example, `print("测试")` is encoded into `print("\262\342\312\324")`, where `0xB2E2` is the CP936 encoding of "测" and `0xCAD4` is the CP936 encoding of "试". libuv's TTY adapter (for Windows) might help.
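The byte values quoted in this report can be checked with a short Python sketch. It is only an illustration of the symptom, not of Julia's REPL internals; `octal_escape` and `is_valid_utf8` are hypothetical helper names, and `cp936` is Python's codec name for this code page.

```python
def octal_escape(raw: bytes) -> str:
    """Render each byte as a three-digit octal escape, e.g. 0xB2 -> \\262."""
    return "".join("\\%03o" % byte for byte in raw)


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    encoded = "测试".encode("cp936")   # what a CP936 console hands over
    print(encoded.hex())               # b2e2cad4
    print(octal_escape(encoded))       # \262\342\312\324
    print(is_valid_utf8(encoded))      # False
```

Because the CP936 bytes are not valid UTF-8, a program that assumes UTF-8 input can only show them as raw escapes, which matches the octal sequence reported above.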