Windows 10 reports invalid UTF-8 on the poem in Ch12-02 #1307

Closed
homeisfar opened this issue Apr 8, 2018 · 4 comments

@homeisfar

I've encountered a fairly minor issue while going through Chapter 12. In 12-02, I copied the poem to follow along with the book example. However, the poem contains what Rust/Windows considers to be invalid UTF-8. The relevant part of the poem is below.

I’m nobody! Who are you?
Are you nobody, too?
Then there’s a pair of us — don’t tell!
They’d banish us, you know.

When running my code example that follows the code in the book, read_to_string() fails:

thread 'main' panicked at 'something went wrong reading the file: Custom { kind: InvalidData, error: StringError("stream did not contain valid UTF-8") }', libcore\result.rs:945:5
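To see where decoding fails before `read_to_string()` panics, one option is to read the file as raw bytes and ask the standard library for the offset of the first invalid sequence. This is a minimal sketch, not the book's code; the helper name `first_invalid_utf8_offset` is made up for illustration.

```rust
use std::str;

/// Hypothetical helper (not from the book): returns the byte offset of the
/// first invalid UTF-8 sequence in `bytes`, or `None` if everything decodes.
fn first_invalid_utf8_offset(bytes: &[u8]) -> Option<usize> {
    // `Utf8Error::valid_up_to` is the length of the longest valid prefix,
    // which is also the index of the first offending byte.
    str::from_utf8(bytes).err().map(|e| e.valid_up_to())
}
```

Passing it the bytes returned by `std::fs::read` for the poem file would give an offset to inspect in a hex viewer.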

Using WSL on Win10, I tried to validate the text file like so:

iconv -f UTF-8 poem.txt
iconv: illegal input sequence at position 1

iconv reported invalid UTF-8.

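The offending character is the apostrophe, which is stored as the single byte 0x92 (the right single quotation mark in Windows-1252). That byte is not valid UTF-8 on its own. When I switched all four apostrophes to 0x27 (the ASCII apostrophe), the code ran successfully. The change is simple (note the slightly different apostrophe):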

I'm nobody! Who are you?
Are you nobody, too?
Then there's a pair of us — don't tell!
They'd banish us, you know.

The original version seems to work fine on my Mac, and iconv validated the UTF-8 successfully there. I believe this is a Windows-only issue.
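The manual fix described above (swapping byte 0x92 for 0x27) could be scripted roughly as follows. This is a sketch under the assumption that the file was saved in a single-byte encoding such as Windows-1252, where 0x92 is the curly apostrophe. `ascii_apostrophes` is a hypothetical name, and the function deliberately leaves input that is already valid UTF-8 untouched, since 0x92 can legitimately appear there as a continuation byte.

```rust
use std::str;

/// Hypothetical helper: if `bytes` is NOT valid UTF-8, replace every 0x92
/// byte (assumed to be a Windows-1252 curly apostrophe) with an ASCII
/// apostrophe (0x27). Valid UTF-8 input is returned unchanged.
fn ascii_apostrophes(bytes: &[u8]) -> Vec<u8> {
    if str::from_utf8(bytes).is_ok() {
        // Already decodable; 0x92 here could be part of a multi-byte
        // sequence, so do not touch anything.
        return bytes.to_vec();
    }
    bytes
        .iter()
        .map(|&b| if b == 0x92 { 0x27 } else { b })
        .collect()
}
```

For comparison, the same curly apostrophe (U+2019) in a correctly encoded UTF-8 file is the three-byte sequence E2 80 99, which is why the poem decodes without error when the file is saved as UTF-8.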

@steveklabnik
Member

maybe related to #1129

@carols10cents
Member

Ewwww. I'm going to leave this open for a little bit; we get one more check of the entire book before print and I want to make sure the nostarch ebook doesn't have this problem. I've noted this in #1321 which is where I'm keeping track of the last batch of changes we're going to ask for when we do this final review with nostarch.

Thank you for the thorough bug report!

@Kirszu

Kirszu commented Apr 25, 2018

I have the same problem, and in my case it's not only the apostrophe.
The long dash ( — ) in this line also triggers the error:

Then there's a pair of us — don't tell!

I'm using Windows 8.

@carols10cents
Member

Ok, I had someone copy-paste from the nostarch PDF and it worked, so I don't think that version has the problem. I'll push a fix for the online version in a bit.

carols10cents added a commit that referenced this issue Mar 1, 2021
This isn't always recognized as UTF-8 on all platforms after
copy-pasting.

Fixes #2606
Related to #2539, #1307, #1533
joeljpresent pushed a commit to joeljpresent/rust-book-fr that referenced this issue Apr 13, 2021
This isn't always recognized as UTF-8 on all platforms after
copy-pasting.

Fixes rust-lang#2606
Related to rust-lang#2539, rust-lang#1307, rust-lang#1533