Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UTF surrogate pairs broken #3055

Closed
gramster opened this issue May 14, 2012 · 9 comments
Closed

UTF surrogate pairs broken #3055

gramster opened this issue May 14, 2012 · 9 comments
Assignees
Labels
closed-as-intended Closed as the reported issue is expected behavior

Comments

@gramster
Copy link

These seemed to be working a few days ago, and still work when using tools/build.py/test.py, but no longer work in the editor. For example,

'\uD800\uDC00'

In the editor I now get:

Multiple markers at this line
    - Unexpected token 'ILLEGAL'
    - Expected ',' or ')', but got
     'ILLEGAL'

This is with editor build 7552.

@devoncarew
Copy link
Member

You can reproduce the problem w/ this snippet:

void main() {
  var bar = '\uD800\uDC00';
}

while this code doesn't exhibit the problem:

void main() {
  var bar = '\u0041\u0041';
}

I think it's an issue in the scanner.


Set owner to @scheglov.
Added Area-Editor, Triaged labels.

@scheglov
Copy link
Contributor

  1. I don't believe that this worked "a few days ago". I see code in DartScanner which fires this error, and it was changed last time in February.
  2. I don't see in spec anything about surrogate pairs. But I see \u{HEX DIGIT SEQUENCE}. So, I think that you should specify valid Unicode scalar value, not UTF16 surrogates.

This code works:
var foo = '\u{10000}';
var bar = '\u{10FFFF}';


Added AsDesigned label.

@gramster
Copy link
Author

  1. I am working on uriEncode and uriDecode library functions. I have unit tests for these that use the '\uD800\uDC00' string. That unit test compiled and passed last week in the editor. Believe what you will; that is what I saw. That test now will not compile in the editor, but it will compile and run fine from the command line.
  2. Given 1, we now have inconsistent behavior. I can use '\uD800\uDC00' when running tests from tools/build.py, tools/test.py, but not in the editor. We should be consistent.

@gramster
Copy link
Author

BTW I just checked and I still have the project I created in the editor last week that worked. Here is the exact line that worked last week:

    test("\uD800\uDC00", "%F0%90%80%80");

@scheglov
Copy link
Contributor

  1. Well, I just don't believe in miracles. If DartScanner was not changed, then its behavior was same. What you may be saw is that there was bug in Editor which masked displaying errors (not this specific - all errors and warnings) and now this bug was fixed.
  2. I think only indisputable authority is spec, not just some implementation. Spec has \u{hex_digits}, but not surrogate pairs. I've asked Gilad to clarify this issue.

@scheglov
Copy link
Contributor

Added NeedsInfo label.

@gbracha
Copy link
Contributor

gbracha commented May 15, 2012

As the spec stands, the the hex sequence must be a valid Unicode scalar value. Apparently D800 is not such a value. So the code is illegal.

If this works in one implementation and not another, then we may have bugs - but fixing them will simply ensure that this code is uniformly rejected.

The real question is does it make sense to change the spec (and to what?). I will consult with our local Unicode gurus tomorrow.

@scheglov
Copy link
Contributor

VM was fixed to reject such code.
http://codereview.chromium.org/10388179/

@scheglov
Copy link
Contributor

Added AsDesigned label.

@gramster gramster added Type-Defect closed-as-intended Closed as the reported issue is expected behavior labels May 22, 2012
dart-bot pushed a commit that referenced this issue Sep 13, 2021
New commits in this version:

git -C third_party/pkg/pub log --oneline cd7a43f2109f7e5eb22e73c7f4e15d25fd57598e..d95c5713dda518ed53ada70e00789e6aadbfbe48
d95c5713 (HEAD, origin/master, origin/HEAD) Remove duplicate global invocation on Windows (#3055)
4c5198df master->main (#3101)
e793fd52 More tool/test.dart tweaks (#3097)
1b228edd Report retracted packages (#3093)
4fedb6c5 Tweak strict dependencies error message for `_validateBenchmarkTestTool` (#3087)
e608ab6e Improve test script (#3091)
abf702c4 Detect potential leaks in `dart pub publish`. (#3049)
9941c1f9 Fix broken simplification of prerelease constraints (#3078)
f0cdfa27 format (#3086)
5a1242c6 Fix unicode terminal detection windows (#2933)
58e2296d Dart format (#3084)
1426601c use incremental compilation in the tool/test.dart script (#3075)
9954f851 Fix a typo (#3062)
44489b31 Use relative import for path.dart (#2959)
77702ab1 Always precompile scripts before running them (#3074)

Change-Id: I913ab9e3b5bc7181d483a2de96ec4ad917028b75
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/213262
Reviewed-by: Jonas Jensen <[email protected]>
Commit-Queue: Sigurd Meldgaard <[email protected]>
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-as-intended Closed as the reported issue is expected behavior
Projects
None yet
Development

No branches or pull requests

4 participants