inconsistent .op parsing for unicode operators #9684

Closed
stevengj opened this issue Jan 8, 2015 · 6 comments
Labels
parser (Language parsing and surface syntax), unicode (Related to unicode characters and encodings)

Comments

@stevengj
Member

stevengj commented Jan 8, 2015

As discussed on the mailing list, there is an annoying inconsistency in the parser:

julia> parse("5.≤x")
:(5.0 ≤ x) 

julia> parse("5.<=x") 
:(5 .<= x)

It would be nice to fix this by modifying the parser to treat . followed by any Unicode operator in the same way, so that we get "dot" versions of all the Unicode operators rather than having to manually add each operator twice. See also #6929 (comment)
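A minimal sketch of the uniform rule proposed above, written as a toy tokenizer rather than the real Julia parser. The operator list and the names `OPERATORS`, `leading_operator`, and `toy_tokenize` are hypothetical, chosen only for illustration. The single rule is that a `.` immediately followed by any known operator forms a dotted operator, so `.≤` and `.<=` take the same path. Decimal literals such as `5.0` are deliberately out of scope here.

```julia
# Illustrative operator set (an assumption); the real parser's table is much larger.
const OPERATORS = ["<=", "≤", "⊕", "<", "+"]

# Return the longest operator found at the start of `s`, or `nothing`.
function leading_operator(s::AbstractString)
    best = nothing
    for op in OPERATORS
        if startswith(s, op) && (best === nothing || length(op) > length(best))
            best = op
        end
    end
    return best
end

# Split `src` into tokens using one rule for every operator:
# '.' directly followed by an operator becomes a single dotted-operator token.
function toy_tokenize(src::AbstractString)
    tokens = String[]
    i = firstindex(src)
    while i <= lastindex(src)
        c = src[i]
        if isspace(c)
            i = nextind(src, i)
            continue
        end
        # Only look for a dotted operator when the current character is '.'.
        after_dot = c == '.' ? leading_operator(src[nextind(src, i):end]) : nothing
        plain = leading_operator(src[i:end])
        if after_dot !== nothing
            push!(tokens, "." * after_dot)
            i = nextind(src, i) + ncodeunits(after_dot)
        elseif plain !== nothing
            push!(tokens, plain)
            i += ncodeunits(plain)
        else
            # Consume at least one character, then run until whitespace,
            # a '.', or the start of an operator.
            j = nextind(src, i)
            while j <= lastindex(src) && !isspace(src[j]) && src[j] != '.' &&
                  leading_operator(src[j:end]) === nothing
                j = nextind(src, j)
            end
            push!(tokens, src[i:prevind(src, j)])
            i = j
        end
    end
    return tokens
end
```

Under these assumptions, `toy_tokenize("5.≤x")` and `toy_tokenize("5.<=x")` both split into a number, a dotted operator, and an identifier, and adding a new operator to the table automatically gives it a dotted form.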

@jiahao jiahao added parser Language parsing and surface syntax unicode Related to unicode characters and encodings and removed unicode Related to unicode characters and encodings labels Jan 8, 2015
@stevengj stevengj added the unicode Related to unicode characters and encodings label Jan 8, 2015
@jakebolewski
Member

As noted on the mailing list, this particular example is due to Julia's space-sensitive parsing, but operations like:

julia> parse("5 .⊕  10")
ERROR: ParseError("extra token \"10\" after end of expression")
 in parse at string.jl:1257
 in parse at string.jl:1267

could be parsed correctly.

@stevengj
Member Author

stevengj commented Jan 8, 2015

@jakebolewski, look at the example more closely. The problem is not that it is space-sensitive; the problem is that .≤ and .<= are treated differently.

.⊕ is different because it is not treated as an operator at all, regardless of spacing. Although this is a separate problem, my feeling is that the fix is to parse .op consistently regardless of op, and that solution will kill two birds with one stone.

@jakebolewski
Member

In 1101086, I've added the missing Unicode operators that were special-cased by the parser. I agree a more general solution would be nice.

@tkelman
Contributor

tkelman commented Jan 9, 2015

backport pending label for 1101086 - it's harder to lose track of as an issue label than a commit comment

@stevengj
Member Author

stevengj commented Jan 9, 2015

Closing this, as the immediate inconsistency is fixed. If people want more dot parsing of unicode operators (not to mention operators with combining characters like ), that can be a separate issue.

@stevengj stevengj closed this as completed Jan 9, 2015
jakebolewski added a commit that referenced this issue Jan 9, 2015
jakebolewski added a commit that referenced this issue Jan 16, 2015
(cherry picked from commit 9322f20)

Conflicts:
	test/runtests.jl
@tkelman
Contributor

tkelman commented Jan 16, 2015

backported in 68d11e4, and tests in a09b2fa
