-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(Swift) Incorrect highlighting in operator functions #2895
Comments
I'd try to tell you but you gave me screenshots instead of actual code I could copy and paste. :-p :~( |
Sorry! protocol Comparable: Equatable {
static func < (lhs: Self, rhs: Self) -> Bool
static func <= (lhs: Self, rhs: Self) -> Bool
static func > (lhs: Self, rhs: Self) -> Bool
static func >= (lhs: Self, rhs: Self) -> Bool
} Now that I'm more familiar with the grammar, the problem seems to be that the grammar doesn't support protocol requirements. It assumes that the requirement |
Yes, quite right. A few thoughts:
|
Ah you're also going to fall afoul the templating(?) rule |
Is there a way to express a specific ordering of patterns? For example, in a function declaration, the function's name must come before the generic parameters, so in this example: func < <T>() The first |
W.r.t. the TITLE rule, I was looking at the grammar for identifiers in Swift: This seems doable to implement. However, I'm wondering if the 5-hex-digit Unicode characters (such as U+9FFFD) are supported? I'm finding contradicting information regarding |
We have
It might even be simpler to just handle all the operators this way and avoid even touching TITLE. Of course I'm not sure what other syntax Swift has that might get in the way of this or not.
Regarding TITLE I was only talking about fixing it to match the operators... all those unicode characters might be a bit out of scope and better solved with #2756. There are one or two places we use Unicode stuff in grammars now; ie, java: var JAVA_IDENT_RE = '[\u00C0-\u02B8a-zA-Z_$][\u00C0-\u02B8a-zA-Z_$0-9]*'; But as you can see that's very minimal and clean usage. If a grammar instead needs 50 different unicode match rules (as Swift seems to) then it's time to consider a different approach. |
I think it's important when working on grammars to remember the desired goal is to be useful not necessarily correct. Often correctness is impossible (or too hard to accomplish) so we just do the best we can to provide the best highlighting in the most circumstances. |
Is there a (performance) downside to writing it correctly? I've noticed the TextMate grammar uses the following rule for identifiers: I assume that's acceptable? |
That seems a bit much. I'd consider a smaller subset or simplifying the rule to make it "wider" such that is still mostly works even if not 100% correct. This being a case of size/complexity vs correctness. Also anytime I see this UTF-8 stuff I wonder if it's better solved by some of the new UTF-8 stuff we haven't gotten into yet.
Well, usually I'm more concerned about size/complexity since we don't a benchmark suite, but yes performance is also a consideration. More complex grammars take longer to execute than less complex grammars. Some of the TextMate grammars are like 1mb of rules... that's simply not what we're aspiring to do here, so sometimes we look at a thing and go "we're going to have to take a pass on that".
Pretty sure none of this is supported at all unless we're using the new ES2015 |
I'm also trying to find some middle ground. I just want to get rid of incorrect highlighting that may confuse readers. My current plan is try to to implement the grammar for types — especially function types — and their prerequisites, such as operators, identifiers, and attributes. I'm hoping that will allow me to detect the end of a function declaration, even if there's no body, which will fix this issue. |
Sure, just it can be hard to balance sometimes. Historically we err on the side of highlighting "more things" with "good" accuracy vs highlighting less but with 100% accuracy. For example see our regex support in many languages which is often Simply never highlighting regex because we get it VERY wrong 1% of the time would be a huge loss.
Often newer contributors over-engineer their initial solutions. :-) Trying to build language parsers instead of "do the simplest thing". If the regexs for all this is straight forward that might be ok, but in many cases an equivalent thing can often be achieved with a "one line hack"... such as adding additional "short-circuit" characters to the end mode... and often those hacks are much, much easier to maintain long-term than whatever the far more complex "correct" solution would be. So I always first look for the "simple" solution to solve the "issue at hand" vs trying to assemble a 100% accurate Swift parser. |
Quick question here: does the |
I have a quick fix for this ready :) |
Could we make these separate PRs? It's super hard to constantly review a huge PR that keeps growing. It would be much simpler to freeze it, get it solid... and then add these things as smaller and much easier to review PRs that come after. |
I'm not sure what you mean. When in a given mode the contains list + the end expression (plus
This can work in some cases if it doesn't get too complex and if the rules are distinct enough that lack of sequencing is not an issue - as we do not provide great tools for X, then Y, then Z type expressions. |
I suggested combining these issues into a PR because they both involve rewriting the FUNCTION mode. W.r.t. the end/contains, I was wondering in what order these are attempted. First end, then contains? First contains, then end? Or undefined? I was just trying to think of a way to match rules like "may be followed by ..." when there is no clear end character, just the lack of ... following it. |
Yes I was only saying not to mix them with the existing PR... it's already too large.
No.
Huh? I dont' follow. |
Ah yes, the current PR is definitely too large 😅
Currently, the function However, now that I can distinguish a function title of |
Those things don't necessarily need to be handed inside the scope of "function" though. Context-free grammars are better (much easier to debug and maintain) than context-filled grammars, when possible.
And it just may. I think you're saying the same thing I just said. |
…ipts (#2930) This PR improves highlighting for functions, initializers, and subscripts. This includes detailed highlighting for generic parameters as well as function parameters. To differentiate between tuple elements and function params, I've also added a mode to match tuples, whose element names should not be highlighted as params. - Fixes #2895 - Fixes #2857
What exactly is going on here?
@joshgoebel I’ll try to improve the Swift grammar a bit during the holidays. I'll start by looking into the JS regex syntax, then reading the HLJS developer documentation Can you recommend some grammars I should look at for inspiration?
The text was updated successfully, but these errors were encountered: