-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix parse timing by separating consume() into fetch() and consume() #2054
Conversation
Fix KaTeX#1989, and generally cleanup parse timing (many fewer hoops to jump through) by defining two methods on parser: 1. `consume()` marks the current token (`nextToken`) as "done", but doesn't yet fetch the next token (setting `nextToken` to `null`). 2. `fetch()` fetches the next token if we don't already have one (e.g., if last token was `consume()`d). Before this change, `consume()` did both actions. By separating them, and allowing the parser to live in a state for a while where `nextToken` is `null`, it is far easier to change settings (in particular, math/text mode and catcodes) before reading the next token, in a way that depends on what we're parsing. For example, if an argument should be treated in text mode, we can just set the mode in the argument parser, instead of when the previous token was consumed. Similarly, if an argument should be treated as a URL, we can just set the catcode of `%` in the URL argument parser, and reset it after. We no longer have to take care to reset things before calling `consume()`. This change mostly involves changing `this.nextToken` to `this.fetch()`. In a perfect world, we could use slightly fewer calls to `this.fetch()`, but Flow doesn't realize that `this.nextToken` will be non-null after a call to `this.fetch()`, so we need to use a few more calls to `this.nextToken()` or a few more local `nextToken` variables.
Codecov Report
@@ Coverage Diff @@
## master #2054 +/- ##
=========================================
- Coverage 94.83% 94.8% -0.03%
=========================================
Files 81 81
Lines 5281 5277 -4
Branches 923 924 +1
=========================================
- Hits 5008 5003 -5
- Misses 249 250 +1
Partials 24 24
Continue to review full report at Codecov.
|
This is great! I'll try to review this sometime this week. |
Thank you for the PR. This would allow manipulation of |
@ylemkimon Thanks for the review! I do think this will be a helpful step for parsing. I hadn't thought about #924 in a while, but it would be great if it helps with that! |
Fix #1989, and generally cleanup parse timing (many fewer hoops to jump through) by defining two methods on parser:
consume()
marks the current token (nextToken
) as "done", but doesn't yet fetch the next token (settingnextToken
tonull
).fetch()
fetches the next token if we don't already have one (e.g., if last token wasconsume()
d).Before this change,
consume()
did both actions. By separating them, and allowing the parser to live in a state for a while wherenextToken
isnull
, it is far easier to change settings (in particular, math/text mode and catcodes) before reading the next token, in a way that depends on what we're parsing. For example, if an argument should be treated in text mode, we can just set the mode in the argument parser, instead of when the previous token was consumed. Similarly, if an argument should be treated as a URL, we can just set the catcode of%
in the URL argument parser, and reset it after. We no longer have to take care to reset things before callingconsume()
.This change mostly involves changing
this.nextToken
tothis.fetch()
. In a perfect world, we could use slightly fewer calls tothis.fetch()
, but Flow doesn't realize thatthis.nextToken
will be non-null after acall to
this.fetch()
, so we need to use a few more calls tothis.nextToken()
or a few more localnextToken
variables.I'm kind of impressed how such a major change could be relatively easy to do. It's been extremely helpful to have so many tests and type checking in our codebase. For reference, I previously tried to change all the methods to load the token when they need them, and not bother marking used tokens as consumed, so that
nextToken
is nevernull
. This proved very difficult (maybe impossible) because we sometimes get nested deep withinparseExpression
and a single token needs to trigger all of them to terminate (e.g.,\fbox{$\displaystyle x$}
, which is what\boxed{x}
expands to, the last$
token ends the\displaystyle
implicit group and the math mode). So I opted for this approach, which is a much smaller (and more obviously correct) change to the code.