Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser refactor and optimization #462

Merged
merged 29 commits into from
Jan 18, 2017
Merged

Conversation

Tyriar
Copy link
Member

@Tyriar Tyriar commented Jan 9, 2017

Fixes #459

This PR refactors the parser to improve performance and maintainability. Here are the specifics:

  • Pull parsing logic from xterm.js into Parser.ts

  • Use a set of maps when parsing so that the operation is O(1) not O(n), where n is the number of possibilities, this is the main performance gain

  • Pull all input/code handling into InputHandler

  • Pull charsets into Charsets.ts

  • Remove empty functions and commented out code for unimplemented codes, this old code will always be available by looking into git history or at chjj/term.js

  • By looking at the top of Parser.ts you can now see much more easily what is implemented and what is not, eg. p is implemented for prefix ! only:

    csiStateHandler['p'] = (handler, params, prefix) => {
      switch (prefix) {
        case '!': handler.softReset(params); break;
      }
    };

Parsing performance of regular characters is expected to improve ~20% (much more when parsing CSI codes). These are things I have in mind for what to follow this up with:

  • Implement more codes much more easily (eg. Support set cursor style (DECSCUSR) #480)
  • Implement other charsets Support other character sets #481
  • Add parser tests much more easily due to IInputHandler and separation of Terminal and Parser (also fix the escape-sequences-test.js tests we're not passing)
  • Link Terminal and InputHandler closer, perhaps using an InputHandlingTerminal which extends Terminal and implements IInputHandler?
  • Improve InputHandleraddChar, pull wcwidth into own module, etc.
  • Move towards removing Parser's dependence on Terminal so that the Terminal requests parsing from Parser which passes everything along to InputHandler
  • Move parser variables (params, currentParam, prefix postfix, etc) into Parser`
  • Move more out of xterm.js (eventually creating Terminal.ts and xterm.ts (entrypoint) which exposes it

@Tyriar Tyriar added the work-in-progress Do not merge label Jan 9, 2017
@Tyriar Tyriar self-assigned this Jan 9, 2017
@Tyriar Tyriar requested a review from parisk January 9, 2017 00:23
@Tyriar Tyriar force-pushed the 459_parser__on_460 branch from f7710a4 to 672dc1a Compare January 9, 2017 23:17
@Tyriar Tyriar force-pushed the 459_parser__on_460 branch from 49ece2d to 7572dd5 Compare January 9, 2017 23:40
@Tyriar Tyriar changed the title Parser refactor Parser refactor and optimization Jan 11, 2017
@Tyriar
Copy link
Member Author

Tyriar commented Jan 11, 2017

I wrote a little benchmark and this PR is already showing great improvements! The regression below is probably due to overhead in funneling the calls through InputHandler -> Terminal which is a temporary situation.

Before:

  Parser
500000 iterations time: 778ms
Average iteration time : 0.001556ms
    ✓ Single character parse performance (779ms)
500000 iterations time: 18131ms
Average iteration time : 0.036262ms
    ✓ Simple character long line parse performance (18132ms)

After:

  Parser
500000 iterations time: 891ms
Average iteration time : 0.001782ms
    ✓ Single character parse performance (891ms)
500000 iterations time: 15435ms
Average iteration time : 0.03087ms
    ✓ Simple character long line parse performance (15435ms)

On a related note, phantomjs unit tests are fairly easy via mocha-phantomjs.

@parisk
Copy link
Contributor

parisk commented Jan 13, 2017

Great, I think that the one-char overhead won't be even noticed, so we don't have to worry about it 👍 .

@Tyriar
Copy link
Member Author

Tyriar commented Jan 13, 2017

The overhead also appears to have gone away for the most part which long lines with no escape sequences are ~20% faster.

@parisk
Copy link
Contributor

parisk commented Jan 16, 2017

This PR seems to have taken a great course 👍 .

The whole process of input handling is much cleaner right now.

@Tyriar do you have any to-dos in your mind that have to be completed before moving on with reviewing an merging this? If yes, can you put them in the description?

@Tyriar
Copy link
Member Author

Tyriar commented Jan 16, 2017

@parisk I don't think so, there are some TODOs in the code but they are questions and/or can be deferred. I'll update the main PR comment to reflect current state (ready for review) and what is included.

I just did an implementation of CSI Ps SP q #480 which demonstrates what makes up a CSI code implementation now which will probably help you understand the flow Tyriar@1a67893

@Tyriar Tyriar removed the work-in-progress Do not merge label Jan 16, 2017
@Tyriar
Copy link
Member Author

Tyriar commented Jan 16, 2017

@parisk updated

Copy link
Contributor

@parisk parisk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some documentation needed.

@@ -0,0 +1,58 @@
// TODO: Give this a proper type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add license declaration based on #388?

@@ -0,0 +1,1522 @@
import { IInputHandler, ITerminal } from './Interfaces';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add license declaration based on #388?

import { C0 } from './EscapeSequences';
import { CHARSETS } from './Charsets';

export class InputHandler implements IInputHandler {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add JSDoc comment please?

// TODO: We want to type _terminal when it's pulled into TS
constructor(private _terminal: any) { }

public addChar(char: string, code: number): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add JSDoc comment please?

* BEL
* Bell (Ctrl-G).
*/
public bell(): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain in the JSDoc comment what this method is about (ringing the "bell" of the terminal).

this._terminal.currentParam = param;
}

public getParam(): number {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a little bit more in the JSDoc comment what this method does and what does it return?

return this._terminal.currentParam;
}

public finalizeParam(): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a little bit more in the JSDoc comment what this method does?

this._terminal.currentParam = 0;
}

public setPostfix(postfix: string): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a little bit more in the JSDoc comment what this method does and document arguments as well?

this._terminal.postfix = postfix;
}

public skipNextChar(): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a little bit more in the JSDoc comment what this method does?

this._position++;
}

// public repeatChar(): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we delete these comments?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept this in as it will be needed when we convert more to use maps, comments are removed not from output so it doesn't hurt us at all.

@parisk
Copy link
Contributor

parisk commented Jan 18, 2017

Great job @Tyriar! The code is much more readable right now and maintainable. Tests run successfully and everything seems to be working great.

The only thing missing is some documentation comments and I think we can move on with merging 🚀 .

@Tyriar
Copy link
Member Author

Tyriar commented Jan 18, 2017

@parisk I added jsdoc, I didn't document the individual input handler methods but instead in InputHandler's class comment referred to http://invisible-island.net/xterm/ctlseqs/ctlseqs.html which gives more context.

@Tyriar
Copy link
Member Author

Tyriar commented Jan 18, 2017

Here goes 😅

@Tyriar Tyriar merged commit 5bd8033 into xtermjs:master Jan 18, 2017
@Tyriar Tyriar deleted the 459_parser__on_460 branch January 18, 2017 18:49
@Tyriar Tyriar modified the milestone: 2.3.0 Feb 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pull parser into its own class and improve performance
2 participants