This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

Flesh out the regex search APIs #35

Merged
merged 3 commits into from
Sep 29, 2017

Conversation

maxbrunsfeld
Contributor

@maxbrunsfeld maxbrunsfeld commented Sep 29, 2017

Problem

While @leroix and I were looking into the performance of the new findWordsWithSubsequence method and its use in autocomplete-plus, we noticed that a huge amount of time was being spent in the Cursor.getCurrentWordBufferRange method.

There are a few reasons why this method is slow:

  • It uses the old TextBuffer.scanInRange API rather than a native search API from superstring.
  • The scanInRange APIs are especially inefficient when they are passed RegExps that can potentially match across line breaks. Currently, RegExps that contain negated character classes (e.g. [^x]) are assumed to be potentially multi-line.
  • The word-regex used by getCurrentWordBufferRange uses a negated character class based on the editor.nonWordCharacters config setting.
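
To make the last two points concrete, here is a minimal sketch in TypeScript. The helpers `escapeForCharClass`, `buildWordRegex` and `looksMultiLine` are hypothetical stand-ins written for illustration, not the actual Atom or text-buffer code, and the `nonWordCharacters` value is just a sample. It shows a word regex built from a negated character class that can never match a line break, yet would still be flagged by a conservative "contains a negated class" check.

```typescript
// Hypothetical helper: escape characters that are special inside [...].
function escapeForCharClass(chars: string): string {
  return chars.replace(/[\\\]\^\-]/g, "\\$&");
}

// Hypothetical word regex: one or more characters that are neither
// whitespace nor one of the configured non-word characters.
function buildWordRegex(nonWordCharacters: string): RegExp {
  return new RegExp("[^\\s" + escapeForCharClass(nonWordCharacters) + "]+", "g");
}

// Assumed heuristic (a sketch, not the real check): any negated character
// class is treated as potentially matching a line break.
function looksMultiLine(regex: RegExp): boolean {
  return /\[\^/.test(regex.source);
}

// A bare negated class really does match a newline...
const bareNegatedMatchesNewline = /[^x]/.test("\n");

// ...but this word regex excludes whitespace, so its matches stay on one line.
const wordRegex = buildWordRegex("()[]{}.,;-");
const words = "foo.bar\nbaz".match(wordRegex) || [];

console.log(bareNegatedMatchesNewline); // true
console.log(looksMultiLine(wordRegex)); // true, although no match spans lines
console.log(words);                     // [ 'foo', 'bar', 'baz' ]
```

The heuristic is safe but pessimistic: the word regex pays the multi-line cost even though none of its matches can contain a newline.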

Solution

We should optimize scanInRange, ideally using superstring's native search functionality.

First step

This PR expands the set of search APIs. The final list of search APIs will be as follows:

  • find
  • findSync
  • findAll
  • findAllSync
  • findInRange
  • findInRangeSync
  • findAllInRange
  • findAllInRangeSync
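
The eight names combine three choices: first match or all matches, whole buffer or a range, and asynchronous or synchronous. The sketch below illustrates only that naming scheme on a plain string. `ToyBuffer`, the `Match` shape and the offset-based range arguments are assumptions made for this example; the real signatures and return types in superstring may differ.

```typescript
// Hypothetical match shape: character offsets into the text.
interface Match {
  start: number;
  end: number;
}

class ToyBuffer {
  private text: string;

  constructor(text: string) {
    this.text = text;
  }

  // Core operation: all matches inside [start, end), synchronously.
  // Note: slicing means anchors and lookarounds see only the range.
  findAllInRangeSync(pattern: RegExp, start: number, end: number): Match[] {
    const flags = "g" + (pattern.ignoreCase ? "i" : "") + (pattern.multiline ? "m" : "");
    const regex = new RegExp(pattern.source, flags);
    const slice = this.text.slice(start, end);
    const matches: Match[] = [];
    let m: RegExpExecArray | null;
    while ((m = regex.exec(slice)) !== null) {
      matches.push({ start: start + m.index, end: start + m.index + m[0].length });
      if (m[0].length === 0) regex.lastIndex++; // avoid looping on empty matches
    }
    return matches;
  }

  findInRangeSync(pattern: RegExp, start: number, end: number): Match | null {
    const all = this.findAllInRangeSync(pattern, start, end);
    return all.length > 0 ? all[0] : null;
  }

  findAllSync(pattern: RegExp): Match[] {
    return this.findAllInRangeSync(pattern, 0, this.text.length);
  }

  findSync(pattern: RegExp): Match | null {
    return this.findInRangeSync(pattern, 0, this.text.length);
  }

  // The non-Sync variants return promises; here they just wrap the sync ones.
  async findAllInRange(pattern: RegExp, start: number, end: number): Promise<Match[]> {
    return this.findAllInRangeSync(pattern, start, end);
  }

  async findInRange(pattern: RegExp, start: number, end: number): Promise<Match | null> {
    return this.findInRangeSync(pattern, start, end);
  }

  async findAll(pattern: RegExp): Promise<Match[]> {
    return this.findAllSync(pattern);
  }

  async find(pattern: RegExp): Promise<Match | null> {
    return this.findSync(pattern);
  }
}

const toy = new ToyBuffer("foo bar foo");
console.log(toy.findSync(/foo/));
console.log(toy.findAllSync(/foo/).length);
console.log(toy.findInRangeSync(/foo/, 1, 11));
```

A real implementation would search natively and run the asynchronous variants off the main thread; this toy version only mirrors the shape of the API.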

Before actually changing TextBuffer.scanInRange, I'm going to update Atom to use the native API just for Cursor.getCurrentWordBufferRange. That will fix the most immediate performance problem. Then later we can take on the more risky task of updating scanInRange.

/cc @nathansobo

@winstliu

winstliu commented Sep 29, 2017

Ooh, this should really help performance for bracket-matcher's HTML tag matching as well!

@maxbrunsfeld
Contributor Author

this should really help performance for bracket-matcher's HTML tag matching

Yeah, I've noticed some lag in bracket-matcher's searching. For bracket-matcher we could probably even use the async search APIs, since the highlight doesn't need to appear synchronously.

@nathansobo
Contributor

@maxbrunsfeld Is the slowness of Cursor.getCurrentWordBufferRange something you just noticed while profiling or is it actually in a code path related to autocompletion? I'm surprised this method gets called super frequently. I'm excited to see these optimizations coming.

@maxbrunsfeld
Contributor Author

Is the slowness of Cursor.getCurrentWordBufferRange something you just noticed while profiling or is it actually in a code path related to autocompletion?

It's in the code path for autocompletion (with both the old and new providers) because we need to avoid returning the word under the cursor as an autocomplete suggestion.
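
As a rough illustration of why the word range is needed there, here is a small TypeScript sketch. `wordsInLine` and `suggestionCandidates` are hypothetical names, and working on a single line with column offsets is a simplification; the real providers operate on the buffer and use Cursor.getCurrentWordBufferRange.

```typescript
interface WordMatch {
  text: string;
  start: number; // column where the word begins
  end: number;   // column just past the word
}

// Collect every word on the line together with its column range.
function wordsInLine(line: string, wordRegex: RegExp): WordMatch[] {
  const regex = new RegExp(wordRegex.source, "g");
  const found: WordMatch[] = [];
  let m: RegExpExecArray | null;
  while ((m = regex.exec(line)) !== null) {
    if (m[0].length === 0) {
      regex.lastIndex++; // skip empty matches
      continue;
    }
    found.push({ text: m[0], start: m.index, end: m.index + m[0].length });
  }
  return found;
}

// Drop the word whose range contains the cursor, keep the rest.
function suggestionCandidates(line: string, cursorColumn: number, wordRegex: RegExp): string[] {
  return wordsInLine(line, wordRegex)
    .filter((w) => !(w.start <= cursorColumn && cursorColumn <= w.end))
    .map((w) => w.text);
}

// Cursor sits at the end of "fo", so "fo" itself is not suggested.
const candidates = suggestionCandidates("foo fo foobar", 6, /\w+/);
console.log(candidates); // [ 'foo', 'foobar' ]
```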

@nathansobo
Contributor

Yee haa!
