feat: optimize common case of GlobPath #180
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -105,6 +105,15 @@ func Distance(a, b string, f CostFunc, cut int64) int64 { | |
// * - Any zero or more characters, except for / | ||
// ** - Any zero or more characters, including / | ||
func GlobPath(a, b string) bool { | ||
if !wildcardPrefixMatch(a, b) { | ||
// Fast path. | ||
return false | ||
} | ||
if !wildcardSuffixMatch(a, b) { | ||
// Fast path. | ||
return false | ||
} | ||
|
||
a = strings.ReplaceAll(a, "**", "⁑") | ||
b = strings.ReplaceAll(b, "**", "⁑") | ||
return Distance(a, b, globCost, 1) == 0 | ||
|
@@ -125,3 +134,37 @@ func globCost(ar, br rune) Cost { | |
} | ||
return Cost{SwapAB: 1, DeleteA: 1, InsertB: 1} | ||
} | ||
|
||
// wildcardPrefixMatch compares whether the prefixes of a and b are equal up | ||
// to the shortest one. The prefix is defined as the longest substring that | ||
// starts at index 0 and does not contain a wildcard. | ||
func wildcardPrefixMatch(a, b string) bool { | ||
ai := strings.IndexAny(a, "*?") | ||
bi := strings.IndexAny(b, "*?") | ||
if ai == -1 { | ||
ai = len(a) | ||
} | ||
if bi == -1 { | ||
bi = len(b) | ||
} | ||
mini := min(ai, bi) | ||
return a[:mini] == b[:mini] | ||
} | ||
|
||
// wildcardSuffixMatch compares whether the suffixes of a and b are equal up | ||
// to the shortest one. The suffix is defined as the longest substring that ends | ||
// at the string length and does not contain a wildcard. | ||
func wildcardSuffixMatch(a, b string) bool { | ||
ai := strings.LastIndexAny(a, "*?") | ||
la := 0 | ||
if ai != -1 { | ||
la = len(a) - ai - 1 | ||
} | ||
lb := 0 | ||
bi := strings.LastIndexAny(b, "*?") | ||
if bi != -1 { | ||
lb = len(b) - bi - 1 | ||
} | ||
minl := min(la, lb) | ||
The opposite would be max here. At first I thought this was wrong, again because I had the prefix logic in mind and didn't realize that here it's the minimum of the delta rather than the minimum of the position, per the logic above.
||
return a[len(a)-minl:] == b[len(b)-minl:] | ||
} |
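To make the behavior of the two fast-path helpers concrete, here is a minimal, self-contained sketch that mirrors them and runs them on a few inputs. The sample paths and the `minInt` helper are assumptions added for illustration (the diff itself uses the built-in `min`); they are not part of the pull request.

```go
package main

import (
	"fmt"
	"strings"
)

// minInt is a hypothetical helper so the sketch also builds on Go versions
// that predate the built-in min used in the diff.
func minInt(x, y int) int {
	if x < y {
		return x
	}
	return y
}

// wildcardPrefixMatch mirrors the helper in the diff: it compares the
// wildcard-free prefixes of a and b, truncated to the shorter of the two.
func wildcardPrefixMatch(a, b string) bool {
	ai := strings.IndexAny(a, "*?")
	bi := strings.IndexAny(b, "*?")
	if ai == -1 {
		ai = len(a)
	}
	if bi == -1 {
		bi = len(b)
	}
	n := minInt(ai, bi)
	return a[:n] == b[:n]
}

// wildcardSuffixMatch mirrors the helper in the diff: it compares the
// wildcard-free suffixes of a and b, truncated to the shorter of the two.
// As in the diff, a string without any wildcard gets a suffix length of 0.
func wildcardSuffixMatch(a, b string) bool {
	la := 0
	if ai := strings.LastIndexAny(a, "*?"); ai != -1 {
		la = len(a) - ai - 1
	}
	lb := 0
	if bi := strings.LastIndexAny(b, "*?"); bi != -1 {
		lb = len(b) - bi - 1
	}
	n := minInt(la, lb)
	return a[len(a)-n:] == b[len(b)-n:]
}

func main() {
	// Illustrative inputs only.
	fmt.Println(wildcardPrefixMatch("/api/*/users", "/web/index"))    // false
	fmt.Println(wildcardPrefixMatch("/api/*/users", "/api/v1/users")) // true
	fmt.Println(wildcardSuffixMatch("/api/*.json", "/api/*.yaml"))    // false
	fmt.Println(wildcardSuffixMatch("/api/*.json", "/api/v1.json"))   // true
}
```

Note that in the last call the second path contains no wildcard, so its suffix length is 0 and the suffix check passes trivially; in that case the decision is left to the `Distance` call in `GlobPath`.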
These algorithms are exact opposites of one another, so I don't see why they weren't implemented as such.
For example, if we look at the prefix function, the opposite would be:
Right? Doing something entirely different, besides being more work, also seems to add unnecessary cognitive load.
I am probably thinking of the wrong algorithm but I do not think the above will work. We use the indexes for the prefix because they coincide with the prefix length, which is what we really want. Conversely, for the suffixes we want to compare the lengths of the suffixes to get the minimum one and, as far as I know, we cannot do that using indexes without calculating the deltas first because there is no correspondence between index and length. For example:
We know that the indexes are 3 and 6 respectively. If we take the minimum of the position then I do not see how to get to the fact that we should check only the last character of both strings because the minimum length of the suffix is 1.
We must be talking about different things. I'm not suggesting a change in algorithm, but rather just pointing out that the algorithm was laid out differently between the suffix and prefix version. The actual comparison is exactly the same at the end.
I am talking about this sentence for example:
I cannot make it like the prefix logic, which uses the position, because for the suffix the position != the length of the suffix. That means I have to use the deltas, unless I am missing something.
Yes, I think you are missing something.
What's the difference between `a[len(a)-minl:]` and `a[ai:]`? :)
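One possible reading of this question, sketched under assumptions: when a string contains a wildcard and its own suffix length is used, slicing by length and slicing by index select the same bytes, because `len(s) - (len(s)-i-1)` equals `i+1`. Strictly, the index form needs the `+1` to skip the wildcard itself, and the two forms only coincide for the string whose suffix is the shorter one. The `tails` helper and the sample inputs are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// tails (hypothetical helper) returns the wildcard-free suffix of s computed
// two ways: by suffix length and by the index of the last wildcard.
// It returns two empty strings when s contains no wildcard.
func tails(s string) (byLength, byIndex string) {
	i := strings.LastIndexAny(s, "*?")
	if i == -1 {
		return "", ""
	}
	n := len(s) - i - 1 // suffix length, as computed in the diff
	return s[len(s)-n:], s[i+1:]
}

func main() {
	// Hypothetical inputs for illustration.
	for _, s := range []string{"abc*z", "abcdef*yz", "/api/*.json"} {
		byLength, byIndex := tails(s)
		fmt.Printf("%q %q\n", byLength, byIndex)
	}
}
```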