Basic implementation of the string pattern API #22466

Merged
merged 9 commits into from
Feb 23, 2015

Conversation

Kimundi
Member

@Kimundi Kimundi commented Feb 17, 2015

This is not a complete implementation of the RFC:

  • only existing methods got updated, no new ones added
  • doc comments are not extensive enough yet
  • optimizations got lost and need to be reimplemented

See rust-lang/rfcs#528

Technically a [breaking-change]

@rust-highfive
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@Kimundi
Member Author

Kimundi commented Feb 17, 2015

r? @aturon

@rust-highfive rust-highfive assigned aturon and unassigned nikomatsakis Feb 17, 2015

    #[inline]
    fn match_starts_at(self, haystack: &'a str, idx: usize) -> bool {
        let mut matcher = self.into_searcher(haystack);
Member

Could this be self.into_searcher(&haystack[idx..]), so that the loop becomes unnecessary?

Member Author

Hm, indeed that could work.

Member

Hm, actually, it may not. For example, asking whether a match for abab starts at index 2 in abababab should return false, since the matches cover indices 0-4 and 4-8. Or does the search look for all matching substrings, even if they overlap?

Member Author

Oh, tricky!

It's true that, as defined, only non-overlapping matches are valid, which means you cannot slice the haystack as a shortcut. But that also means any potential implementer of this default method could not use that trick either, which makes the method kind of useless to even provide, so I think I have to change something:

  • Either allow those methods to give answers that don't need to be identical to the yielded indices of the searcher.
  • Remove the index argument and turn them into something like is_front_match() and is_end_match(), relying on the caller of this method to slice the haystack beforehand.

Both seem fine to me, but the latter is more minimal, so I'll probably go with that one. What do you think?

Member

I prefer the latter approach. As you say, it's more minimal, and I think it's also more clear.
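
The distinction discussed in this thread can be checked against today's standard library. The following is a minimal sketch, assuming the stable `str::match_indices` and `str::starts_with` methods behave like the searcher described here (only non-overlapping matches are yielded). The helper names `searcher_match_starts` and `is_front_match` are hypothetical: they mirror the names floated above and are not the API that landed in this PR.

```rust
/// Hypothetical helper: start indices of the non-overlapping matches
/// that a left-to-right search yields.
fn searcher_match_starts(haystack: &str, needle: &str) -> Vec<usize> {
    haystack.match_indices(needle).map(|(start, _)| start).collect()
}

/// Hypothetical helper named after the `is_front_match()` idea:
/// the caller slices the haystack beforehand, so no index is needed.
fn is_front_match(haystack: &str, needle: &str) -> bool {
    haystack.starts_with(needle)
}

fn main() {
    let haystack = "abababab";
    let needle = "abab";

    // The search reports only the non-overlapping matches at 0 and 4.
    let starts = searcher_match_starts(haystack, needle);
    assert_eq!(starts, vec![0, 4]);

    // Index 2 is therefore not a match start according to the searcher...
    assert!(!starts.contains(&2));

    // ...yet slicing first and testing the front of the slice says "yes".
    assert!(is_front_match(&haystack[2..], needle));
}
```

The two answers disagree at index 2, which is why an index-taking method could not be implemented by slicing, while a front-match method leaves that choice explicitly to the caller.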

@aturon
Member

aturon commented Feb 18, 2015

OK, I've taken a review pass and this looks basically good to me. There's a lot of detail work -- adding documentation, cleanup, etc -- but I think the core functionality is in reasonable shape to land. Thanks @Kimundi!

@@ -677,6 +677,8 @@ macro_rules! iterator {
    fn next(&mut self) -> Option<$elem> {
        // could be implemented with slices, but this avoids bounds checks
        unsafe {
            ::intrinsics::assume(!self.ptr.is_null());
            ::intrinsics::assume(!self.end.is_null());
Member

Is this duplicating the work of #21886 as well? (maybe leave to #21886?)

Member Author

I talked with dotdash about it. It's not exactly duplication, because the assumes are at a different location, and depending on how the code is inlined there might be cases where either of the two changes does not apply. Also, giving LLVM multiple optimization hints can't really hurt.

However, currently these hints are unneeded for my code anyway, because the optimized codepaths that depended on them did not end up in the current iteration of this PR, so if you'd rather have them removed for now I can do that too.
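
As a rough illustration of what an optimizer hint of this kind does, here is a sketch for stable Rust. It does not call `::intrinsics::assume` (an unstable intrinsic); it emulates one with `std::hint::unreachable_unchecked`, which is a substitution made for this example. The `assume` helper and `sum_first_n` are hypothetical names, and whether LLVM actually drops the bounds check depends on inlining and optimization level.

```rust
use std::hint::unreachable_unchecked;

/// Hypothetical stand-in for an `assume` intrinsic.
///
/// Safety: the caller must guarantee that `cond` is true. If it is
/// false, behavior is undefined.
#[inline(always)]
unsafe fn assume(cond: bool) {
    if !cond {
        unsafe { unreachable_unchecked() }
    }
}

/// Sums the first `n` elements, hinting that each index is in bounds.
fn sum_first_n(xs: &[u32], n: usize) -> u32 {
    // Checked precondition, so the hint below is always truthful.
    assert!(n <= xs.len(), "n must not exceed the slice length");
    let mut total = 0u32;
    for i in 0..n {
        // SAFETY: i < n and n <= xs.len() was asserted above.
        unsafe { assume(i < xs.len()) };
        total = total.wrapping_add(xs[i]);
    }
    total
}

fn main() {
    let data = [1, 2, 3, 4];
    assert_eq!(sum_first_n(&data, 3), 6);
}
```

The hint never changes the result of a correct program. It only gives the optimizer a fact it may not be able to prove on its own, which is why stacking several such hints is harmless as long as each one is true.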

Added a few bugfixes and additional testcases
@Kimundi Kimundi force-pushed the str_pattern_ai_safe branch from 505d7ca to c8dd2d0 February 19, 2015 23:58
@aturon
Member

aturon commented Feb 20, 2015

@bors: r+ c8dd2d0 p=1

@bors
Contributor

bors commented Feb 20, 2015

⌛ Testing commit c8dd2d0 with merge 41fb6ac...

@bors
Contributor

bors commented Feb 20, 2015

💔 Test failed - auto-win-32-nopt-t

@alexcrichton
Member

@bors: retry

@bors
Contributor

bors commented Feb 20, 2015

⌛ Testing commit c8dd2d0 with merge 562ee80...

@bors
Contributor

bors commented Feb 20, 2015

💔 Test failed - auto-win-32-nopt-t

@Kimundi
Member Author

Kimundi commented Feb 20, 2015

Hm, those failures seem unrelated to the PR?

@Kimundi
Member Author

Kimundi commented Feb 20, 2015

Unless that timeout error is because of the slower string search...

@Kimundi Kimundi changed the title from "WIP Implement the string pattern API" to "Basic implementation of the string pattern API" Feb 20, 2015
@alexcrichton
Member

@bors: retry

Looks unrelated

@bors
Contributor

bors commented Feb 20, 2015

⌛ Testing commit c8dd2d0 with merge cf04813...

@bors
Contributor

bors commented Feb 20, 2015

💔 Test failed - auto-mac-64-nopt-t

@alexcrichton
Member

@bors: retry

@bors
Contributor

bors commented Feb 20, 2015

⌛ Testing commit c8dd2d0 with merge e4fff79...

@bors
Contributor

bors commented Feb 21, 2015

💔 Test failed - auto-linux-64-opt

@Manishearth
Member

---- [pretty] run-pass/cci_capture_clause.rs stdout ----

error: pretty-printed source does not typecheck
status: exit code: 101
command: x86_64-unknown-linux-gnu/stage2/bin/rustc - -Zno-trans --crate-type=lib --target=x86_64-unknown-linux-gnu -L x86_64-unknown-linux-gnu/test/run-pass -L x86_64-unknown-linux-gnu/test/run-pass/cci_capture_clause.stage2-x86_64-unknown-linux-gnulibaux --cfg rtopt --cfg debug -O -L x86_64-unknown-linux-gnu/rt

Could you see if make check-stage1-rpass-pretty TESTNAME=cci_capture_clause passes locally (might be pretty-rpass), and if there's an easy way to fix the error? (Otherwise we might have to ignore-pretty the test)

@Kimundi
Member Author

Kimundi commented Feb 22, 2015

make check-stage2-pretty-rpass TESTNAME=cci_capture_clause passes locally.

(The stage1 one gives me staging compilation errors about Hash not having enough arguments in libtest.)

@Manishearth
Member

@bors: retry

Hm.

bors added a commit that referenced this pull request Feb 22, 2015
@bors
Contributor

bors commented Feb 22, 2015

⌛ Testing commit c8dd2d0 with merge 67eb38e...

@bors bors merged commit c8dd2d0 into rust-lang:master Feb 23, 2015
8 participants