Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Iterator#slice_after #7146

Merged
merged 1 commit into from
Dec 5, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
68 changes: 68 additions & 0 deletions spec/std/iterator_spec.cr
Original file line number Diff line number Diff line change
Expand Up @@ -758,4 +758,72 @@ describe Iterator do
iter.rewind.to_a.should eq([1, 2, 2, 3, 3])
end
end

describe "#slice_after" do
it "slices after" do
ary = [1, 3, 5, 8, 10, 11, 13, 15, 16, 17]
iter = ary.slice_after(&.even?)
iter.next.should eq([1, 3, 5, 8])
iter.next.should eq([10])
iter.next.should eq([11, 13, 15, 16])
iter.next.should eq([17])
iter.next.should be_a(Iterator::Stop)
end

it "slices after: #to_a" do
ary = [1, 3, 5, 8, 10, 11, 13, 15, 16, 17]
ary.slice_after(&.even?).to_a.should eq([
[1, 3, 5, 8],
[10],
[11, 13, 15, 16],
[17],
])
end

it "slices after: #rewind" do
ary = [1, 3, 5, 8, 10, 11, 13, 15, 16, 17]
iter = ary.slice_after(&.even?)
iter.next.should eq([1, 3, 5, 8])
iter.next.should eq([10])

iter.rewind
iter.next.should eq([1, 3, 5, 8])
end

it "slices after with reuse = true" do
ary = [1, 3, 5, 8, 10, 11, 13, 15, 16, 17]
iter = ary.slice_after(reuse: true, &.even?)
a = iter.next
a.should eq([1, 3, 5, 8])

b = iter.next
b.should eq([10])

a.should be(b)
end

it "slices after with reuse = array" do
reuse = [] of Int32
ary = [1, 3, 5, 8, 10, 11, 13, 15, 16, 17]
iter = ary.slice_after(reuse: reuse, &.even?)
a = iter.next
a.should eq([1, 3, 5, 8])

b = iter.next
b.should eq([10])

a.should be(b)
a.should be(reuse)
end

it "slices after: non-bool block" do
ary = [1, nil, nil, 2, 3, nil]
iter = ary.slice_after(&.itself)
iter.next.should eq([1])
iter.next.should eq([nil, nil, 2])
iter.next.should eq([3])
iter.next.should eq([nil])
iter.next.should be_a(Iterator::Stop)
end
end
end
5 changes: 5 additions & 0 deletions src/iterable.cr
Original file line number Diff line number Diff line change
Expand Up @@ -45,4 +45,9 @@ module Iterable(T)
def each_with_object(obj)
each.with_object(obj)
end

# Same as `each.slice_after(reuse, &block)`.
def slice_after(reuse : Bool | Array(T) = false, &block : T -> B) forall B
each.slice_after(reuse, &block)
end
end
87 changes: 87 additions & 0 deletions src/iterator.cr
Original file line number Diff line number Diff line change
Expand Up @@ -1290,4 +1290,91 @@ module Iterator(T)
self
end
end

# Returns an iterator for each chunked elements where the ends
# of chunks are defined by the block, when the block's value
# is _truthy_. Thus, the enumerable is sliced just **after** each
# time the block returns a _truthy_ value.
#
# For example, to get chunks that end at each uppercase letter:
#
# ```
# ary = ['a', 'b', 'C', 'd', 'E', 'F', 'g', 'h']
# iter = ary.slice_after(&.uppercase?)
# iter.next # => ['a', 'b', 'C']
# iter.next # => ['d', 'E']
# iter.next # => ['F']
# iter.next # => ['g', 'h']
# iter.next # => Iterator::Stop
# ```
#
# By default, a new array is created and yielded for each slice when invoking `next`.
# * If *reuse* is `false`, the method will create a new array for each chunk
# * If *reuse* is `true`, the method will create a new array and reuse it.
# * If *reuse* is an `Array`, that array will be reused
#
# This can be used to prevent many memory allocations when each slice of
# interest is to be used in a read-only fashion.
def slice_after(reuse : Bool | Array(T) = false, &block : T -> B) forall B
SliceAfter(typeof(self), T, B).new(self, block, reuse)
end

# :nodoc:
class SliceAfter(I, T, B)
include Iterator(Array(T))

def initialize(@iterator : I, @block : T -> B, reuse)
@end = false
@clear_on_next = false

if reuse
if reuse.is_a?(Array)
@values = reuse
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't @clear_on_next = true here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... you are supposed to pass an empty array as a reuse value. It's not documented what happens if you pass an array that has some values already, but in this case they will be part of the first chunk. Maybe that's good. I wouldn't worry about this now.

Copy link
Contributor

@RX14 RX14 Dec 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The obvious usecase for passing an array as reuse is when you've just used the array from another iterator, meaning the array might not be empty. I don't think it's good to pass the burden of calling clear onto the caller, since it's easy to forget. Also, clear on an already-empty array is very cheap, so why not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you want to get chunks but want the first chunk to have some elements beforehand? If the iterator calls clear there's no way you can do that.

I think we are just inventing use cases. Whenever I used reuse I passed a boolean. We can maybe remove the ability to pass an array altogether.

Copy link
Contributor

@RX14 RX14 Dec 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want the first chunk to have extra elements:

  1. that's pretty inflexible, since you can only do it with the first chunk, not the last
  2. using a named arg called reuse: is not the most clear way to achieve this.

For me, reuse is for avoiding memory allocations only. That's it's only usecase. Always calling clear makes that usecase easier with very small extra expense.

else
@values = [] of T
end
@reuse = true
else
@values = [] of T
@reuse = false
end
end

def next
return stop if @end

if @clear_on_next
@values.clear
@clear_on_next = false
end

while true
value = @iterator.next

if value.is_a?(Stop)
@end = true
if @values.empty?
return stop
else
return @reuse ? @values : @values.dup
end
end

@values << value

if @block.call(value)
@clear_on_next = true
return @reuse ? @values : @values.dup
end
end
end

def rewind
@iterator.rewind
@values.clear
@end = false
@clear_on_next = false
self
end
end
end