Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

delete! semantics for vector vs. set #3023

Closed
mlubin opened this issue May 5, 2013 · 24 comments
Closed

delete! semantics for vector vs. set #3023

mlubin opened this issue May 5, 2013 · 24 comments
Labels
needs decision A decision on this change is needed
Milestone

Comments

@mlubin
Copy link
Member

mlubin commented May 5, 2013

There's an issue with the overloaded use of delete! to mean removing both a particular index from a vector and a particular element from a set: vectors partially implement set functions, and some set functions return vectors. See below.

julia> s1 = IntSet(10,12,13);s2 = IntSet(14,15,16);s3 = IntSet();

julia> u = union(union(s1,s2),s3)
IntSet(10, 12, 13, 14, 15, 16)

julia> delete!(u,12)
12

julia> u2 = union(s1,s2,s3)
6-element Any Array:
 10
 12
 13
 14
 15
 16

julia> delete!(u2,12)
ERROR: BoundsError()
 in delete! at array.jl:765

This particular case would be fixed by implementing union(s::IntSet,sets::IntSet...) to return an IntSet, but I think the problem is a bit bigger than this.

@JeffBezanson
Copy link
Member

How about delindex!?

@mlubin
Copy link
Member Author

mlubin commented May 6, 2013

Sounds good to me. delindex! for deleting indices in a vector and keys in a dictionary, and delete! for deleting elements in a set.

delete! could be defined for vectors and dictionaries as removing matching elements as well, but maybe delete!(::Dict,key) and delete!(::Array,i) should be deprecated for a while before suddenly changing the meaning of the function.

@StefanKarpinski
Copy link
Member

Since we probably want the same function for vectors and hashes, this needs a little more thought. Maybe deletekey!?

@mlubin
Copy link
Member Author

mlubin commented May 6, 2013

delindex! is a bit awkward for dicts and deletekey! is a bit awkward for vectors. Maybe they should be separate?

@mlubin
Copy link
Member Author

mlubin commented May 6, 2013

The semantics are also different. After removing a key k from dict d, you can no longer access d[k]. For vectors, the elements are shifted, and the index you deleted could still be valid.

@StefanKarpinski
Copy link
Member

Good point about the rather different semantics. This is all part of the final collections API bikeshedding push we need to do.

@mlubin mlubin mentioned this issue May 8, 2013
9 tasks
@JeffBezanson
Copy link
Member

Stefan and I decided we will rename delete! for Vector to splice!, and add a form splice!(A, out, in) that replaces the indexes out with the values in, e.g.

julia> A = [1,2,3,4,5,6];

julia> splice!(A, 2:3, [7,8,9])
[2, 3]

julia> A
[1,7,8,9,4,5,6]

@JeffBezanson
Copy link
Member

In other words, the default value of the third argument to splice! is an empty array.

@StefanKarpinski
Copy link
Member

We could also support variants where the indices to remove/replace are non-contiguous, although that would obviously be more complicated to implement, but that's kind of why it's useful to have, actually.

@StefanKarpinski
Copy link
Member

Actually, scratch that since when the range isn't contiguous it's ambiguous where to put the replacement values.

@kmsquire
Copy link
Member

kmsquire commented Jun 1, 2013

While I can see the problem highlighted here, I'm wondering if there is a better fix than the one committed?

One of the things I like about julia is that it combines performance with enough common sense that most things are intuitive and just work. I think this change goes against that. delete!() from an array is a common enough operation, and deleting by index is very intuitive. Finding splice!() when I'm looking to delete an element from an array seems potentially quite frustrating.

@johnmyleswhite
Copy link
Member

I agree with Kevin that this is a non-intuitive change of names.

-- John

On May 31, 2013, at 8:36 PM, Kevin Squire [email protected] wrote:

While I can see the problem highlighted here, I'm wondering if there is a better fix than the one committed?

One of the things I like about julia is that it combines performance with enough common sense that most things are intuitive and just work. I think this change goes against that. delete!() from an array is a common enough operation, and deleting by index is very intuitive. Finding splice!() when I'm looking to delete an element from an array seems potentially quite frustrating.


Reply to this email directly or view it on GitHub.

@mlubin
Copy link
Member Author

mlubin commented Jun 1, 2013

I'd agree with @kmsquire that there should be a more intuitive and searchable name for ''splice!'' when two arguments are used.

@JeffBezanson
Copy link
Member

The cool part is that splice! can replace all of push!, pop!, append!, prepend!, shift!, unshift!, insert!. delete!, and empty!. So we could potentially replace much or all of this array-modifying interface with a single general function, which is a bigger deal than just a renaming.

@kmsquire
Copy link
Member

kmsquire commented Jun 1, 2013

I actually like those names for the array-modifying interface--they're familiar and intuitive.

(Reminds me of the debate between unix and perl philosophies...)

@JeffBezanson
Copy link
Member

I don't think I'd actually remove all of them; for one thing, manipulating a single item (as in push!) with splice! requires wrapping it in an array.

It looks like the only other name idea out there is delindex!. Any other ideas?

@kmsquire
Copy link
Member

kmsquire commented Jun 1, 2013

From the original issue:

vectors partially implement set functions, and some set functions return vectors

While having vectors partially implement set functions might seem useful, perhaps that is the real issue here. When I'm using an array, I'm not expecting to be able to intersect it with another array. Why not remove the set functionality from vectors (what functionality is this?), and require users to wrap them in sets (or simply use) sets if they want set functionality.

This particular case would be fixed by implementing union(s::IntSet,sets::IntSet...) to return an IntSet, but I think the problem is a bit bigger than this.

@mlubin, can you elaborate on what the bigger problem is here? Even with the current fix, union(s::IntSet,sets::IntSet...) should return an IntSet.

@mlubin
Copy link
Member Author

mlubin commented Jun 1, 2013

The bigger problem was exactly that delete meant different things for vectors and sets while vectors are sometimes used as sets.

@StefanKarpinski
Copy link
Member

The delete! operation for dicts and sets is fundamentally different than removing a range of indices from an array and shifting all the following values down to those indices. That saw the main issue here. This recognizes that and generalizes the latter operation with the ability to provide replacement values and calls it splice!. The name comes from Perl, in which the splice function does exactly this. We considered expressing this as a[i:j] = values and having deletion be expressed as a[i:j] = [] but decided that this had the potential to hide too many bugs where the left and right sides were meant to be exactly the same size but might accidentally not be.

@kmsquire
Copy link
Member

kmsquire commented Jun 1, 2013

I understand the point about semantics of delete! operations being different between dicts/sets and arrays. They are also quite similar concepts (removing elements from a collection), and humans are pretty good about using the same word to describe different but related concepts, and garnering the actual meaning from context. In this particular case, while the actual semantics are different, it's unclear what to me what is gained by using a different word. While semantically different between dicts/sets and arrays, delete! is perfectly clear for removing elements of either, human's generally won't be confused, and the computer will interpret it just fine... unless the user is assuming that an array is a set.

(On the other hand, having separate names for these makes an array-based implementation of OrderedDicts/OrderedSets easier, since elements can now be removed by index (via splice!) or key (via delete!). I guess that pretty much deflates the previous paragraph and makes Stefan's point. :-) )

Cheers!

@JeffBezanson
Copy link
Member

One possibility is only allowing set operations on OrderedSet(array) rather than on arrays directly. After all, there are other obstacles to using arrays as a drop-in replacement for Sets. For example <= is subset for Set but lexicographic for arrays. And | has been defined as union for Set. It would clearly help matters to change those, and use issubset() instead, and insist on using union and not | for Sets.

@toivoh
Copy link
Contributor

toivoh commented Jun 4, 2013

@StefanKarpinski: Wouldn't it still be very useful to provide a two argument version of splice! with non-contiguous indexes?

@StefanKarpinski
Copy link
Member

Yes, that would make sense and would work. It's a bit weird that the two-argument version can take any enumeration of indices whereas the three argument version can only take a range. It feels kind of strange.

@toivoh
Copy link
Contributor

toivoh commented Jun 4, 2013

I guess that an alternative would be to leave splice! as it is and call the version for any kind of indices delindex!. Not sure if that is better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs decision A decision on this change is needed
Projects
None yet
Development

No branches or pull requests

6 participants