Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set and Dictionary should have remove(where:). #78364

Open
dabrahams opened this issue Dec 25, 2024 · 14 comments
Open

Set and Dictionary should have remove(where:). #78364

dabrahams opened this issue Dec 25, 2024 · 14 comments
Assignees
Labels
Dictionary Area → standard library: the `Dictionary` type feature A feature request or implementation standard library Area: Standard library umbrella swift evolution proposal needed Flag → feature: A feature that warrants a Swift evolution proposal

Comments

@dabrahams
Copy link
Contributor

Motivation

It's not possible to implement this operation efficiently from outside the library. You either iterate a snapshot and end up silently copying the collection, or you have to collect the items to be removed in a separate collection and then .subtract that.

Proposed solution

Add remove(where:) to Set and Dictionary.

Alternatives considered

No response

Additional information

No response

@dabrahams dabrahams added feature A feature request or implementation triage needed This issue needs more specific labels labels Dec 25, 2024
@natecook1000 natecook1000 added standard library Area: Standard library umbrella swift evolution proposal needed Flag → feature: A feature that warrants a Swift evolution proposal Dictionary Area → standard library: the `Dictionary` type and removed triage needed This issue needs more specific labels labels Jan 7, 2025
@natecook1000
Copy link
Member

Agreed, with the name removeAll(where:) to match the RangeReplaceableCollection requirement.

@krishpranav
Copy link

Hi, is this still open?

@krishpranav
Copy link

@ maintainers if possible can you assign it to me so that i will raise an PR for this.

@AnthonyLatsis
Copy link
Collaborator

@krishpranav Please note that this addition requires a Swift evolution proposal — hence the corresponding label.

@krishpranav
Copy link

so should i need to create a new proposal like that?

@AnthonyLatsis
Copy link
Collaborator

You are welcome to! Make sure to follow the process document that is linked there.

@krishpranav
Copy link

pitch: (swiftlang/swift-evolution#2647)

@stephentyrone
Copy link
Contributor

stephentyrone commented Jan 13, 2025

@dabrahams I've been thinking about the polarity of this operation. Set already has .filter { ... }, so the most obvious spelling here might be something that riffs off that with a negated predicate (but, finding a good name for it is kind of tricky). Would you prefer remove[All](where:), or is that just the name that came to mind?

Also some vague spitballing about how we might implement a set partition efficiently... (@lorentey, @natecook1000?)

@dabrahams
Copy link
Contributor Author

dabrahams commented Jan 13, 2025

I find filter terrible as a name, notwithstanding term-of-art precedent, because it doesn't tell you the polarity of the predicate, and the word even suggests the opposite polarity (we “filter out,” never in) and yes, I do forget from time to time. See also this thread (and probably others). As noted there I would prefer nonPrimes.where(\.isEven) for the positive cases. But as long as there are remove methods I think it's valuable to preserve the mental link for the developer who starts with the idea of removing something. IIRC I objected to the name removeAll for these, because “all” doesn't add anything and even suggests a close relationship to removeAll(), but I lost that battle (or never fought it) and the ship sailed.

I also think it's possibly a good idea to have readable duplicates of functional term-of-art functions. The latter need to be there so people can find them if they think in those terms, but I'm not sure they're helpful to anyone else.

Since you can't reorder the items in a set, it's not clear to me what you might mean by “a set partition.”

@stephentyrone
Copy link
Contributor

stephentyrone commented Jan 13, 2025

I'm asking, somewhat handwavily, if we can partition the set into two sets that leave the element storage as-is, updating one hash table in-place, and creating a new hash table for the "removed" set. We could provide an overload that doesn't bother to create the removed set for your pure in-place removal operation, but also give people a way to get the removed elements efficiently without making them iterate through them.

@lorentey
Copy link
Member

We can definitely provide an operation that splits a set/dictionary into two parts based on some provided predicate, and in a way that's better than manually looping through items.

We do not have tombstones, so we cannot generally avoid rehashing items, even with in-place removals (because we need to compress collision chains). But there is still plenty of opportunity for optimization -- for example, for bulk operations it's been particularly helpful to precalculate predicate results into a temporary bitmap in an up front pass before selecting a strategy. (The bitmap gives us a precise picture of the size of the results, so we can avoid repeated resizings of newly allocated hash table results, and we can heuristically switch between in-place removals and building a new table based on the number of items that remain in place. Bulk in-place removals are often considerably slower than copying the remaining elements directly into a brand new, well-sized hash table. In-place removals can also leave the table wildly oversized, which is sometimes unexpected/wasteful.)

@dabrahams
Copy link
Contributor Author

dabrahams commented Jan 13, 2025

Since your set representation is resilient, it might be possible to leave the underlying storage alone and simply refer to it from something that stores a bitset to tell you which elements are actually present in that partition. Lookup operations would need to check the bitmap to make sure that element is actually in the partition, and the collection index movement operations would change to skip elements not indicated in the bitmap. You could even share a single bitmap allocation, where the bits tell you which partition each element is in.

@lorentey
Copy link
Member

The standard Set/Dictionary types have mostly frozen representations -- they do not have resilient storage. (Although the hash function is opaque, and we have intentionally left a few storage aspects out of the ABI. E.g., the maximum load factor is one knob we can still tweak; but the lookup algorithm is mostly etched in stone.)

(The generic lookup paths needed to remain fully specializable, to avoid shipping a dramatic performance regression in Swift 5.)

@dabrahams
Copy link
Contributor Author

IMO (sadly) if you're going with remove(where:) instead of removeAll(where:) you need to deprecate/rename the other instances of that suboptimal name in the standard library.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dictionary Area → standard library: the `Dictionary` type feature A feature request or implementation standard library Area: Standard library umbrella swift evolution proposal needed Flag → feature: A feature that warrants a Swift evolution proposal
Projects
None yet
6 participants