[red-knot] Use arena-allocated association lists for narrowing constraints #16306
Conversation
CodSpeed Performance Report: merging #16306 will improve performance by 5.6%.
/// This type provides read-only access to the lists. Use a [`ListBuilder`] to create lists.
#[derive(Debug, Eq, PartialEq)]
pub struct ListStorage<I, K, V> {
    cells: IndexVec<I, ListCell<I, K, V>>,
I wonder if it would make sense to use an arena crate here that avoids moving the vec when resizing and instead simply allocates a new vec, considering that we aren't interested in accessing all items as a slice
This is clever
Oh that's an interesting idea! I'd prefer to experiment on that with a separate future PR, since this is already giving a nice performance gain without bringing on an extra dependency
I tried this and... it's very disappointing haha :D It shows no difference locally.
Subject: [PATCH] Change `iterate` to return a `Result`
---
Index: crates/ruff_index/src/slice.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_index/src/slice.rs b/crates/ruff_index/src/slice.rs
--- a/crates/ruff_index/src/slice.rs (revision b04413079859fd265817c67f184ba7f86e1e92a3)
+++ b/crates/ruff_index/src/slice.rs (date 1740767606877)
@@ -50,6 +50,11 @@
self.raw.len()
}
+ #[inline]
+ pub const fn capacity(&self) -> usize {
+ self.raw.len()
+ }
+
#[inline]
pub const fn is_empty(&self) -> bool {
self.raw.is_empty()
Index: crates/red_knot_python_semantic/src/list.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/red_knot_python_semantic/src/list.rs b/crates/red_knot_python_semantic/src/list.rs
--- a/crates/red_knot_python_semantic/src/list.rs (revision b04413079859fd265817c67f184ba7f86e1e92a3)
+++ b/crates/red_knot_python_semantic/src/list.rs (date 1740767857167)
@@ -61,6 +61,7 @@
//!
//! [alist]: https://en.wikipedia.org/wiki/Association_list
+use std::cell::RefCell;
use std::cmp::Ordering;
use std::marker::PhantomData;
use std::ops::Deref;
@@ -69,7 +70,7 @@
/// A handle to an association list. Use [`ListStorage`] to access its elements, and
/// [`ListBuilder`] to construct other lists based on this one.
-#[derive(Clone, Copy, Debug, Eq, Ord, PartialEq, PartialOrd)]
+#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub(crate) struct List<K, V = ()> {
last: Option<ListCellId>,
_phantom: PhantomData<(K, V)>,
@@ -95,14 +96,82 @@
}
#[newtype_index]
-#[derive(PartialOrd, Ord)]
+// #[derive(PartialOrd, Ord)]
struct ListCellId;
/// Stores one or more association lists. This type provides read-only access to the lists. Use a
/// [`ListBuilder`] to create lists.
#[derive(Debug, Eq, PartialEq)]
pub(crate) struct ListStorage<K, V = ()> {
- cells: IndexVec<ListCellId, ListCell<K, V>>,
+ cells: ListChunk<K, V>,
+}
+
+impl<K, V> ListStorage<K, V> {
+ fn new() -> Self {
+ Self {
+ cells: ListChunk {
+ current: Vec::with_capacity(16),
+ rest: Vec::new(),
+ },
+ }
+ }
+
+ fn push(&mut self, cell: ListCell<K, V>) -> ListCellId {
+ self.push_fast(cell)
+ .unwrap_or_else(|cell| self.push_slow(cell))
+ }
+
+ #[inline]
+ fn push_fast(&mut self, cell: ListCell<K, V>) -> Result<ListCellId, ListCell<K, V>> {
+ let len = self.cells.current.len();
+
+ if len < self.cells.current.capacity() {
+ let list_len = self.cells.current.len();
+ self.cells.current.push(cell);
+ let chunk = self.cells.rest.len() << 28;
+ Ok(ListCellId::from_usize(list_len | chunk))
+ } else {
+ Err(cell)
+ }
+ }
+
+ fn push_slow(&mut self, cell: ListCell<K, V>) -> ListCellId {
+ let mut next_vec = Vec::with_capacity(self.cells.current.capacity() * 2);
+ next_vec.push(cell);
+
+ let current = std::mem::replace(&mut self.cells.current, next_vec);
+ self.cells.rest.push(current);
+
+ let chunk = self.cells.rest.len() << 28;
+ ListCellId::from_usize(chunk)
+ }
+}
+
+impl<K, V> std::ops::Index<ListCellId> for ListStorage<K, V> {
+ type Output = ListCell<K, V>;
+
+ fn index(&self, index: ListCellId) -> &Self::Output {
+ let offset = index.as_u32();
+
+ let chunk = offset >> 28;
+ let list_id = offset & !(0b1111 << 28);
+ let list = self
+ .cells
+ .rest
+ .get(chunk as usize)
+ .unwrap_or(&self.cells.current);
+
+ &list[list_id as usize]
+ }
+}
+
+#[newtype_index]
+struct ChunkId;
+
+#[derive(Debug, Eq, PartialEq, Default)]
+struct ListChunk<K, V = ()> {
+ current: Vec<ListCell<K, V>>,
+ rest: Vec<Vec<ListCell<K, V>>>,
}
/// Each association list is represented by a sequence of snoc cells. A snoc cell is like the more
@@ -151,9 +220,7 @@
impl<K, V> Default for ListBuilder<K, V> {
fn default() -> Self {
ListBuilder {
- storage: ListStorage {
- cells: IndexVec::default(),
- },
+ storage: ListStorage::new(),
scratch: Vec::default(),
}
}
@@ -170,7 +237,8 @@
/// Finalizes a `ListBuilder`. After calling this, you cannot create any new lists managed by
/// this storage.
pub(crate) fn build(mut self) -> ListStorage<K, V> {
- self.storage.cells.shrink_to_fit();
+ self.storage.cells.current.shrink_to_fit();
+ self.storage.cells.rest.shrink_to_fit();
self.storage
}
@@ -182,7 +250,7 @@
/// list.
#[allow(clippy::unnecessary_wraps)]
fn add_cell(&mut self, rest: Option<ListCellId>, key: K, value: V) -> Option<ListCellId> {
- Some(self.storage.cells.push(ListCell { rest, key, value }))
+ Some(self.storage.push(ListCell { rest, key, value }))
}
/// Returns an entry pointing at where `key` would be inserted into a list.
@@ -210,7 +278,7 @@
// (and any succeeding keys) onto.
let mut curr = list.last;
while let Some(curr_id) = curr {
- let cell = &self.storage.cells[curr_id];
+ let cell = &self.storage[curr_id];
match key.cmp(&cell.key) {
// We found an existing entry in the input list with the desired key.
Ordering::Equal => {
@@ -339,8 +407,8 @@
let mut a = a.last;
let mut b = b.last;
while let (Some(a_id), Some(b_id)) = (a, b) {
- let a_cell = &self.storage.cells[a_id];
- let b_cell = &self.storage.cells[b_id];
+ let a_cell = &self.storage[a_id];
+ let b_cell = &self.storage[b_id];
match a_cell.key.cmp(&b_cell.key) {
// Both lists contain this key; combine their values
Ordering::Equal => {
@@ -390,7 +458,7 @@
type Item = &'a K;
fn next(&mut self) -> Option<Self::Item> {
- let cell = &self.storage.cells[self.curr?];
+ let cell = &self.storage[self.curr?];
self.curr = cell.rest;
Some(&cell.key)
}
@@ -531,7 +599,7 @@
type Item = (&'a K, &'a V);
fn next(&mut self) -> Option<Self::Item> {
- let cell = &self.storage.cells[self.curr?];
+ let cell = &self.storage[self.curr?];
self.curr = cell.rest;
Some((&cell.key, &cell.value))
}
I suspect that our lists just aren't big enough for it to matter
#[derive(Debug, Eq, PartialEq)]
struct ListCell<I, K, V>(K, V, Option<I>);

/// Stores one or more _association lists_, which are linked lists of key/value pairs. We
key/value pairs

Technically we don't need full key/value pairs for narrowing constraints (note that the value type is `()`). But I plan on using this in a follow-on PR to also replace the `SmallVec`s of live bindings and declarations in `SymbolState`, which will have both keys and values to store.
///
/// [`Constraint`]: crate::semantic_index::constraint::Constraint
#[derive(Clone, Copy, Debug, Eq, Ord, PartialEq, PartialOrd)]
pub(crate) struct ScopedNarrowingConstraintClause(ScopedConstraintId);
This might seem like an unnecessary extra wrapper, but I have hopes to move `is_positive` from `Constraint` to here, so I wanted to go ahead and have this type available as the place to move it to. Also I think it adds clarity to the distinction between constraints (the expressions in the source code), and narrowing/visibility constraints (the two ways in which we're using those).
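For illustration, here is a rough sketch of the shape that future change could take. The `is_positive` field and accessors below are hypothetical (not part of this PR), and `ScopedConstraintId` is stubbed out so the snippet stands alone:

```rust
/// Hypothetical sketch only: in this PR the type is just a newtype over
/// `ScopedConstraintId`; moving `is_positive` here is a possible follow-up.
type ScopedConstraintId = u32; // stand-in for the real index type

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub(crate) struct ScopedNarrowingConstraintClause {
    constraint: ScopedConstraintId,
    is_positive: bool,
}

impl ScopedNarrowingConstraintClause {
    pub(crate) fn constraint(self) -> ScopedConstraintId {
        self.constraint
    }

    /// Whether the clause narrows on the "truthy" branch of the test expression.
    pub(crate) fn is_positive(self) -> bool {
        self.is_positive
    }
}
```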
This is really fantastic!! I have literally no notes on the actual implementation strategy, just on comments and terminology :)
crates/ruff_index/src/list.rs
Outdated
/// Stores one or more _association lists_, which are linked lists of key/value pairs. We
/// additionally guarantee that the elements of an association list are sorted (by their keys), and
/// that they do not contain any entries with duplicate keys.
///
/// Association lists have fallen out of favor in recent decades, since you often need operations
/// that are inefficient on them. In particular, looking up a random element by index is O(n), just
/// like a linked list; and looking up an element by key is also O(n), since you must do a linear
/// scan of the list to find the matching element. The typical implementation also suffers from
/// poor cache locality and high memory allocation overhead, since individual list cells are
/// typically allocated separately from the heap.
///
/// We solve that last problem by storing the cells of an association list in an [`IndexVec`]
/// arena. You provide the index type (`I`) that you want to use with this arena. That means that
/// an individual association list is represented by an `Option<I>`, with `None` representing an
/// empty list.
///
/// We exploit structural sharing where possible, reusing cells across multiple lists when we can.
/// That said, we don't guarantee that lists are canonical — it's entirely possible for two lists
/// with identical contents to use different list cells and have different identifiers.
///
/// This type provides read-only access to the lists. Use a [`ListBuilder`] to create lists.
This is great. One thing I noticed both in the PR summary and here is that we discuss limitations of this data structure that might mean it's not suitable for some tasks, and how this implementation mitigates some limitations, but we really don't discuss (apart from a brief mention of structural sharing) what the advantages of the data structure are (what kind of tasks is it particularly well-suited for), or (and this might not belong in this comment here, but does belong somewhere) why it is well-suited for narrowing constraints in the use-def map.
Added a bit more text about this, lmkwyt
crates/ruff_index/src/list.rs
Outdated
/// Finds the entry in a list with a given key, and returns its value.
///
/// **Performance**: Note that lookups are O(n), since we use a linked-list representation!
pub fn get(&self, list: Option<I>, key: &K) -> Option<&V> {
This doesn't seem to be used in this PR, and I wonder if we should even provide it, given that if you need it this might not be the data structure for you?
Though IIRC maybe one of your future PRs will try using this for a case where we do sometimes need random access, but not too frequently and where cardinality shouldn't be too high, either? That's probably a good case for having it.
Yes, that's a good point... it's also not currently being tested. I went ahead and removed it, since I can add it back as part of that experiment to see if this will be fast enough for the per-symbol lists too.
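For reference, a lookup on such a list is just a linear walk along the `rest` links, which is where the O(n) comes from. A minimal sketch against simplified stand-in types (a plain slice of `(key, value, rest)` tuples rather than the crate's actual `ListStorage`):

```rust
// Simplified model: cells live in a slice, and a list handle is Option<usize>.
// Lookup walks the `rest` links until the key matches, hence O(n).
fn get<'a, K: Eq, V>(
    cells: &'a [(K, V, Option<usize>)],
    mut curr: Option<usize>,
    key: &K,
) -> Option<&'a V> {
    while let Some(idx) = curr {
        let (k, v, rest) = &cells[idx];
        if k == key {
            return Some(v);
        }
        curr = *rest;
    }
    None
}
```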
crates/ruff_index/src/list.rs
Outdated
let new_key = curr_key.clone();
let new_value = curr_value.clone();
let new_tail = self.insert(*tail, key, value);
self.add_cell(new_key, new_value, new_tail)
If we required values to implement `Eq`, then we could consider the case where the new key already exists in the list with the same value and no updates are needed, so we could avoid needlessly cloning all cells up to the matching key.

This would require an internal recursive `insert` that returns an extra bit of information ("did anything change") wrapped by the public `insert`.

No idea if this would be worth it in practice for our uses. I guess for the case where we know all values are equal (because they are all `()`) we can just as well use `insert_if_needed` instead.
Good idea, done! @MichaReiser's suggestion to add an entry API gave us the internal helper function that you mention
//! A _narrowing constraint_ consists of a list of _clauses_, each of which corresponds with an
//! expression in the source file (represented by a [`Constraint`]). We need to support the
I think there's a bit of a terminology problem here. In this one sentence (and throughout the comments in this file, and code changes throughout this PR) we now have two distinct uses of the term "constraint" to refer to either a) a single test expression which may constrain the type of some binding, or b) a list of test expressions ("clauses"), all of which constrain the type of a binding at a certain point in control flow.
I think we should make a clear decision about which one of these we will use the term "constraint" for, and stick to it. Prior to this PR, the intent was always that a "constraint" (represented by a Constraint) is a single test expression, and a binding at some point in control flow can be associated with multiple narrowing constraints, plural. I realize that's awkward because now you are introducing an arena which contains multiple sets of narrowing constraints, and you want to call that "narrowing_constraints", which shifts all the plurals down one level :)
I'm not opposed to shifting our use of terms such that we say that "a narrowing constraint" (singular) is a list of constraining test expressions, or clauses, or predicates. But if we do that, I think we need to be consistent about it, and IMO that means renaming the `Constraint` ingredient and the `ScopedConstraintId` index to use some other term ("constraint clause" or "predicate"?)
(Open to doing this further rename pass as a separate PR, to minimize the need for re-review of this one.)
and you want to call that "narrowing_constraints", which shifts all the plurals down one level

Yes I agree with this completely. In the commit history I think you can see the point where I realized I needed to name something narrowing_constraintses 😂

I'm not opposed to shifting our use of terms such that we say that "a narrowing constraint" (singular) is a list of constraining test expressions, or clauses, or predicates. But if we do that, I think we need to be consistent about it, and IMO that means renaming the `Constraint` ingredient and the `ScopedConstraintId` index to use some other term ("constraint clause" or "predicate"?) (Open to doing this further rename pass as a separate PR, to minimize the need for re-review of this one.)

I like this suggestion of `Constraint` → `Predicate` — that would allow us to use them for things other than "constraining" in the future, should we need to. (Or maybe put better, the name would suggest or remind us that we could do that)
@@ -35,7 +35,7 @@ use std::sync::Arc;
 ///
 /// But if we called this with the same `test` expression, but the `definition` of `y`, no
 /// constraint is applied to that definition, so we'd just return `None`.
-pub(crate) fn narrowing_constraint<'db>(
+pub(crate) fn infer_narrowing_constraint<'db>(
👍
let mut constraint_tys = narrowing_constraint
    .filter_map(|constraint| infer_narrowing_constraint(db, constraint, binding))
Coming back to the terminology problem: here we iterate over `narrowing_constraint`, and the iteration yields something we also call a `constraint`, which we then pass to a function named `infer_narrowing_constraint`.
As an aside, wow, an actual use case where linked lists are actually useful for once outside of memory management! And are actually more performant!
/// poor cache locality and high memory allocation overhead, since individual list cells are
/// typically allocated separately from the heap.
///
/// We solve that last problem by storing the cells of an association list in an [`IndexVec`]
I don't think that poor cache locality is fully solved by using an arena. A program that always inserts one element per list before inserting another element per list will lead to a similar level of fragmentation, but the arena now comes at the cost of an extra level of indirection.
To make sure I understand: you're talking about a list whose cells don't appear consecutively in the arena, is that right? I think you're right that that wouldn't give us ideal cache locality, but it should still be better than heap-allocated cells, since all of the cells are still in a single contiguous region of memory.
If you're inserting elements into two different lists in an interleaving pattern, then the cells of those two lists will be interleaved, but I think iterating over either of them would still be faster — you'd just have a stride of 2 instead of 1. (Actually -2 instead of -1, since the list cells will end up in the arena in reverse order)
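A toy illustration of that interleaving, using a plain `Vec` arena and `usize` indices rather than the PR's types: two lists built in alternation end up with their cells at alternating arena slots, so walking either one strides through a single contiguous allocation instead of chasing separately allocated nodes.

```rust
// Toy model, not the PR's types: one Vec arena, list handles are Option<usize>.
struct Cell {
    key: u32,
    rest: Option<usize>,
}

fn main() {
    let mut arena: Vec<Cell> = Vec::new();
    let (mut list_a, mut list_b) = (None, None);
    for key in 0..4 {
        // Alternate pushes between the two lists.
        arena.push(Cell { key, rest: list_a });
        list_a = Some(arena.len() - 1);
        arena.push(Cell { key, rest: list_b });
        list_b = Some(arena.len() - 1);
    }
    // list_a's cells sit at indices 6, 4, 2, 0 (and list_b's at 7, 5, 3, 1), so
    // iterating walks the arena with a stride of -2 instead of -1, but every
    // cell still lives in one contiguous region of memory.
    let mut curr = list_a;
    while let Some(idx) = curr {
        print!("{}@{} ", arena[idx].key, idx); // prints "3@6 2@4 1@2 0@0"
        curr = arena[idx].rest;
    }
    println!();
}
```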
crates/ruff_index/src/list.rs
Outdated
match a_key.cmp(b_key) {
    // Both lists contain this key; combine their values
    Ordering::Equal => {
        let new_key = a_key.clone();
        let new_value = combine(a_value, b_value);
        let new_tail = self.intersect(*a_tail, *b_tail, combine);
        self.add_cell(new_key, new_value, new_tail)
    }
    // a's key is only present in a, so it's not included in the result.
    Ordering::Less => self.intersect(*a_tail, b, combine),
    // b's key is only present in b, so it's not included in the result.
    Ordering::Greater => self.intersect(a, *b_tail, combine),
Using recursion here could be problematic for long lists. Could we rewrite the function to use an iterator/loop instead?
Done. Note that this required adding a scratch-space accumulator to the builder. (Implementing these operations iteratively reverses the list, so we do that into a scratch vec, which we then pop new entries off of in reverse to maintain the correct ordering.)
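A sketch of how that can look, on simplified self-contained types rather than the actual `ListBuilder` (keys only, no values or `combine` callback): walking both lists from their `last` cells yields matches largest-key-first, so they go into the scratch vec and are popped back off to rebuild the result with its snoc links in the right order.

```rust
use std::cmp::Ordering;

// Simplified stand-ins for the PR's types: cells live in a plain Vec arena and a
// list handle is Option<usize> pointing at the list's *last* (largest-key) cell.
struct Cell {
    key: u32,
    rest: Option<usize>,
}

struct Builder {
    arena: Vec<Cell>,
    scratch: Vec<u32>,
}

impl Builder {
    fn push_cell(&mut self, key: u32, rest: Option<usize>) -> Option<usize> {
        self.arena.push(Cell { key, rest });
        Some(self.arena.len() - 1)
    }

    // Iterative intersection: matches are discovered largest-key-first, collected
    // into `scratch`, then popped off (smallest-key-first) to rebuild the result
    // so that each new cell's `rest` points at the next-smaller key.
    fn intersect(&mut self, mut a: Option<usize>, mut b: Option<usize>) -> Option<usize> {
        while let (Some(a_id), Some(b_id)) = (a, b) {
            let (a_key, b_key) = (self.arena[a_id].key, self.arena[b_id].key);
            match a_key.cmp(&b_key) {
                Ordering::Equal => {
                    self.scratch.push(a_key);
                    a = self.arena[a_id].rest;
                    b = self.arena[b_id].rest;
                }
                Ordering::Greater => a = self.arena[a_id].rest,
                Ordering::Less => b = self.arena[b_id].rest,
            }
        }
        let mut result = None;
        while let Some(key) = self.scratch.pop() {
            result = self.push_cell(key, result);
        }
        result
    }
}

fn main() {
    let mut builder = Builder { arena: Vec::new(), scratch: Vec::new() };
    // Build {1, 2, 3} and {2, 3, 4}; each handle points at the largest-key cell.
    let (mut a, mut b) = (None, None);
    for k in [1, 2, 3] {
        a = builder.push_cell(k, a);
    }
    for k in [2, 3, 4] {
        b = builder.push_cell(k, b);
    }
    let mut result = builder.intersect(a, b);
    while let Some(id) = result {
        print!("{} ", builder.arena[id].key); // prints "3 2": the shared keys, largest first
        result = builder.arena[id].rest;
    }
    println!();
}
```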
let constraint_tys: Vec<_> = narrowing_constraint
    .filter_map(|constraint| infer_narrowing_constraint(db, constraint, binding))
    .collect();
This was the easiest way to un-reverse the narrowing constraints. I had tried adding a `reverse` to `IntersectionBuilder`, but that ran into problems where there were nested intersections — multiple narrowing constraint clauses some of which are themselves intersections. It would end up reversing the combined result, when you really only want to (un)reverse the outer intersection. In local testing, this didn't seem to affect performance even though we're collecting into a vec. (Presumably because all of the type builders are themselves allocating a fair bit)
And to clarify, in a previous iteration of this PR, we were using cons lists, which meant that the iterator was returning elements in order, and so we didn't need to worry about reversing anything here.
But that meant that you needed to construct lists in reverse order to get the best efficiency, and it seemed easier to figure out how to reverse the output here instead of how to process the constraints in reverse order during semantic index building.
(Also, these, er, constraints are all mentioned in the doc comments)
I guess one last question here. If I'm understanding all this right, it suggests that in the previous version of this diff, we were constructing our lists in the most inefficient order (because we were doing a forward traversal of the IDs and constructing in-order, but using cons lists internally), so our construction was quadratic? But now it's linear.
And yet this version of the PR has roughly the same performance impact as the previous version (around 5%). Does this mostly suggest that our constraint lists tend to be so small that quadratic vs linear doesn't even really show up?
And yet this version of the PR has roughly the same performance impact as the previous version (around 5%). Does this mostly suggest that our constraint lists tend to be so small that quadratic vs linear doesn't even really show up?
Yes, I think that's the right conclusion. I was testing locally on black, which I thought would be large enough for that quadratic difference to show up. I haven't rebased #16311 onto this PR's updates yet, but my (completely unsubstantiated) hope is that's where we were seeing the performance difference. Since it seems more possible that black and tomllib have larger lists of live bindings than lists of active narrowing constraints.
(I had added some printfs to verify that we are indeed visiting constraints in numeric order, and that we do produce a much smaller number of total list cells with the snoc list change)
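To make the quadratic-vs-linear point concrete, here is a toy cell count (not the PR's code), under the assumption that keys are appended in ascending order, as they are when constraint IDs are visited forwards:

```rust
// With cons cells (handle = smallest key), appending the new largest key means
// rebuilding every existing cell so the chain can reach the new tail: O(n) new
// cells per append. With snoc cells (handle = largest key), it's one new cell
// whose `rest` points at the old handle: O(1) per append.
fn cons_cells_for_append(existing_len: usize) -> usize {
    existing_len + 1
}

fn snoc_cells_for_append(_existing_len: usize) -> usize {
    1
}

fn main() {
    let n = 100;
    let cons_total: usize = (0..n).map(cons_cells_for_append).sum();
    let snoc_total: usize = (0..n).map(snoc_cells_for_append).sum();
    // 5050 cells vs. 100 cells for a 100-element list built front-to-back.
    println!("cons: {cons_total}, snoc: {snoc_total}");
}
```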
 .filter_map(|constraint| infer_narrowing_constraint(db, constraint, binding))
-.peekable();
+.collect();
I think the previous version here was aiming to avoid the allocation for this collect? But benchmark suggests it's fine.
Yes that's right, see https://github.com/astral-sh/ruff/pull/16306/files#r1968481090. (Two ships comments passing in the night!)
Co-authored-by: Carl Meyer <[email protected]>
In #16306 (comment), @carljm pointed out that #16306 introduced a terminology problem, with too many things called a "constraint". This is a follow-up PR that renames `Constraint` to `Predicate` to hopefully clear things up a bit. So now we have that:

- a _predicate_ is a Python expression that might influence type inference
- a _narrowing constraint_ is a list of predicates that constrain the type of a binding that is visible at a use
- a _visibility constraint_ is a ternary formula of predicates that defines whether a binding is visible or a statement is reachable

This is a pure renaming, with no behavioral changes.
* main: (38 commits)
  [red-knot] Use arena-allocated association lists for narrowing constraints (#16306)
  [red-knot] Rewrite `Type::try_iterate()` to improve type inference and diagnostic messages (#16321)
  Add issue templates (#16213)
  Normalize inconsistent markdown headings in docstrings (#16364)
  [red-knot] Better diagnostics for method calls (#16362)
  [red-knot] Add argfile and windows glob path support (#16353)
  [red-knot] Handle pipe-errors gracefully (#16354)
  Rename `venv-path` to `python` (#16347)
  [red-knot] Fixup some formatting in `infer.rs` (#16348)
  [red-knot] Restrict visibility of more things in `class.rs` (#16346)
  [red-knot] Add diagnostic for class-object access to pure instance variables (#16036)
  Add `per-file-target-version` option (#16257)
  [PLW1507] Mark fix unsafe (#16343)
  [red-knot] Add a test to ensure that `KnownClass::try_from_file_and_name()` is kept up to date (#16326)
  Extract class and instance types (#16337)
  Re-order changelog entries for 0.9.7 (#16344)
  [red-knot] Add support for `@classmethod`s (#16305)
  Update Salsa (#16338)
  Update Salsa part 1 (#16340)
  Upgrade Rust toolchain to 1.85.0 (#16339)
  ...
This PR adds an implementation of association lists, and uses them to replace the previous `BitSet`/`SmallVec` representation for narrowing constraints.

An association list is a linked list of key/value pairs. We additionally guarantee that the elements of an association list are sorted (by their keys), and that they do not contain any entries with duplicate keys.

Association lists have fallen out of favor in recent decades, since you often need operations that are inefficient on them. In particular, looking up a random element by index is O(n), just like a linked list; and looking up an element by key is also O(n), since you must do a linear scan of the list to find the matching element. Luckily we don't need either of those operations for narrowing constraints!

The typical implementation also suffers from poor cache locality and high memory allocation overhead, since individual list cells are typically allocated separately from the heap. We solve that last problem by storing the cells of an association list in an `IndexVec` arena.
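And a last toy example (simplified stand-in types, not the crate's actual API) of the structural sharing this representation buys with snoc cells: two lists that extend a common base share all of the base's cells and each add only one new cell to the arena.

```rust
// Toy model, not the crate's API: a snoc cell stores a key and a link to the
// cell holding the next-smaller key.
struct Cell {
    key: u32,
    rest: Option<usize>,
}

fn push(arena: &mut Vec<Cell>, key: u32, rest: Option<usize>) -> Option<usize> {
    arena.push(Cell { key, rest });
    Some(arena.len() - 1)
}

fn main() {
    let mut arena = Vec::new();
    // Base list {1, 2}: two cells.
    let one = push(&mut arena, 1, None);
    let base = push(&mut arena, 2, one);
    // {1, 2, 3} and {1, 2, 4} each add a single new cell and share the base's cells.
    let with_3 = push(&mut arena, 3, base);
    let with_4 = push(&mut arena, 4, base);
    assert_eq!(arena.len(), 4); // four cells total backing three lists
    assert_eq!(arena[with_4.unwrap()].rest, base);
    // Walking {1, 2, 3} follows the shared `rest` links: prints "3 2 1".
    let mut curr = with_3;
    while let Some(idx) = curr {
        print!("{} ", arena[idx].key);
        curr = arena[idx].rest;
    }
    println!();
}
```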