-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sort by key #7963
Sort by key #7963
Conversation
Sort straight imports with `sorted_by_cached_key`
I think it describes what it is more accurately and it's more consistent with the isort terminology
Remove all the `cmp_*` functions
PR Check ResultsEcosystem✅ ecosystem check detected no linter changes. |
.unwrap_or_default(); | ||
let force_to_top = name.map(|name| !settings.force_to_top.contains(name)); // `false` < `true` so we get forced to top first | ||
let maybe_lower_case_name = name | ||
.and_then(|name| (!settings.case_sensitive).then_some(NatOrdString(name.to_lowercase()))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid allocating here (name.to_lowercase()
) in the event that the string is already lowercase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess to do that we would need to use a Cow
. What we could do is have a single NatOrdStr
struct like this:
use std::borrow::Cow;
struct NatOrdStr<'a>(Cow<'a, str>);
whose contents are either owned or borrowed.
I haven't played around with Cow
s much, so I don't know if the overhead is low enough to make it worthwhile to avoid that allocation in case the string is already lowercase, what do you think? If you think it is I'll gladly tweak my implementation :) but in that case I think it's better to use that Cow
-based struct everywhere rather than just for maybe_lower_case_name
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's worth it given that these are gonna lower-cased in the majority of cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pushed the modification, I had it ready just in case :)
} else { | ||
natord::compare_ignore_case(alias1.name, alias2.name) | ||
.then_with(|| natord::compare(alias1.name, alias2.name)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this logic still being captured in the refactor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be because both module_key
and member_key
return a tuple with first maybe_lower_case_name
(which is Some(name.to_lowercase())
if settings.case_sensitive
is false, otherwise None
) and then the module/member name.
So if settings.case_sensitive
is true, then we just compare the names using natord::compare
, and if it's false we first compare the lowercase names with natord::compare
, and then the actual names.
I guess the question is, is comparing two strings that have been turned into lowercase with natord::compare
the same as comparing them directly with natord::compare_ignore_case
? I looked at the source code for natord::compare_ignore_case
and they're just calling to_lowercase
on the char
s individually, but apart from that it seems like what I did should be equivalent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is slightly different, because they also map integers to avoid sorting them lexicographically, right? For example, to avoid putting module10
before module2
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I thought you were talking about the fact that there are two function calls (one case sensitive and one case insensitive).
The correct natrural ordering (typically module2
comes before module10
) is indeed correctly captured in my refactor because I'm relying on natord::compare
to implement the Ord
trait for my types. Was that the question?
If the module name is already lowercase we don't have to call `to_lowercase`, which avoids some allocations (probably a lot since most modules are lowercase)
(Sorry for the delay. This is on me to review and merge, but I'd like to do further testing for correctness since it's a really important code path.) |
2f564f8
to
8163c04
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some benchmarking and additional testing -- looks good to me. Thank you, and sorry for the delay here.
Awesome :) Absolutely no need to apologize, I totally understand and I'm very grateful that you took the time to look deeper into my proposal 🙏 Now I can go back to implementing length sort ^^ |
Summary
Refactor for isort implementation. Closes #7738.
I introduced a
NatOrdString
and aNatOrdStr
type to have a naturally orderedString
and&str
, and I pretty much went back to the original implementation based onmodule_key
,member_key
andsorted_by_cached_key
from itertools. I tried my best to avoid unnecessary allocations but it may have been clumsy in some places, so feedback is appreciated! I also renamed thePrefix
enum toMemberType
(and made some related adjustments) because I think this fits more what it is, and it's closer to the wording found in the isort documentation.I think the result is nicer to work with, and it should make implementing #1567 and the like easier :)
Of course, I am very much open to any and all remarks on what I did!
Test Plan
I didn't add any test, I am relying on the existing tests since this is just a refactor.