Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: prioritize paths that need to be processed first during bulk renaming #1801

Merged
merged 7 commits into from
Oct 30, 2024

Conversation

yw1ee
Copy link
Contributor

@yw1ee yw1ee commented Oct 17, 2024

Resolve #1766 . Sort bulk-renamed files in a similar way to topological sort.

@yw1ee yw1ee force-pushed the ordering-bulk-rename branch from 6e36b5c to 44b830d Compare October 17, 2024 13:30
@sxyazi
Copy link
Owner

sxyazi commented Oct 17, 2024

Thanks for the PR! Could you please add tests to the sort() function?

@yw1ee
Copy link
Contributor Author

yw1ee commented Oct 18, 2024

Sure! I add a very simple unittest for sort.

@sxyazi sxyazi force-pushed the ordering-bulk-rename branch from 2fa0d6e to b8341e2 Compare October 19, 2024 14:30
@sxyazi
Copy link
Owner

sxyazi commented Oct 19, 2024

Thanks for the change! I added more tests and found that the last two tests are failing. Could you please fix them?

Specifically, the second test is producing unstable sorting results. It could be:

&[("3", "4"), ("2", "3"), ("1", "3")]

or

&[("3", "4"), ("1", "3"), ("2", "3")]

This might be related to the HashMap being used, which has an unpredictable order. Switching it to a BTreeMap or IndexMap might solve the issue.

As for the last test, it hits todo!("Consider cycle"). Normally, a->b and b->a shouldn't happen, but since the data comes from user input, it's an uncontrollable facto, so we need to handle this case to avoid crashing the program here.

@yw1ee
Copy link
Contributor Author

yw1ee commented Oct 20, 2024

I also had some concerns while writing the code, so thank you for creating tests for that. I have two questions.

  • input validation
    • I think the input of second test case is invalid. 1 -> 3 and 2 -> 3 cannot do in one bulk rename. In the sort function, is it enough to just return the order without validation? I think it will be catched at maybe_exist and not be renamed, so is it ok?
  • how to handle error
    • I think it is sub-problem of the first question. As you mentioned, we can handle the error when cyclic rename detected. But I'm not sure about how to handle this invalidate input and what is proper behavior of sort function.

So, could I ask you, the author of this project, to clearly define the roles and responsibilities of the sort function? If it doesn't really matter, I'll work on it at my own discretion.

@sxyazi
Copy link
Owner

sxyazi commented Oct 20, 2024

I haven't fully understood the implementation of sort, but I think we should try to keep the user's input order and only prioritize certain items on top of that, like:

a1 -> b
a2 -> b
b  -> c

In this case, we have a1 -> b and a2 -> b appearing twice, which means the data itself is invalid.

However, we don't need to take responsibility for invalid data; that's the user's responsibility, and the program should only handle valid data, as we already know that b -> c can be prioritized, so the sorted result should be:

b  -> c
a1 -> b
a2 -> b

Invalid data remain unchanged and are passed to the upper layer for handling, which in this case is maybe_exist.

@yw1ee
Copy link
Contributor Author

yw1ee commented Oct 20, 2024

So do you mean that we will rename a valid subset of the user's input even if the user's full input set is invalid (duplicate target name or cyclic rename)? For that purpose, the role of sort function is just prioritize some rename that could be executed without any concerns, right? Please let me know if I have misunderstood anything.

@sxyazi
Copy link
Owner

sxyazi commented Oct 20, 2024

Yes, the sorting function should only elevate those valid renaming items that have no side effects and can be executed early to the front of the execution queue, assisting the user as much as possible (rather than taking over the user's work and responsibilities) in doing all valid operations, while leaving invalid operations for higher-level resolution — here, that means throwing errors to the user for manual intervention.

@yw1ee
Copy link
Contributor Author

yw1ee commented Oct 20, 2024

One thing I would like to suggest is that the sort function returns a separate rename set containing a cycle. And in the above function, if there is a cycle, how about showing the list to the user on stderr to indicate that the cycle exists and just return (or user interaction to do only valid things or not?).

let (valid, cycle) = Self::sort(old, new);
if !cycle.is_empty() {
    writeln!(stderr, "cyclic rename detected");
    for (old, new) in cycle {
        wrintln!(stderr, "{old} -> {new}");
    }
    return Ok(()); // or user interaction to do or not
}

@sxyazi
Copy link
Owner

sxyazi commented Oct 20, 2024

Sorry - no. I don't really see the practical significance of "cycle detection" for users, as it's more of an internal implementation detail. The program just needs to ensure that it safely handles the valid data provided by the user, while leaving the invalid data for the user to handle manually. Introducing this detection seems to discard the entire bulk rename operation, rather than just the invalid parts

@yw1ee
Copy link
Contributor Author

yw1ee commented Oct 20, 2024

I misthought. I thought that if there is a cycle, the rename will be performed and all the files will be messed up, but it will not be performed because it is caught in maybe_exist.

I make result consistent and just return cyclic rename entries instead of panic. Please check it. Thanks for your comment.

@yw1ee yw1ee force-pushed the ordering-bulk-rename branch from 90bb7c3 to cb69c74 Compare October 20, 2024 03:26
@yw1ee yw1ee force-pushed the ordering-bulk-rename branch from cb69c74 to 2128293 Compare October 20, 2024 03:28
@sxyazi sxyazi force-pushed the main branch 2 times, most recently from ae6c872 to 16881aa Compare October 20, 2024 18:59
@sxyazi
Copy link
Owner

sxyazi commented Oct 21, 2024

Thanks! I tested the latest changes, and it seems the ordering issue still exists. For example, this test will fail:

#[rustfmt::skip]
cmp(
    &[("b", "b_"), ("a", "a_"), ("c", "c_")],
    &[("b", "b_"), ("a", "a_"), ("c", "c_")],
);

@yw1ee
Copy link
Contributor Author

yw1ee commented Oct 22, 2024

I miss reserving user input order. I fix it!

@sxyazi sxyazi force-pushed the main branch 5 times, most recently from 9ddaba4 to ef1a31a Compare October 26, 2024 09:09
@sxyazi
Copy link
Owner

sxyazi commented Oct 29, 2024

Hey @yw1ee!

I did some refactoring to simplify the code, could you please do a final review? If everything looks good I'll merge it 🚀

Copy link
Contributor Author

@yw1ee yw1ee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just one simple question. LGTM 👍

yazi-core/src/manager/commands/bulk_rename.rs Show resolved Hide resolved
@sxyazi
Copy link
Owner

sxyazi commented Oct 30, 2024

Merging now.

Thank you so much for contributing this awesome feature @yw1ee, really appreciate it! 🎉

@sxyazi sxyazi changed the title feat: sort the order of bulk-renamed files feat: prioritize paths that need to be processed first during bulk renaming Oct 30, 2024
@sxyazi sxyazi merged commit 7f30231 into sxyazi:main Oct 30, 2024
6 checks passed
sxyazi added a commit that referenced this pull request Oct 30, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 30, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Analyze the renaming order of bulk-renamed files to avoid conflicts
2 participants