Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reset workspace: why? #1850

Closed
1 task
jorgeorpinel opened this issue Jun 7, 2022 · 9 comments · Fixed by #1900
Closed
1 task

Reset workspace: why? #1850

jorgeorpinel opened this issue Jun 7, 2022 · 9 comments · Fixed by #1900
Labels
A: trees Area: SCM and DVC-tracked trees discussion product PR that affects product

Comments

@jorgeorpinel
Copy link
Contributor

jorgeorpinel commented Jun 7, 2022

Extracted from #1816 (review)

image

  • First, I'm using the example-get-started repo on Codespaces and this button doesn't have an effect at all (bug?)

Most importantly, why do we need this? I understand that "Reset workspace" means
git reset --hard HEAD && git clean -f -d -q && dvc checkout
So that's 2 Git operations in the DVC panel. We already have a button from dvc checkout so why add this shortcut? Probably there's a reason (pros) but here are some more cons in advance:

  • It's unclear what it means.
  • Operation too complex?
  • It's "scary" AND actually dangerous.
  • It can be done by other means (terminal).

Finally, if the Git integration doesn't include git reset (nor git clean), should we?

image

@jorgeorpinel jorgeorpinel added discussion product PR that affects product A: trees Area: SCM and DVC-tracked trees labels Jun 7, 2022
@jorgeorpinel
Copy link
Contributor Author

this button doesn't have an effect

p.s. tried with dirty working tree too.

@mattseddon
Copy link
Member

mattseddon commented Jun 7, 2022

Some background on this command. When I was first developing the SCM tree I spent a lot of time looking at how the git extension operates and copied a lot of its functionality. We intentionally want to integrate with the git tree as we cannot directly change anything that exists within that tree.

You mention that "git reset" is not available as a command. It is but it has a different name:

image

I believe this difference in name is to "hide the complexity of the CLI" which is exactly what we are trying to achieve as well. Pressing the equivalent button in the UI gives this behaviour:

Screen.Recording.2022-06-08.at.7.53.35.am.mov

Selecting "discard" will then reset all of the changes and put the workspace back to the previous commit. Personally, I use this command all the time when I'm developing code.

If you extend this action to accommodate a typical data scientists workflow where they want to discard all of the changes in the workspace (because the results of their experiments are not what they wanted and they want to try something different or change to a different branch) you get this:

Screen.Recording.2022-06-08.at.7.47.53.am.mov

When I initially showed the above to @aguschin he immediately responded with "I just want a single button to perform that action" and that is when "Reset The Workspace" was created...

Screen.Recording.2022-06-08.at.7.56.52.am.mov

Hope this helps to give some context as to why the action is needed.

On your first question:

First, I'm using the example-get-started repo on Codespaces and this button doesn't have an effect at all (bug?)

This is actually intended behaviour and again copies the git tree:

Screen.Recording.2022-06-08.at.8.00.35.am.mov

When there is no need/way to reset the workspace, i.e we can't do a DVC checkout after we reset git I have made the action a no-op. We could potentially hide the icon as well but this is more effort.

@jorgeorpinel
Copy link
Contributor Author

jorgeorpinel commented Jun 8, 2022

Hmm... I see that Discard is in there indeed and this copies that function. OK

IMO we shouldn't copy Git on everything. It's been a recurrent discussion in DVC core in fact, as many things are justified that way but Git is not perfect. Far from it (especially their command and option naming, and docs). Also, why not call it "Discard changes", then? DVC has no "reset" command anyway.

That said I better understand the use case, however I'm still unsure that it's intuitive that the DVC panel changes the state of the Git working tree i.e. pressing "Reset workspace" will result in the Git tree going back to clean. Shouldn't we minimize the overlap between panel functions? (rel. #1849)

I'm using the example-get-started repo on Codespaces and this button doesn't have an effect at all (bug?)

This is actually intended behaviour

But even with changes in the Git working tree (on example-dvc-repo w/ Codespaces), nothing happens.

@mattseddon
Copy link
Member

IMO we shouldn't copy Git on everything. It's been a recurrent discussion in DVC core in fact, as many things are justified that way but Git is not perfect. Far from it (especially their command and option naming, and docs). Also, why not call it "Discard changes", then? DVC has no "reset" command anyway.

Yep, I can rename it if it is necessary.

That said I better understand the use case, however, I'm still unsure that it's intuitive that the DVC panel changes the state of the Git working tree i.e. pressing "Reset workspace" will result in the Git tree going back to clean. Shouldn't we minimize the overlap between panel functions? (rel. #1849)

The options should be there to reset if there are DVC changes. We are trying to speed up people's workflow. Providing shortcuts is a must.

But even with changes in the Git working tree (on example-dvc-repo w/ Codespaces), nothing happens.

The action is only available when we have DVC tracked resources in the following states: added, deleted, modified or renamed. We do this because we don't want to mess with the Git tree unless we have to. In an ideal world, we would simply discard changes for Git tracked files that DVC cares about and then run dvc checkout. The current data provided does not support that solution so we reset everything only when it will change the workspace wrt to DVC changes.

@jorgeorpinel
Copy link
Contributor Author

jorgeorpinel commented Jun 14, 2022

What if there are changes to Git-tracked params or metrics files? That's part of the DVC project and especially important to the extension since it focuses on DVC experiments, yet Reset workspace silently ignores those unless there are changes in DVC-tracked files.

p.s. the official definition of DVC workspace includes all Git-tracked files.

I hope there's a way to see whether users actually use this or whether they ignore or misuse it, because we can argue hypotheticals all day long otherwise 😬

@mattseddon
Copy link
Member

Yes, we have analytics attached to the action. We'll be able to see the usage.

@mattseddon
Copy link
Member

What if there are changes to Git-tracked params or metrics files? That's part of the DVC project and especially important to the extension since it focuses on DVC experiments, yet Reset workspace silently ignores those unless there are changes in DVC-tracked files.

p.s. the official definition of DVC workspace includes all Git-tracked files.

Is the suggestion to make the command more aggressive and always blat the workspace? That would be an easy change to make.

@jorgeorpinel
Copy link
Contributor Author

jorgeorpinel commented Jun 15, 2022

I personally still don't love this button but if we keep,

a) Rename Discard Workspace Changes and always quash Git changes (or even better specifically Git-tracked inputs/outputs);

b) Rename Discard DVC-tracked Changes and only reset DVC-tracked files.

From those options I sort of prefer a' (may be easier once there's full DAG/ pipelining support)

@mattseddon
Copy link
Member

I personally still don't love this button but if we keep,

a) Rename Discard Workspace Changes and always quash Git changes (or even better specifically Git-tracked inputs/outputs);

b) Rename Discard DVC-tracked Changes and only reset DVC-tracked files.

From those options I sort of prefer a' (may be easier once there's full DAG/ pipelining support)

Working on sledgehammer version of A. Can revisit once we get data:status.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: trees Area: SCM and DVC-tracked trees discussion product PR that affects product
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants