-
-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Checkout of index entries into a directory #315
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth taking a look at the reference in #301 to see what it means for git to handle multi-threading correctly. For one, these probably require different code paths, furthermore special care has to be taken to handle file systems with case folding correctly.
Besides that, it's neat to see what rayon could do, even though ultimately any parallel code has to make due with git_features::parallel
as these allow to turn parallelism off entirely for completely single-threaded builds.
Furthermore I would have expected object ids from entries to be used to load contents to store in destination files, along with a more test-driven approach. For instance, it's unclear how this code gets executed and validated.
As an addition, I think it would be best to reduce the abstraction level to the bare minimum without forcing to create a Furthermore, I think it's beneficial to implement this without threads at first and strive for correctness. From there it should be possible to derive a multi-threaded checkout to reach parity with what For examples on how to avoid dependencies and provide facilities to obtain objects, the Thanks |
Should I retrieve the contents of an |
As mentioned earlier,
For
For handling symlinks and executables I think it's important provide options that mirror options of git itself, which allows to completely ignore the file mode for instance which is typically the case on windows. There is certainly special handling for symlinks on windows as well, along with options that could be passed by the caller. Of course, ultimately there is no way around gating certain functionality depending on the platform or OS, but on top of that git provides options. |
The tests in use git_index::{entry::Mode, State};
use git_odb::{FindExt, Handle};
pub fn copy_index<P: AsRef<Path>>(state: &State, object_handle: &Handle, path: P) -> Result<()> {
let entries = state.entries();
let mut buf = Vec::new();
for entry in entries {
let obj = object_handle.find(entry.id, &mut buf)?;
...
}
...
} |
And that kind of code is ok in tests, but there can't be a dependency of |
8c5e2f4
to
5751a8e
Compare
(Pushed new code.) I have implemented the SYMLINK and FILE modes. TODO:
Am I on the right path ? |
5751a8e
to
82c7867
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a quick review to provide some guidance, while also pointing out the complexity that is in something that seems deceptively simple at first.
Thus it's probably best to take note of everything that is entailed in checking out a git file, and decide what to implement in this PR and maybe what to push into the next one to avoid long-lived PRs.
I strongly recommend to ditch multi-threading in the first iteration, and push it back once everything else works with a focus on correctness, ideally with an encompassing test for certain special behavior. As this can become complex quickly, putting less in each PR is probably more.
Even though it doesn't help to reduce complexity, it should help scoping this PR by looking at the requirements to checking out an entry coming from the respective code in git. There is a lot of cases to handle and to test for.
82c7867
to
c4ca4bc
Compare
The |
I think here is where the executable bit is set - it's done while opening/creating the file. Maybe something similar is possible with Rust as well.
This should probably be outsourced into a function that is passed as parameter, that task cannot be performed by the Generally I recommend looking and noting all features that are currently missing here (but available in git) maybe along with references to the git source code. That way it's possible to determine how far this PR should go with its implementation, and what could be next to get to an MVP level of correctness and capability. |
c4ca4bc
to
2aeb56b
Compare
Done! Had to modify the tests a bit. I set the executable bit only on unix-based systems. On windows, a regular file is created. |
In order to get this PR on track I think it's best to have an in-person chat on how to proceed, so I reached out to you by email and hope you can make time. Thanks a lot. |
2aeb56b
to
d3b0ea7
Compare
|
I think so, but can't look up the git source right now.
I think it's the other way around, the index entries are adjusted to have the new files stat information. |
What do the |
Yes, it's something that can be converted into |
How do I change the timestamps of the |
Feel free to add a method that allows mutable access. |
Why is the time stored in seconds and nanoseconds as u32 ? The nanoseconds will overflow a u32. The std lib stores and returns them as u64 and u128 respectively. |
That's a great question, and I think it's save to assume that it's the fractional part of each second with nanosecond precision. I assume the question arises from having |
d3b0ea7
to
7f76c4f
Compare
Implemented the skip-worktree and timestamp behaviours. Please take a look. |
7f76c4f
to
91c064e
Compare
The review changes have been fixed and implemented. Please take a look. |
Fixed. Please take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe let me resolve conversation items next time so I get to decide if it's resolved or not.
Something you could do to mark them as done is to give it some sort of reaction - that's how I do it in one of my PRs, facilitating the reviewers work who otherwise would have to find and go through resolved comments otherwise before reviewing everything again.
3cdb11d
to
e975b75
Compare
Committed. Please take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's getting there. As a general note, production code should never use unwrap()
.
git-worktree/tests/copy_index/mod.rs
Outdated
for (file1, file2) in repo_files.iter().zip(copy_files.iter()) { | ||
assert!(fs::read(file1)? == fs::read(file2)?); | ||
#[cfg(unix)] | ||
assert!(fs::symlink_metadata(file1)?.mode() & 0b111 << 6 == fs::symlink_metadata(file2)?.mode() & 0b111 << 6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is equality, use assert_eq!
. It's odd that clippy doesn't warn about this.
Especially with assertions that aren't entirely obvious to the untrained eye, can you use the third argument to say what it is testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Especially with assertions that aren't entirely obvious to the untrained eye, can you use the third argument to say what it is testing?
What about the third argument, a message describing what it is supposed to test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: 25a9dc1
e975b75
to
4177d72
Compare
Committed. Please take a look. |
git-worktree/src/lib.rs
Outdated
quick_error! { | ||
#[derive(Debug)] | ||
pub enum Error { | ||
Utf8Error(err: git_object::bstr::Utf8Error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The individual variants never repeat the word Error
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed: eaee855
git-worktree/src/lib.rs
Outdated
from() | ||
display("IO error while writing blob or reading file metadata or changing filetype: {}", err) | ||
} | ||
FindOidError(oid: git_hash::ObjectId, path: std::path::PathBuf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency, this should be called NotFound { oid: git_hash::ObjectId, path: PathBuf}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed: eaee855
git-worktree/src/lib.rs
Outdated
display("IO error while writing blob or reading file metadata or changing filetype: {}", err) | ||
} | ||
FindOidError(oid: git_hash::ObjectId, path: std::path::PathBuf) { | ||
display("unable to read sha1 file of {} ({})", path.display(), oid.to_hex()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no such thing as a SHA1 file
, for all you know an object couldn't be found. This text should be reworded to reflect that.
I like that the path at which the object would be checked out at is mentioned as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same error is used in the git source code you linked: https://github.com/git/git/blob/main/entry.c#L297:L299
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message originates from a commit made 11 years ago, possibly from a time where packs weren't quite a thing yet, nor SHA256. gitoxide
doesn't transcribe git
to Rust, but does its best to improve on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: a4b880c
git-worktree/src/lib.rs
Outdated
let met = std::fs::symlink_metadata(&dest)?; | ||
let ctime = met | ||
.created() | ||
.map_or(Ok(Duration::from_secs(0)), |x| x.duration_since(std::time::UNIX_EPOCH)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be let ctime = met.created()?.duration_since(std::time::UNIX_EPOCH)?;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fail on Unix as unix does not store ctime
. My implementation just sets the ctime
to 0
if it does not exist as you had previously suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great catch!
I always felt a bit estranged about Ok(Duration::from_secs(0))
which can be transformed into (this time with correct error handling) …
let ctime = met
.created()
.ok()
.map(|x| x.duration_since(std::time::UNIX_EPOCH))
.transpose()?
.unwrap_or_default();
… but it's definitely a bit wordy then.
Let's change Ok(Duration::from_secs(0))
into Ok(Duration::default())
at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: a4b880c
git-worktree/tests/copy_index/mod.rs
Outdated
for (file1, file2) in repo_files.iter().zip(copy_files.iter()) { | ||
assert!(fs::read(file1)? == fs::read(file2)?); | ||
#[cfg(unix)] | ||
assert!(fs::symlink_metadata(file1)?.mode() & 0b111 << 6 == fs::symlink_metadata(file2)?.mode() & 0b111 << 6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Especially with assertions that aren't entirely obvious to the untrained eye, can you use the third argument to say what it is testing?
What about the third argument, a message describing what it is supposed to test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's important to stop amending to the same commit, instead, create as many commits on top of this one and push these instead, ideally with more descriptive messages than implemented git-worktree
Every time there is a change I have to look through everything and match all my comments, the time this takes is considerable. This gets frustrating as it seems no great effort is made to fully address all change requests either by applying a fix or by proposing alternatives.
Lastly, it would be great to see this rebased onto the latest main
branch to resolve the conflict.
Thank you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's time to merge main
back and resolve conflicts in the process.
Can't be long now till it's ready for a merge.
git-worktree/src/lib.rs
Outdated
display("IO error while writing blob or reading file metadata or changing filetype: {}", err) | ||
} | ||
FindOidError(oid: git_hash::ObjectId, path: std::path::PathBuf) { | ||
display("unable to read sha1 file of {} ({})", path.display(), oid.to_hex()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message originates from a commit made 11 years ago, possibly from a time where packs weren't quite a thing yet, nor SHA256. gitoxide
doesn't transcribe git
to Rust, but does its best to improve on it.
git-worktree/src/lib.rs
Outdated
let met = std::fs::symlink_metadata(&dest)?; | ||
let ctime = met | ||
.created() | ||
.map_or(Ok(Duration::from_secs(0)), |x| x.duration_since(std::time::UNIX_EPOCH)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great catch!
I always felt a bit estranged about Ok(Duration::from_secs(0))
which can be transformed into (this time with correct error handling) …
let ctime = met
.created()
.ok()
.map(|x| x.duration_since(std::time::UNIX_EPOCH))
.transpose()?
.unwrap_or_default();
… but it's definitely a bit wordy then.
Let's change Ok(Duration::from_secs(0))
into Ok(Duration::default())
at least.
set -eu -o pipefail | ||
|
||
git init -q | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here must be git config commit.gpgsign false
for this to work everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed: a4b880c
git-worktree/tests/copy_index/mod.rs
Outdated
assert_eq!( | ||
fs::symlink_metadata(file1)?.mode() & 0b111 << 6, | ||
fs::symlink_metadata(file2)?.mode() & 0b111 << 6, | ||
"Testing if the permissions of {} and {} are the same", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to prefix with "testing if".
To me it's still unclear what these permissions should mean. Is this a normal file, or an executable? Maybe the mode can be represented in octal to be more meaningful to those who used chmod
on the commandline before?
All prior assert_eq!(…, "msg")
messages should be removed, they are sufficiently clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just checks the bit that represents wether a file is executable or normal and sees if both the files have the same bit.
Changed it to octal: a4b880c
git-worktree/tests/copy_index/mod.rs
Outdated
let repo_files = dir_structure(&path); | ||
let copy_files = dir_structure(output); | ||
|
||
let srepo_files: Vec<_> = repo_files.iter().flat_map(|p| p.strip_prefix(&path)).collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why prefix repo_files
with an s
? What does s
mean? Or asked differently, could these just be inline to avoid having to make up a new name?
If you think it should be clear these are stripped, maybe create a little closure like and use it like this:
assert_eq!(strip_root_path(path, repo_files), strip_root_path(path, copy_files));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inlined: a4b880c
Fixed and resolved conflicts! |
Thanks a lot for your first contribution! There is a lot of potential, and I think it's best channeled after I did my due diligence with analyzing and understanding git-worktree myself. This should lead to smaller tasks that are by themselves better understood - please feel free to subscribe to the tracking issue for updates, and I will ping you once I think it's ready to start. I also recommend looking at the crate as it is now. |
I am closing the PR as github seems to not recognize it's already merged. |
@Byron (CC @AP2008): RE: #315 (review): I would strongly recommend If you'd like me to give some more concrete examples of how this workflow might apply here, it might be the motivation I need to finally commit to writing a blog post about it. :) I've been dealing with similar review pain points in my professional work at @NZXTCorp. |
Thanks @ErichDonGubler , that's a great suggestion and a feature I wasn't aware off. A quick search brought me to this article which revealed everything one would want to know on how to use it effectively. As a summary, I have a learned a lot when trying to create PRs myself and when reviewing them, and it's not exactly working perfectly out of the box as it requires some experience for both involved parties. Lastly, I think it would be great if GitHub would automatically detect |
I wish that this was the case, but AFAIK GitHub has no functionality built around a |
No description provided.