-
Notifications
You must be signed in to change notification settings - Fork 372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adapt patch behaviour for LF/CRLF compatibility #3349
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, thanks, that probably wasn't very fun. It makes me start wondering, though, wether at this stage we wouldn't be better off reimplementing patch
rather than translating the patch file ? Would that actually be much more work ?
I'd also like to discuss the rewriting for calculating hashes. Maybe there is no other solution, but this somehow feels wrong... Is that how openssl
does it ?
src/core/opamHash.ml
Outdated
let probably_binary name = | ||
List.mem (Filename.extension name) [".zip"; ".tar"; ".gz"; ".tgz"; ".tbz"; ".txz"; ".tlz"] | ||
in | ||
if OpamStd.Option.map_default ((=) primary) true target || probably_binary file then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do like the Std.Option
combinators, but target = None || target = Some primary
would probably be easier to read here ^^.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh all right, then 🙂
13b98af
to
8332bf8
Compare
Finally, some notes on this! The first issue is the hashing. openssl has nothing to solve - the files are fundamentally different here because (usually) of Git. Internet protocols typically use CRLF (email, http, etc.), so this doesn't come up since the standards already declare the data to be hashed as being supposed to have CRLF line endings. I have to confess that it just feels tediously necessary, rather than wrong to me. All that is potentially happening is that we identify a simple and, at least I hope, non-exploitable transformation of the file which may have occurred. The alternative would be to force (presumably) Unix line endings, but this seems highly unsatisfactory to me, especially as the one likely to get pinged about the confusing bug reports 😉 |
I guess we could re-implement patch, yes. There would be another benefit, in that I've certainly hit a problem with the My only concern is how easy it is to replicate the same kind of fuzzy patching as standard patch, but in fairness I think that that's just a matter of seeing whether the patch applies near to the line numbers given when it doesn't apply exactly - I think the context still has to match exactly. |
Impact for each of these:
|
Thanks for the work! |
@rjbou - no, Git can't handle this, that's the point. You still don't know what's actually checked into the repo (see myriad guides on how to convert repos to use core.autocrlf which weren't using it correctly in the first place - the OCaml change was a nightmare) |
I'm extremely happy to polish this patch off properly for 2.0.0, it just needs the further discussion as to whether it's headed in the correct direction. |
Incidentally, for context diffs, one thought I had had was to eliminate the processing - i.e. just emit context diffs as they are, possibly with a warning. Surely no one is actually using them these days for real patches? (that would deal with the |
src/core/opamSystem.ml
Outdated
|> OpamStd.Option.map_default f 1 | ||
in | ||
`Processing (`Unified, orig, target, crlf, `Chunk (neg, pos)) | ||
with _ -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to keep global exception handling, especially C-c management, try [...] with _ -> [..]
should never be used. See OpamStd.Exn.fatal
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, indeed - I feel the need for a compiler warning to help lazy programmers like myself with this one!
OK, this needs some more testing (principally that I must rebase #3260 to use this version instead of the old one), but this revised version:
There is a note about dealing with renaming files. This is something Git certainly does but, as noted in ocaml/opam-repository#11207 (comment), this is not portable, so its absence here is not (I think) critical - it should be the case that the patch will pass through unaltered since the code would incorrectly identify it as a new file. |
Thanks! About the patches, since we already restricted patches to apply with And about the hashing function, does that mean that hashing a whole file is now equivalent to hashing the string as obtained through |
of operation - in the Lines mode, it returns a string list of the lines, in | ||
CrLf mode, it returns true if every line ends "\r\n". *) | ||
type _ read_return = Lines : string list read_return | ||
| CrLf : bool read_return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just define two functions ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where's the fun in that 🙂 More seriously, they're very similar, but not quite identical functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd argue that you could factorise the core in read_lines_aux
, and define the two functions with just the last lines. But well, doesn't really matter ^^.
sufficient documented detail of Context diffs to be able to parse them | ||
without resorting to reverse-engineering code. It is unusual to see them | ||
these days, so for now opam just emits a warning if a Context diff file is | ||
encountered and does no processing to it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, maybe just add a line to the doc where it explains the patch format (and that they must apply with -p 1
) that they should be unified diffs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the sentence in the docs
The aim here is that patch files should never emit "stripping CR warnings", but CR endings will be left/added as necessary if the target file requires them.
Docs updated and I also factored out the chunk processing code to |
@AltGr - indeed, the problem is that you don't know what the patch files themselves may be. While we can enforce that opam-repository uses Unix line-endings, you can't in general enforce that, even by |
For the hashing, sort of - it means that it will hash the file both as if it had been read with |
Thanks! |
A rather more controversial part of #3260!
This relies ontwothree bits of #3346, but the actual content of this PR still needs work/discussion anyhow.Further comment
to followbelow.