-
Notifications
You must be signed in to change notification settings - Fork 559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Leverage OsString much more deeply througout #112
Conversation
This would be so nice! I ran into a lot of conversions when working on #104 between PathBuf <-> OsString <-> String. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got a few comments, nothing major. Feel free to fix as much as is reasonable and then land this.
env_vars: &[(OsString, OsString)], | ||
pool: &CpuPool) | ||
-> SFuture<HashResult<T>> | ||
{ | ||
let me = *self; | ||
let CCompilerHasher { parsed_args, executable, executable_digest, compiler } = me; | ||
let result = compiler.preprocess(creator, &executable, &parsed_args, cwd, env_vars, pool); | ||
let out_file = parsed_args.output_file().into_owned(); | ||
let out_pretty = parsed_args.output_pretty().into_owned(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering recently if we shouldn't make output_file
return an Rc<String>
or something like that, given how much we copy it around just for logging purposes...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah one thing I was considering in the logging PR was to add the output file to the "local logger's scope" and that way it'd automatically get prepended to all log messages. What would you think of that approach? (I can change that PR over there)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would make things a lot nicer vs. all the cloning we do just to have useful log messages.
.collect::<String>(); | ||
hash_key(&executable_digest, &arguments, &env_vars, &preprocessor_result.stdout) | ||
hash_key(&executable_digest, | ||
&parsed_args.common_args, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're undoing the "remove object file" bit here. Is that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right sorry I meant to ask you about this. I wasn't really sure if this was still necessary. The test here is against output_file
which is what prompted me to rename the relevant method to output_pretty
because the return value of output_file
is just the file name, not the full path.
In that sense I'm not sure that this was 100% working before, and I otherwise couldn't figure out what it was intended to be doing. I thought the output file was already missing from the common_args
?
pub fn hash_key(compiler_digest: &str, arguments: &str, env_vars: &[(OsString, OsString)], | ||
pub fn hash_key(compiler_digest: &str, | ||
arguments: &[OsString], | ||
env_vars: &[(OsString, OsString)], | ||
preprocessor_output: &[u8]) -> String | ||
{ | ||
// If you change any of the inputs to the hash, you should change `CACHE_VERSION`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So HashToSha1
will not change hashes, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah I think in practice this is going to end up disturbing the hashes a bit. The motivation of HashToSha1
was to avoid transmuting OsStr
to bytes (which while valid today isn't something we'd like to commit to in libstd).
Is changing the hash though a problem? I could back that out if it's best left for another time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing the hash is fine, but if you're doing so please bump the cache version as well so we're explicit about it.
src/compiler/compiler.rs
Outdated
|
||
/// Returns an iterator over the results of this compilation. | ||
/// | ||
/// Each item is a descriptive (and unique) name of the outpu paired with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spelling nit: 'output'. Thanks for adding a doc comment!
src/compiler/gcc.rs
Outdated
"-o" => output_arg = it.next(), | ||
"-gsplit-dwarf" => { | ||
match arg.to_str() { | ||
Some("-c") => compilation = true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little clunky, having to put Some
on every match arm. Could you do something like:
let arg = match arg.to_str() {
None => continue,
Some(s) => s,
};
match arg {
...
(I have no idea if that's actually valid Rust)
src/compiler/msvc.rs
Outdated
write!(f, "{}: {} ", objfile, parsed_args.input)?; | ||
let mut f = File::create(cwd.join(depfile))?; | ||
|
||
// TODO: these calls to `display` are lossy ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't really worse than what we had before, where a source file name that wasn't valid Unicode would just fail somewhere along the way, but it's a little worrying that we could be writing bad data here and silently breaking things. In any event, you don't have to fix this right now. I was going to say you could use local_encoding
like we're using to decode the MSVC output, but that only deals with turning strings into bytes and that doesn't actually help. I guess we could open an issue on local_encoding
to support encoding &[u16]
or some such to bytes, and then pass it input.to_wide()
? It'd basically just need to call WideCharToMultiByte
, and I'm sure it's already doing something very similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh looks like WideCharToMultiByte
may actually be supported, I'll play around with calling that (never tried it before!)
{ | ||
// While we could go the extra mile here and handle non-utf8 `OsString` | ||
// instances the rustc compiler certainly does not. With that knowledge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah!
assert_eq!(os_str_bytes(OsStr::new("你好")), "你好".as_bytes()); | ||
pub trait OsStrExt { | ||
fn starts_with(&self, s: &str) -> bool; | ||
fn split_prefix(&self, s: &str) -> Option<OsString>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep thinking that these sorts of things would be really nice in std.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely! We on the libs team have often wanted some form of string-like API added to OsStr
as well, we just never figured out the best way to do so... For now less efficient implementations :(
// Attempt to interpret this OsStr as utf-16. This is a pretty "poor | ||
// man's" implementation, however, as it only handles a subset of | ||
// unicode characters in `s`. Currently that's sufficient, though, as | ||
// we're only calling `starts_with` with ascii string literals. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is clever, but not so clever that it's hard to understand. :)
src/util.rs
Outdated
} | ||
|
||
fn finish(&self) -> u64 { | ||
panic!("not supposed to be "); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This panic string feels unfinished?
Oh! But you should fix whatever caused the CI bustage, obviously. Possibly just an issue being rebased over one of your other PRs?
|
e74738d
to
037cd14
Compare
Ok I think I've updated with all comments, but I'll hold off on merging until I hear back on removing the object file bit changing the hash value. If you wanna take a peek at what I did to |
ea2d2fa
to
ef203dc
Compare
@luser just to confirm, is removing the object file handling ok there? |
Given that the code as written was broken, that's fine. We can reopen the issue that made that change originally. |
Those changes look reasonable. You've got CI bustage, maybe it's just from rebasing over 70b4426 ? That added some new tests that probably need the same fixup you did for other tests. Feel free to land this once you fix that. |
This initially started out tackling mozilla#93 and ended up plumbing the `OsStr` type basically everywhere! This knocked out a TODO or two along the way and should remove some pesky unwraps and/or error handling around handling of arguments and paths that may not be unicode. Some notable other points are: * A few utilities were added to `src/util.rs` to assist with handling os strings * The Rust implementation doesn't deal with os strings in argument parsing at all, but otherwise internally stores os strings for maximal compatibility (more comments in the code). * Some unsafe transmutes were removed in util in favor of a more stable implementation. * The `output_file` function was renamed to `output_pretty` as it turns out it's not actually the actual output file but rather just a pretty description of it (for logging and such). Closes mozilla#93
Ok thanks for the info! I've opened an issue at #121 |
This initially started out tackling #93 and ended up plumbing the
OsStr
typebasically everywhere! This knocked out a TODO or two along the way and should
remove some pesky unwraps and/or error handling around handling of arguments and
paths that may not be unicode.
Some notable other points are:
src/util.rs
to assist with handling os stringsall, but otherwise internally stores os strings for maximal compatibility
(more comments in the code).
implementation.
output_file
function was renamed tooutput_pretty
as it turns out it'snot actually the actual output file but rather just a pretty description of it
(for logging and such).
Closes #93