RFC: proc macro include!
#3200
we'd likely also want a way to list files in a directory, though that may be more difficult to integrate into build systems
we may want to specify that a non-existent file/directory produce a
Co-authored-by: Jacob Lifshay <[email protected]>
It would be useful to somehow allow
I agree that this is desirable. As such, I'm torn on whether This RFC I'd like to keep focused on the "read file via the build system" architecture, so introducing split spans architecture would (imo) overextend the RFC. Perhaps the best short term approach is to just drop
Two points:
How about:
trait BytesBuf: AsRef<[u8]> {
    // may be expensive
    fn into_vec(self: Box<Self>) -> Vec<u8>;
}
fn include_bytes<P: AsRef<str>>(path: P) -> Result<Box<dyn BytesBuf>, std::io::Error>;
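To make the trade-off in the suggested `BytesBuf` trait concrete, here is a minimal sketch of two implementations. The trait is copied from the comment above; `StaticBytes` and `byte_len` are hypothetical names used only for illustration and are not part of the RFC. The point is that `into_vec` is free when the buffer already owns a `Vec<u8>`, and costs a copy otherwise.

```rust
/// Trait as suggested in the comment, repeated so the sketch compiles on its own.
trait BytesBuf: AsRef<[u8]> {
    // may be expensive
    fn into_vec(self: Box<Self>) -> Vec<u8>;
}

/// Cheap case: the buffer already is a `Vec<u8>`, so it is just moved out of the box.
impl BytesBuf for Vec<u8> {
    fn into_vec(self: Box<Self>) -> Vec<u8> {
        *self
    }
}

/// Hypothetical buffer that only borrows its bytes (for example, memory the
/// compiler keeps alive). Converting it to a `Vec<u8>` has to copy.
struct StaticBytes(&'static [u8]);

impl AsRef<[u8]> for StaticBytes {
    fn as_ref(&self) -> &[u8] {
        self.0
    }
}

impl BytesBuf for StaticBytes {
    fn into_vec(self: Box<Self>) -> Vec<u8> {
        self.0.to_vec()
    }
}

/// Read-only access goes through `AsRef<[u8]>` and never copies.
fn byte_len(buf: &dyn BytesBuf) -> usize {
    AsRef::<[u8]>::as_ref(buf).len()
}
```

A caller that only inspects the bytes would use `AsRef<[u8]>`, and only a caller that needs ownership would pay for `into_vec`.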
I would really like for the proc macro version of
This would open tonnes of doors and allow extending the rust language to work with things like single-file components in frontend frameworks written in rust.
I've finally gotten around to updating the RFC text for the comments here. Changelog:
///
/// NOTE: some errors may cause panics instead of returning `io::Error`.
/// We reserve the right to change these errors into `io::Error`s later.
fn include_bytes<P: AsRef<str>>(path: P) -> Result<Vec<u8>, std::io::Error>;
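As a rough illustration of how a macro author might consume this signature, the sketch below uses a hypothetical stand-in, `include_bytes_stub`, with the same shape as the proposed function. It is an assumption for demonstration only: it reads through `std::fs`, so the build system learns nothing about the file, which is exactly the gap the RFC aims to close. `describe_include_error` is likewise a made-up helper showing one way to turn the `io::Error` into a message.

```rust
use std::io;

/// Hypothetical stand-in with the same signature as the proposed
/// `proc_macro::include_bytes`. Unlike the proposal, this does NOT
/// register the file with the build system.
fn include_bytes_stub<P: AsRef<str>>(path: P) -> Result<Vec<u8>, io::Error> {
    std::fs::read(path.as_ref())
}

/// Turn a failed read into text that a macro could report as a compile error.
fn describe_include_error(path: &str, err: &io::Error) -> String {
    match err.kind() {
        io::ErrorKind::NotFound => format!("couldn't find `{path}` to include"),
        _ => format!("couldn't read `{path}`: {err}"),
    }
}
```

Since the doc comment reserves the right to turn panics into `io::Error`s later, code written against this shape should probably not assume that every failure arrives as an `Err`.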
Would it make sense for `include_bytes` to return `Literal` as well, or would that not be possible?
I think it should work because `Literal` can be a byte string.
Actually, yeah, I overlooked that possibility.
The main limitation is that the only current interface for getting the contents out of a `Literal` is to `ToString` it. `syn` does have a `.value()` for `LitByteStr` as well as `LitStr`, though, so I guess it's workable.
It's probably not good to short term require debug escaping a binary file to reparse the byte string literal if a proc macro is going to post process the file... but if it's just including the literal, it can put the `Literal` in the token stream, and we can offer ways to extract (byte) string literals without printing the string literal in the future.
The one limitation which needs to be solved is how do spans work. Do we just say that the byte string literal contains the raw bytes of the file (even though that would be illegal in a normal byte string, and invalid UTF-8), maybe as a new "kind" of byte string, so span offsets are mapped directly with the source file? Or are there multiple span positions (representing a `\xNN` in the byte string) which map to a single byte in the source file?
Hmm, what bytes are not allowed in byte string literals? Does the literal itself have to be valid UTF-8?
A Rust source file must be valid UTF-8. Thus, the contents of a byte string literal in the source must be valid UTF-8.
Bytes that are not < 0x80 thus must be escaped to appear in a byte string literal.
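The span question raised above follows directly from this escaping rule: once a binary file is rendered as a byte string literal, one file byte can occupy several characters of literal text. The sketch below is only an illustration of that mapping, not anything the RFC specifies. `escape_with_offsets` is a hypothetical helper built on the standard library's `std::ascii::escape_default`.

```rust
use std::ops::Range;

/// Render `bytes` as the source text of a byte string literal and record,
/// for each input byte, which range of the literal text represents it.
fn escape_with_offsets(bytes: &[u8]) -> (String, Vec<Range<usize>>) {
    let mut text = String::from("b\"");
    let mut ranges = Vec::with_capacity(bytes.len());
    for &byte in bytes {
        let start = text.len();
        // `escape_default` passes printable ASCII through and escapes
        // everything else, using `\xNN` for bytes such as 0xFF.
        text.extend(std::ascii::escape_default(byte).map(char::from));
        ranges.push(start..text.len());
    }
    text.push('"');
    (text, ranges)
}
```

For the input `[b'A', 0xff, b'\n']` the literal text is `b"A\xff\n"`, and the second file byte maps to four characters of it, so offsets in the literal no longer line up with offsets in the file.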
And then another question that's worth making explicit: what does it even mean for rustc to report a span into a binary file?
I think binary includes are better served by a different API that lets rustc point into generated code, rather than trying to point into an opaque binary file.
Does this allow you to split the Edit: nvm I see there is a
- That which `include!` is relative to in the source file expanding the macro.
- That which `fs` is relative to in the proc macro execution.

Both have their merits and drawbacks.
One way to support both options would be to take a `Span` that the path is relative to. Then it would make multi-level includes easier (the macro includes a path relative to the Rust source file, then the included file references another relative file so that needs to be included based on the `Span` from the first `proc_macro::include_str` call).
What would `Span::mixed_site` be relative to?
Also, that would kinda soft-block the feature on `Span::def_site`, while the RFC is currently written such that additional unstable features (such as span subslicing) are incremental improvements not required for the functionality to be useful... though I suppose requiring a span would be strictly more powerful than `include!`-style base path, so that fits into the same category.
I suppose it should just behave the exact same as a `include_str!("..")` macro invocation whose tokens carry a `mixed_site` span.
macro_rules! x {
    () => {
        include_str!("a")
    };
}
Somewhat surprisingly, this looks for a file called `"a"` relative to the file in which `x!()` is invoked, not relative to the file that contains the definition above.
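The difference between the two candidate base paths can be sketched as plain path arithmetic. This is an illustrative assumption about resolution only, not compiler code: `resolve_relative_to_file` is a hypothetical helper, and the file names used with it are made up. It joins the relative path onto the directory of whichever source file is chosen as the base (invocation site or definition site).

```rust
use std::path::{Path, PathBuf};

/// Resolve `relative` against the directory that contains `source_file`,
/// mirroring the "relative to the current file" rule described above.
fn resolve_relative_to_file(source_file: &Path, relative: &str) -> PathBuf {
    match source_file.parent() {
        Some(dir) => dir.join(relative),
        // No parent component at all: fall back to the path as given.
        None => PathBuf::from(relative),
    }
}
```

With a macro defined in `src/macros.rs` and invoked from `src/ui/mod.rs`, choosing the invocation file as the base yields `src/ui/a`, while choosing the definition file yields `src/a`. A `Span`-relative API, as suggested earlier in the thread, would let the macro author pick between them.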
how is this not already possible with
listing files is already possible using
Additionally the
(This would also likely be blocked in unstable limbo by the same concerns as the tracked path interface.)
FWIW, the
The "perfect" solution (with respect to tracking only) is to use a WASI target or similar in order to instrument all environment access, such that it can be transparently instrumented, sandboxed, and whatever else the compiler sees as reasonable. For what this RFC is directly trying to address (spanned manipulation of newly accessed files), though, this API surface is still required even with perfect instrumentation of environment access.
Allow `include!` to be implemented in proc macros, by adding a `proc_macro` API to read files as `Vec<u8>`, `String`, or `TokenStream`. If the file is read as `TokenStream`, it is given `Span`s appropriate for diagnostics to point into the read file. In all cases, the build system knows which file(s) have been read, and can cache results / rerun the macro as desired.
Rendered