add a builtin for comptime source information #2029
Comments
This was previously rejected in #577, but you have made these new arguments:
I want to point out that a full stack trace does not need to be captured - in fact only a single The problem of debug info not being efficient is a problem with the status quo implementation in userland, but that's not a fundamental problem. Applications which want to resolve debug information during runtime can pay an upfront cost to prepare the debug info structures, and then during the main program execution, lookups by address can be fast. However I still think that a more ideal solution would be simply printing addresses, and having the log processing (which can happen on a different machine, with debug info present that is not necessarily present on the main machine) perform the address to source translations. As for optimizations, it's true that I'd like to push back on this. I think that a strategy with |
Ha, I failed to find this previous issue. I forgot about It could be that one's writing a program where they would rather pay for a constant string of every file in their executable, rather than forcing If we can guarantee this though, I agree with you on the rest. I'm just not 100% positive that will be sufficiently accurate and fast. |
This would be guaranteed to be correct. The call to log would always be a
Indeed it is possible for the optimizer to merge multiple source lines into the same machine code, so that it becomes ambiguous which "parent" source line would correspond to a given instruction. I had forgotten, but here's an example of when that happened to me at OkCupid. This is not just a problem with the technique described in this issue; it is a problem with the DWARF info emitted by gcc/clang. This problem exists in general for debugging using these debug symbols. Arguably the debug info should be able to represent when a single machine code address corresponds to multiple source locations. But even if that improvement were made, we're left facing the reality that Here is how I would like to move forward with this issue:
|
@Sahnvour here's your gist:
    const warn = @import("std").debug.warn;
    fn log(msg: []const u8) void {
        warn("{} {}\n", @returnAddress(), msg);
        unreachable;
    }
    fn foo(n: u32) u32 {
        @noInlineCall(log, "log from foo");
        return n * 2;
    }
    fn bar(n: u32) u32 {
        return @inlineCall(foo, n);
    }
    pub fn main() void {
        warn("{}\n", bar(13));
    }
output:
You're pointing out that according to the stack trace,
    const std = @import("std");
    const warn = std.debug.warn;
    fn log(msg: []const u8) void {
        const outstream = &(std.io.getStdErr() catch unreachable).outStream().stream;
        std.debug.printSourceAtAddress(std.debug.getSelfDebugInfo() catch unreachable, outstream, @returnAddress(), true) catch unreachable;
    }
    fn foo(n: u32) u32 {
        @noInlineCall(log, "log from foo");
        return n * 2;
    }
    fn bar(n: u32) u32 {
        return @inlineCall(foo, n);
    }
    pub fn main() void {
        warn("{}\n", bar(13));
    }
It says |
Well on my machine, here's the output with your modification:
If I force inline the call to |
I see, that's good to know. Regardless of the outcome of this issue, I think it will be worth having a chat with some LLVM folks and see if this is a fundamental limitation of the PDB format, or if LLVM's debug info code needs to be improved. If it is a fundamental limitation, or if improving LLVM's debug info is unlikely to happen, then that's pretty good reasoning for accepting this issue. |
This is on par with my experience debugging on Windows: WinDbg and the VS debugger are great, but sometimes they just can't tell you where the inlined instructions are coming from. |
A potential use for FILE/__dirname that I came across: If you have a comptime utility function that calls |
Here's a potentially significant downside to offering this: Any use of it makes function-level incremental compilation impossible. Consider this scenario: You have a 1000 line file with a bunch of log or panic calls that include line number information. You add a print call for debugging to a small function near the top of the file. But uh oh, now every single function in the rest of the file needs to be re-compiled instead of just the function you modified, because the source information they used changed. If Zig wants to eventually have function-level incremental recompilation, which IMO is the best promise for <100ms common case debugging recompiles on arbitrarily sized projects, it should try to avoid this problem. There's a separate but related problem where you do this within your compiler by entangling span information into your data structures. But that can be fixed with heavy refactoring, the actual behavioural changes can't be fixed as easily. See Rust encountering this problem: rust-lang/rust#47389 |
Thanks for bringing this up. That's a really good point and one that we hadn't considered before in this proposal. Ability to do function-level incremental compilation is definitely something that is planned and so this is a serious consideration. |
I should note that the proposed use case of accessing the current file path for relative compile-time file lookup should be fine from this perspective. Another thing that should be fine is accessing the current function name. The file+function+message should be sufficient to locate the source of, for example, log messages fairly easily. Your idea of using the return address and debug info instead is also a great idea. I think a log library that output like |
That's a serious concern, because I think source information is an important feature, but fast incremental compilation is even more so. Note that the proposal currently is, in my opinion, not final, regarding the usability issue described in my first comment. I thought before of doing something similar to what's described in the Rust issue, and maybe this can help achieve a better proposal. Semi-serious workaround: only count line numbers inside functions, 0 being associated with the line of the first token of the declaration. |
This might be obvious, but if we want to enable independent (-> incremental) compilation on a per-function basis, then we just need an intermediary "lookup table" step by function (/ other entry id). So Note that this effectively demotes |
Would it be reasonable to just have a built-in that would provide comptime builtin.TypeInfo on the symbol referenced by @returnAddress? |
Thank you all for the considerations here. I refreshed my memory today by reading everyone's inputs, and put some thought into this, and I believe we can do it.

Name

Since it will be required to call this function explicitly (unlike using the C preprocessor for example), it is important that it has a short name that can be repeated at many callsites without issue. So for that I propose

Return Type

Let's start with just this:

    const SourceLocation = struct {
        file: []const u8,
        fn_name: []const u8,
        line: u32,
        column: u32,
    };

Incremental Compilation

The key strategy here to make this work with incremental compilation is to make the return value of

Calls Outside Function Scope

Calling outside function scope will be compile errors.

    comptime {
        var x = @src(); // error: source information only available within function scope
    }

Other Proposals

With this implemented, #5547 follows, with |
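A minimal sketch of reading the proposed struct's fields, assuming a Zig version where the builtin shipped as `@src()` and returns `std.builtin.SourceLocation` with the four fields listed above. The helper name `here` is hypothetical and only exists for illustration.

```zig
const std = @import("std");

// Hypothetical helper: returns the location of the @src() call below.
// The call sits inside a function body, as the proposal requires.
fn here() std.builtin.SourceLocation {
    return @src();
}

pub fn main() void {
    const loc = here();
    // Print file, line, column and enclosing function name.
    std.debug.print("{s}:{d}:{d} in {s}\n", .{ loc.file, loc.line, loc.column, loc.fn_name });
}
```

Note that `here()` reports its own location, not its caller's, which is exactly the usability concern raised in the original issue text.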
@andrewrk what if it took an address? e.g. |
There are a lot of details and questions with that tiny code example. You're going to need to do a bunch more leg work of proposing & explaining |
@andrewrk I support your decision because I believe |
The former needs limited compiler support but requires debugging symbols and may not be always precise, and the latter has hidden behaviour. |
Maybe it should be a separate proposal, but it would be useful to be able to have a function name available at comptime, similar to how struct names are available at comptime.
    if (comptime !structHasTraits(T))
        @compileError("Missing traits for type "++@typeName(T));
But I can't do this:
    if (comptime !fnHasNeededParams(func))
        @compileError("Cannot generate for fn "++@fnName(func)++" ("++@typeName(@TypeOf(func))++")");
|
This seems reasonable, some things to consider when thinking about proposals:
|
We should probably define what |
I'll be sure to play around with |
One of the main motivating use cases for this language feature is tracing/profiling tools, which expect null-terminated strings for these values. Since the data is statically allocated, making them additionally null-terminated comes at no cost. This prevents the requirement of compile-time code to convert to null-termination, which could increase the compilation time of code with tracing enabled. See #2029
I think #5675 could be relevant to this proposal. Source location struct could look like this:
    const SourceLocation = struct {
        comptime file: []const u8,
        comptime fn_name: []const u8,
        line: u32,
        column: u32,
    };
|
* Indices of referenced captures
* Line and column of `@src()`

The second point aligns with a reversal of the "incremental compilation" section of ziglang#2029 (comment). This reversal was already done as ziglang#17688 (46a6d50), with the idea to push incremental compilation down the line. My proposal is to keep it as comptime-known, and simply re-analyze uses of `@src()` whenever their line/column change. I think this decision is reasonable for a few reasons:

* The Zig compiler is quite fast. Occasionally re-analyzing a few functions containing `@src()` calls is perfectly acceptable and won't noticeably impact update times.
* The system described by Andrew in ziglang#2029 is currently vaporware.
* The system described by Andrew in ziglang#2029 is non-trivial to implement. In particular, it requires some way to have backends update a single global in certain cases, without re-doing semantic analysis. There is no other part of incremental compilation which requires this.
* Having `@src().line` be comptime-known is useful. For instance, ziglang#17688 was justified by broken Tracy integration because the source line couldn't be comptime-known.
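To illustrate why a comptime-known `@src().line` matters for tracing integrations, here is a minimal sketch. `zoneLabel` is a hypothetical helper, not Tracy's API; it assumes a Zig version in which every field of `@src()` is comptime-known, so the value can be passed as a `comptime` parameter.

```zig
const std = @import("std");

// Hypothetical stand-in for a tracing zone constructor (not Tracy's real API).
// The label is built at compile time, which only works if src.line is
// comptime-known at the call site.
fn zoneLabel(comptime src: std.builtin.SourceLocation) []const u8 {
    return comptime std.fmt.comptimePrint("{s}:{d}", .{ src.fn_name, src.line });
}

pub fn main() void {
    const label = zoneLabel(@src());
    std.debug.print("{s}\n", .{label});
}
```

If the line were only runtime-known, the call `zoneLabel(@src())` would be rejected, because a `comptime` parameter needs a compile-time value.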
I want to propose a way to query the compiler in order to get information on the context of the compilation, at call site. In particular, this is mostly used to get source filename, line, function name.
Classical uses of this feature are logging, asserts, scoped variables (instrumentation, debugging...) and probably others I can't think of.
In C and C++ this is done through the `__FILE__` and `__LINE__` macros which are expanded by the preprocessor.
In Rust, there are `file!` and `line!` macros.
D also makes use of `__FILE__` and such, though I don't know if it's actually a preprocessor.
And so on.
At the moment, something similar could be achieved by capturing a stack trace and looking at debug info, but this would not be very efficient. It would also be easily disturbed by the optimizations performed.
Zig could provide a `@sourceInfo` builtin function, returning an instance along the lines of
Example usage:
Example output:
Now, this leaves a usability concern: the other languages cited as examples all provide a way for the library writer to get this information from client code. This means they can provide a `log(msg: string)` function or macro that transparently uses the source information.
In current Zig, the only way is to explicitly call `@sourceInfo()` and pass it to the function expecting it.
One solution is to provide a `@callerSourceInfo` to get the source information from the call site into the callee function. But it doesn't play nice with function pointers or extern functions.
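The explicit-passing approach described above can be sketched as follows. This uses `@src()`, the name the builtin was later given in this thread, in place of the proposed `@sourceInfo()`; the `log` helper and its signature are assumptions made for illustration only.

```zig
const std = @import("std");

// Hypothetical logging helper: the caller supplies its own location,
// since the callee has no way to obtain it implicitly.
fn log(src: std.builtin.SourceLocation, msg: []const u8) void {
    std.debug.print("[{s}:{d}] {s}: {s}\n", .{ src.file, src.line, src.fn_name, msg });
}

fn foo(n: u32) u32 {
    // The call site passes @src() explicitly.
    log(@src(), "log from foo");
    return n * 2;
}

pub fn main() void {
    std.debug.print("{d}\n", .{foo(13)});
}
```

The cost is one extra argument at every call site, which is why the thread stresses a short builtin name.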