-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix error source; free leaked allocations; fix refcounting; BSTR is wide, not UTF-8; convert return_hr to expression #10
Conversation
b7e97cc
to
bfc8b95
Compare
In case we want to walk the .source() trace and print all inner exceptions (when these are usually rewrapped in thiserror types in higher layers), this exception needs to be marked as source to be returned from that function.
bfc8b95
to
a3f3afb
Compare
src/intellisense/wrapper.rs
Outdated
for i in 0..result_length { | ||
// get_children allocates a buffer to pass the result in. Vec will | ||
// free it on Drop. | ||
Vec::from_raw_parts(result, result_length as usize, result_length as usize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this may be unsafe because result
is from a different allocator then what Vec::drop
will call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more inconvenient than thought because the tests use CoTaskMemFree
, but the GetChildren
function allocating these seem to override the thread allocator:
I don't think this overrides what allocator is used by CoTaskMemFree
, this is merely a concept in DXC?
So, while it is possible to allocate things on a different IMalloc
(we have a SetMalloc
on IDxcLibrary
) I don't think that's used for this particular list (CoTaskMemAlloc
), and should hence be safely freed using CoTaskMemFree
on Windows and free
everywhere else?
Perhaps we should open up an issue in DXC to clarify this interface, and provide a convenient function to do the cleanup for us in their OS-dependent way, as we can't be guaranteed this never changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Windows CI didn't fail meaning I didn't mistype pub use winapi::um::combaseapi::CoTaskMemFree;
nor forgot the feature flag, but I have not tested actually running this code on Windows yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think CoTaskMemFree
here is fine because the DXC unit-tests also call CoTaskMemFree
here.
c2098e8
to
3aabfc6
Compare
Solves the following Valgrind trace: ==168242== 24 bytes in 1 blocks are definitely lost in loss record 2 of 14 ==168242== at 0x483A77F: malloc (vg_replace_malloc.c:307) ==168242== by 0x5AFC3F4: ??? ==168242== by 0x5AFCFB3: ??? ==168242== by 0x12BD23: hassle_rs::intellisense::ffi::IDxcCursor::get_children (macros.rs:108) ==168242== by 0x1167E6: hassle_rs::intellisense::wrapper::DxcCursor::get_all_children (wrapper.rs:210) ==168242== by 0x111B8A: intellisense_tu::main (intellisense-tu.rs:43) ==168242== by 0x112F5A: core::ops::function::FnOnce::call_once (function.rs:227) ==168242== by 0x1132FD: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:137) ==168242== by 0x113510: std::rt::lang_start::{{closure}} (rt.rs:66) ==168242== by 0x14F810: call_once<(),Fn<()>> (function.rs:259) ==168242== by 0x14F810: do_call<&Fn<()>,i32> (panicking.rs:373) ==168242== by 0x14F810: try<i32,&Fn<()>> (panicking.rs:337) ==168242== by 0x14F810: catch_unwind<&Fn<()>,i32> (panic.rs:379) ==168242== by 0x14F810: std::rt::lang_start_internal (rt.rs:51) ==168242== by 0x1134E6: std::rt::lang_start (rt.rs:65)
Make return_hr more versatile by providing Ok or Err as expession instead of forcing a return out of the current function. This follows Rusts convention to provide a block result as an unterminated expression on the last line, and allows it to be used in an expression context including with the question mark operator.
On Windows the right API needs to be used to free this pointer in the same way as it was allocated by DXC (using CoTaskMemAlloc). On non-Windows platforms this call is [delegated] to `free` from libc. [delegated]: https://github.com/microsoft/DirectXShaderCompiler/blob/a8d9780046cb64a1cea842fa6fc28a250e3e2c09/include/dxc/Support/WinAdapter.h#L46
The same call to SysFreeString happens in the Windows counterpart. Solves the following leaks found by valgrind: ==213075== 40 bytes in 1 blocks are definitely lost in loss record 4 of 13 ==213075== at 0x483A77F: malloc (vg_replace_malloc.c:307) ==213075== by 0x5B03395: ??? ==213075== by 0x5AFC77E: ??? ==213075== by 0x12C766: hassle_rs::intellisense::ffi::IDxcCursor::get_display_name (macros.rs:108) ==213075== by 0x116373: hassle_rs::intellisense::wrapper::DxcCursor::get_display_name (wrapper.rs:236) ==213075== by 0x11286D: intellisense_tu::main (intellisense-tu.rs:33) ==213075== by 0x11371A: core::ops::function::FnOnce::call_once (function.rs:227) ==213075== by 0x11348D: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:137) ==213075== by 0x111E10: std::rt::lang_start::{{closure}} (rt.rs:66) ==213075== by 0x150A20: call_once<(),Fn<()>> (function.rs:259) ==213075== by 0x150A20: do_call<&Fn<()>,i32> (panicking.rs:373) ==213075== by 0x150A20: try<i32,&Fn<()>> (panicking.rs:337) ==213075== by 0x150A20: catch_unwind<&Fn<()>,i32> (panic.rs:379) ==213075== by 0x150A20: std::rt::lang_start_internal (rt.rs:51) ==213075== by 0x111DE6: std::rt::lang_start (rt.rs:65) ==213075== ==213075== 56 (16 direct, 40 indirect) bytes in 1 blocks are definitely lost in loss record 5 of 13 ==213075== at 0x483ADEF: operator new(unsigned long) (vg_replace_malloc.c:342) ==213075== by 0x4FD058D: ??? ==213075== by 0x4FC04D9: ??? ==213075== by 0x40112DD: call_init.part.0 (in /usr/lib/ld-2.32.so) ==213075== by 0x40113C7: _dl_init (in /usr/lib/ld-2.32.so) ==213075== by 0x4A100E4: _dl_catch_exception (in /usr/lib/libc-2.32.so) ==213075== by 0x4015704: dl_open_worker (in /usr/lib/ld-2.32.so) ==213075== by 0x4A10087: _dl_catch_exception (in /usr/lib/libc-2.32.so) ==213075== by 0x4014F3D: _dl_open (in /usr/lib/ld-2.32.so) ==213075== by 0x489434B: ??? (in /usr/lib/libdl-2.32.so) ==213075== by 0x4A10087: _dl_catch_exception (in /usr/lib/libc-2.32.so) ==213075== by 0x4A10152: _dl_catch_error (in /usr/lib/libc-2.32.so) ==213075== ==213075== 124 bytes in 3 blocks are definitely lost in loss record 9 of 13 ==213075== at 0x483A77F: malloc (vg_replace_malloc.c:307) ==213075== by 0x5B03395: ??? ==213075== by 0x5AFC77E: ??? ==213075== by 0x12C766: hassle_rs::intellisense::ffi::IDxcCursor::get_display_name (macros.rs:108) ==213075== by 0x116373: hassle_rs::intellisense::wrapper::DxcCursor::get_display_name (wrapper.rs:236) ==213075== by 0x112F63: intellisense_tu::main (intellisense-tu.rs:52) ==213075== by 0x11371A: core::ops::function::FnOnce::call_once (function.rs:227) ==213075== by 0x11348D: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:137) ==213075== by 0x111E10: std::rt::lang_start::{{closure}} (rt.rs:66) ==213075== by 0x150A20: call_once<(),Fn<()>> (function.rs:259) ==213075== by 0x150A20: do_call<&Fn<()>,i32> (panicking.rs:373) ==213075== by 0x150A20: try<i32,&Fn<()>> (panicking.rs:337) ==213075== by 0x150A20: catch_unwind<&Fn<()>,i32> (panic.rs:379) ==213075== by 0x150A20: std::rt::lang_start_internal (rt.rs:51) ==213075== by 0x111DE6: std::rt::lang_start (rt.rs:65) Fixes: 94fbad7 ("Support libdxcompiler.so on Linux (#5)")
3aabfc6
to
f1ad454
Compare
Valgrind shows that we are leaking the DxcIncludeHandlerWrapper allocation: ==96758== 485 (80 direct, 405 indirect) bytes in 1 blocks are definitely lost in loss record 13 of 17 ==96758== at 0x483A77F: malloc (vg_replace_malloc.c:307) ==96758== by 0x1A1E4B: alloc::alloc::alloc (alloc.rs:74) ==96758== by 0x1A1F09: alloc::alloc::Global::alloc_impl (alloc.rs:153) ==96758== by 0x1A2069: <alloc::alloc::Global as core::alloc::AllocRef>::alloc (alloc.rs:203) ==96758== by 0x1A1D9A: alloc::alloc::exchange_malloc (alloc.rs:281) ==96758== by 0x199A1A: new<hassle_rs::wrapper::DxcIncludeHandlerWrapper> (boxed.rs:175) ==96758== by 0x199A1A: hassle_rs::wrapper::DxcCompiler::prep_include_handler (wrapper.rs:248) ==96758== by 0x199D6F: hassle_rs::wrapper::DxcCompiler::compile (wrapper.rs:278) ==96758== by 0x195C70: hassle_rs::utils::compile_hlsl (utils.rs:115) ==96758== by 0x127BDF: include::main (autogen_ops.rs:0) ==96758== by 0x1275CA: core::ops::function::FnOnce::call_once (autogen_context.rs:1683) ==96758== by 0x127ADD: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:137) ==96758== by 0x127530: std::rt::lang_start::{{closure}} (rt.rs:66) This happens because of Box::into_raw, which moves the pointer out of the box. Taking this pointer out of a referenced Box makes it live on till the end of the function where it is appropriately dropped. Properly freeing DxcIncludeHandlerWrapper exposes a second issue: The blobs are now freed but have already been freed within DXC becaues of a missing refcount: we are supposed to increment it before writing the pointer to it to `include_source` [1]. Now that the blobs have a proper refcount and live on until DXC cleans them up when not needed, they do not need to be kept alive inside DxcIncludeHandlerWrapper anymore. Finally, clean up the unnecessary clone of `source`: This String is already retrieved by value and can be moved straight into the Rc. [1]: https://github.com/microsoft/DirectXShaderCompiler/blob/a8d9780046cb64a1cea842fa6fc28a250e3e2c09/tools/clang/unittests/dxc_batch/dxc_batch.cpp#L553-L554
This turned out to be a deeper rabbit hole than anticipated. A couple mistakes were made in the Linux conversion for DXC doesn't implement their Finally leaks and use-after-free showed up around |
|
||
if let Some(source) = source { | ||
let pinned_source = Rc::new(source.clone()); | ||
let pinned_source = Rc::new(source); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this where we should use a Pin<Rc<String>>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably yeah, I'm not sure 😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this makes me curious how it works. The documentation really only specifies self-referential structs and moving of that object, but that is of no relevance here: even if the String
moves, the underlying slice on the heap is unlikely to move.
However, Pin
works on pointer types. String
is a "pointer" of type &str
to some owned data, and this is clear when interacting with a Pin<String>
: it will only ever give the pointer type &str
back meaning we cannot do any disastrous modifications like .reserve
(which usually reallocates) that are only available on the String
type (whenever we have mutable access to it). In other words, it appears Pin
takes over the ownership for us and only ever allows us to retrieve the value of the underlying pointer.
Note that I should have been more rigorous in cleaning the Rc
up too: this value is not ever borrowed more than once and hence has no effect (except that the value is technically borrowed a second time inside IDxcBlobEncoding
, but Rust does not know that and cannot model it as the object exposing that interface is owned by DXC).
We [recently] fixed memory leaks when calling into the intellisense part of DXC and made the decision shortly after merging to [implement null-invariant `BSTR`s properly] in DXC's shimmed type for non-Windows, which was promptly accepted. Currently our implementation of `SysFreeString` together with that PR segfaults because it frees the pointer at offset `+4` of the allocation. That is corrected together with `SysStringLen` now reading the size of the string from this prefix instead of looking for a (wrong!) null character. At least our Windows and Linux implementation `utils.rs` is the same again. [recently]: #10 [implement null-invariant `BSTR`s properly]: microsoft/DirectXShaderCompiler#3250
We [recently] fixed memory leaks when calling into the intellisense part of DXC and made the decision shortly after merging to [implement null-invariant `BSTR`s properly] in DXC's shimmed type for non-Windows, which was promptly accepted. Currently our implementation of `SysFreeString` together with that PR segfaults because it frees the pointer at offset `+4` of the allocation. That is corrected together with `SysStringLen` now reading the size of the string from this prefix instead of looking for a (wrong!) null termination character. At least our Windows and Linux implementation `utils.rs` is the same again. [recently]: #10 [implement null-invariant `BSTR`s properly]: microsoft/DirectXShaderCompiler#3250
We [recently] fixed memory leaks when calling into the intellisense part of DXC and made the decision shortly after merging to [implement null-invariant `BSTR`s properly] in DXC's shimmed type for non-Windows, which was promptly accepted. Currently our implementation of `SysFreeString` together with that PR segfaults because it frees the pointer at offset `+4` of the allocation. That is corrected together with `SysStringLen` now reading the size of the string from this prefix instead of looking for a (wrong!) null termination character. At least our Windows and Linux implementation `utils.rs` is the same again. [recently]: #10 [implement null-invariant `BSTR`s properly]: microsoft/DirectXShaderCompiler#3250
…ize (#11) We [recently] fixed memory leaks when calling into the intellisense part of DXC and made the decision shortly after merging to [implement null-invariant `BSTR`s properly] in DXC's shimmed type for non-Windows, which was promptly accepted. Currently our implementation of `SysFreeString` together with that PR segfaults because it frees the pointer at offset `+4` of the allocation. That is corrected together with `SysStringLen` now reading the size of the string from this prefix instead of looking for a (wrong!) null termination character. At least our Windows and Linux implementation `utils.rs` is the same again. [recently]: #10 [implement null-invariant `BSTR`s properly]: microsoft/DirectXShaderCompiler#3250
After a look at current code with @DBouma we decided that taking references to ComPtrs is ever so slightly more natural. More importantly DXC returns objects with their refcount already incremented (detached from an internal ComPtr by taking the pointer out without decreasing the refcount). To ensure adequate cleanup it is convenient if these pointers can't be left lingering around by accident.
Simplify conversion loops and in particular pointer magic in
get_children
to shove a raw pointer into aComPtr
, which com_rs was not designed for. The returned pointer