-
Notifications
You must be signed in to change notification settings - Fork 847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FFI: ArrowArray::try_from_raw shouldn't clone #1425
Comments
Simply said, I don't think this is correct. That is the whole point of C data interface. The release callback is designed to be called by not only the one allocates the C data interface structure. |
Let me put in another way. Let's say in Java you create an ArrowArray and import in this lib using try_from_raw API, and this API clone the ArrowArray, which means there will be two struct point to the same data, one is in Jave another is in rust. Then
I bet currently in your Java program there is memory leak because of these. I think the way to fix your issue is to change the Java API design, to let other language like rust to pass in a FFI_ArrowArray::empty() struct pointer, and populate data into this struct, and let rust handle the memory itself. That's why I say "the one who allocate it should free it" |
fwiw, that is also what we do in arrow2 - we never take ownership of the pointers passed to our exporter; we just write to them (as they may have been allocated by someone else), see e.g. the example in https://jorgecarleitao.github.io/arrow2/ffi.html We did file a rust advisory on this last week, https://rustsec.org/advisories/RUSTSEC-2022-0012.html, as we incorrectly derived |
I'm not sure if you really use this FFI between Java and Rust. We have real use case to use it in an internal project. We found this issue and the fix works very well after upgrading to latest version. And, no, I don't detect memory leak there. Let me explain more. First, of course if you free the pointer in Java, you will release the data and Rust cannot access it anymore. If you really want to free it, make sure to clear up the release field in the struct. This is easy to do with Java FFI API. In our case, the struct in Java won't be released until the arrays are exhausted by Rust. So we don't even need to do that. In contrast, I don't know what you describe will work better. Let's say we don't clone it, then in Java and Rust, there is only one struct. So, if you free the pointer in Java, it still releases the data, isn't? The only possible is to check the release field of the struct to know if the struct is released or not by Rust. You even cannot mark it released. I don't see there is any better point. The only missing piece, here, is that we need to mark the source struct released in Rust side. This is required by "Moving an array" which talks about the case to clone the struct. I can do it later.
Although I think this will work but I don't think it is a good idea. By doing that, it means to add an additional JNI call to Rust side to get the empty struct pointer. We're likely to have less JNI call as possible. I agree that the struct should be released by who allocates it. Actually you cannot free Java struct in Rust, and vice versa. I don't think we did this or intend to do it. |
I get your point. You are calling rust from Java, so it's right you alloc the memory in Java. The problem is rust assume this pointer is from Arc::into_raw, but acctually it's not it's just a raw pointer from JNI. In this case, our solution should be add a new API like: /// creates a new [ArrowArray] from two pointers. Used to import from the C Data Interface.
/// # Safety
/// See safety of [ArrowArray]
/// # Error
/// Errors if any of the pointers is null
pub unsafe fn try_from_c(
array: *const FFI_ArrowArray,
schema: *const FFI_ArrowSchema,
) -> Result<Self> {
if array.is_null() || schema.is_null() {
return Err(ArrowError::MemoryError(
"At least one of the pointers passed to `try_from_raw` is null"
.to_string(),
));
};
Ok(Self {
array: Arc::new(*array), // notice we don't need to clone here, if will also solve your problem
schema: Arc::new(*schema),
})
} The idea is we don't need to Clone, why bother? As in your Java program you need to hold the pointer as long as you need it, and can only free it when the job is done. There are several drawbacks in the current solution:
About the link you mentioned Moving an array, please noted that it talks about consumer which is rust in this case, need to call release_call if it tries to move the data, seems have nothing to do with our discuess here.
In here you are talking about the producer side, but actually you can't just clear up the release field and forget about it. What if rust have any issue to release the data, Java as the producer, as the one alloc the memory, should still be responsible to release it. Again, "the one who allocate it should free it". |
The issue is our current FFI API design assume the pointer always comes from Arc::into_raw(), we should solve this problem, but Clone is not the solution. |
Are you sure it works?
|
I don't know how do you use the API. If the memory leak is happened because you need to hold the pointer in Java side, I think we can mark the source structs released after cloning it. So you can free the struct in Java. The issue you demo is also resolvable by marking the source struct released. |
How you run the code? From my side: ✘ 🍻 ~/github/arrow-rs/arrow flight-sql-client ± cargo test -- --nocapture test_clone_array_segfault_on_drop
Compiling arrow v10.0.0 (/Users/wangfenjin/github/arrow-rs/arrow)
Finished test [unoptimized + debuginfo] target(s) in 41.79s
Running unittests (/Users/wangfenjin/github/arrow-rs/target/debug/deps/arrow-01a41b09de9e05da)
running 1 test
error: test failed, to rerun pass '--lib'
Caused by:
process didn't exit successfully: `/Users/wangfenjin/github/arrow-rs/target/debug/deps/arrow-01a41b09de9e05da --nocapture test_clone_array_segfault_on_drop` (signal: 11, SIGSEGV: invalid memory reference) |
It should work otherwise the change you make should also break. they both do let ffi_array = (*array).clone(); |
I mean the solution you suggested:
|
Oh I haven't test the suggested solution yet, if it doesn't work maybe we can make it Box<> as the arrow2 API? I'm not so familiar with this |
No...It is basically about Rust ownership rule. In short,
|
Yeah, if you or others definitely insist we should not do
I don't agree with you here too. It is obviously talking out you can copy (i.e., clone) these structs, only if you mark source structs as released (I agree this is missed now). |
So, from my side, the previous behavior is definitely wrong (assuming the passed in raw pointers are always Arc pointers). I can think of the options:
WDYT? |
I think Box might be a better solution as we can avoid clone. If you don't have time I can also help on this issue. Thanks |
Okay, I can work on this in the weekend. See if Thanks for the discussion. |
I haven't followed all the discussion on this ticket, but it seems like it is well in hand. Let me know if I can do anything else to help |
Thanks @alamb |
Guys, not sure if my understanding is right, but I think this commit will break the design and create memory leak.
If we clone the FFI struct, then it means we need to free the pointer by ourself, but if we free FFI_ArrowArray, then the data in this Array will also be free? Which means we can't free the pointer(until the data are used and ready to free, but in reality we can't hold this useless pointer in a big project for such a long time), which create memory leak.
As to the question @viirya raised in #1333 , when manage memory, the one who allocate it should free it, which means in our case, we need to alloc the struct in rust and pass the pointer to java and then also free the memory in rust.
I suggest we revert this commit. cc @alamb @sunchao
Originally posted by @wangfenjin in #1334 (comment)
The text was updated successfully, but these errors were encountered: