Java process terminates saying it could not free a specific address #51
Comments
The crash happens in the local_snapshot_map::processed function in the include/core/core_globals.h header file. The erase call on the unprocessed map sometimes leads to an invalid pointer being freed. It looks to me as though 'from' and 'to' may have to be swapped in some cases to avoid the issue.
It doesn't look like What doesn't make sense to me is the if (ts >= later) test on line 234. That should almost certainly be if (ts <= later) or perhaps if (ts < later). But I don't see any reason why you should ever get an invalid iterator for the I can't actually debug this with my current setup. But if changing the successful termination test doesn't fix things, I'd put in some traces in the
AFAIK, the crash always occurs when the processed() function is called at line 243, so I am not sure the test at line 234 has any bearing on it. I agree that the iterators 'from' and 'to' are both valid in processed(). But what if unprocessed.erase() never manages to reach the 'to' iterator when starting from the 'from' iterator? It might run on to the end() iterator, and dereferencing that could be causing the bug. I think that is what happens in the cases where we get the error, because the change that @gkuppe tried seems to work. @gkuppe, can you please post the change that you tried in the processed() function to avoid this bug?
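To make the "erase never reaches 'to'" scenario concrete: calling std::map::erase(from, to) when 'to' is not reachable from 'from' is undefined behavior, which would be consistent with a bad free. Below is a minimal debugging sketch of a guard that checks reachability before erasing. The snapshot_map alias and the helper names reachable and checked_erase are hypothetical; the real key and value types in core_globals.h are not shown in this thread.

```cpp
#include <map>

// Hypothetical stand-in for the unprocessed map; the real types are unknown.
using snapshot_map = std::map<long, int>;

// Returns true if `to` can be reached by incrementing `from` (`to` may be
// end()). Linear in the map size, so intended for debugging builds only.
bool reachable(const snapshot_map& m,
               snapshot_map::const_iterator from,
               snapshot_map::const_iterator to) {
  for (auto it = from;; ++it) {
    if (it == to) return true;        // found the end of the range
    if (it == m.end()) return false;  // ran off the map without meeting `to`
  }
}

// Erases [from, to) only when the range is valid. Otherwise the map is left
// untouched and false is returned so the caller can log the bad range.
bool checked_erase(snapshot_map& m,
                   snapshot_map::const_iterator from,
                   snapshot_map::const_iterator to) {
  if (!reachable(m, from, to)) return false;
  m.erase(from, to);
  return true;
}
```

A trace at the point where checked_erase returns false would show which timestamps produced the inverted range, without hiding the underlying problem.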
First off, "avoiding" the bug is easy. Just have The test at line 234, on the other hand, will affect correctness. It's the only place where As for the scenario you describe, note that Which, come to think of it, is exactly what's happening! Since we're not correctly short-circuiting at line 234, we're missing the timestamp in the range, and
Looking at the code, there's one more change I'd like to make. These maps are thread-local, and the only thing that appears to put a To be safe, note(*_current_timestamp); If we've already seen it, the test at line 148 will short-circuit it with no work being done. Otherwise, we'll be sure we're up-to-date as of the call to
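The idea of calling note() defensively before a lookup can be sketched as follows. This is only an illustration under assumptions: timestamp_tracker, locate, and local_tracker are hypothetical names, the map types are invented, and the membership check merely stands in for whatever the real test at line 148 does.

```cpp
#include <map>

// Hypothetical tracker; names and types are illustrative, not the project's.
struct timestamp_tracker {
  std::map<long, bool> unprocessed;

  // Records ts if it has not been seen yet. A repeated call with the same
  // timestamp returns early, so calling it "to be safe" costs one lookup.
  void note(long ts) {
    if (unprocessed.count(ts) != 0) return;  // already seen: short-circuit
    unprocessed.emplace(ts, false);
  }

  // Notes ts first, then looks it up, so find() cannot return end() here.
  std::map<long, bool>::iterator locate(long ts) {
    note(ts);
    return unprocessed.find(ts);
  }
};

// One tracker per thread, mirroring the thread-local maps described above.
timestamp_tracker& local_tracker() {
  thread_local timestamp_tracker tracker;
  return tracker;
}
```

Because note() is idempotent in this sketch, the extra call is harmless when the timestamp is already present and closes the gap when it is not.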
So if I understand you correctly, you are saying to make the following two changes and test again:
Is that right?
Yeah. It's possible that it should be
OK. @gkuppe is going to test these two changes and then report back here. Thanks.
OK, I'm testing. I had to move some extern declarations further up so that the "_current_timestamp" variable is recognized.
That "workaround" will certainly make this crash disappear, but it does so by masking the real problem and erasing the wrong elements from the map. I would recommend removing it. The fix I recommended should (I hope) solve the problem, but if it doesn't, the workaround will prevent you from seeing it.
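To illustrate why swapping the iterators can hide a problem, here is a small sketch assuming the workaround resembles the 'from'/'to' swap suggested earlier in the thread (the actual patch is not shown here). The function keys_after_swapped_erase and the int-keyed map are hypothetical.

```cpp
#include <map>
#include <utility>
#include <vector>

// Erases the half-open range between the entries for keys a and b, swapping
// the iterators when a > b (the workaround style), and returns the keys that
// remain. Assumes both keys are present in the map.
std::vector<int> keys_after_swapped_erase(std::map<int, int> m, int a, int b) {
  auto from = m.find(a);
  auto to = m.find(b);
  if (a > b) std::swap(from, to);  // makes the range valid, but changes it
  m.erase(from, to);
  std::vector<int> keys;
  for (const auto& kv : m) keys.push_back(kv.first);
  return keys;
}
```

With keys 1 through 5 and a call using a = 4 and b = 2, the swap erases the entries for 2 and 3 and keeps the entry for 4, the original 'from'. The call no longer crashes, but whether those are the entries the caller meant to remove depends on logic the swap never examines.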
The problem still reproduces with the proposed fix. Here is the patch I used:
When running Java with the MDS framework for a long period, the Java process terminates with the following message on the stdout of the console that launched the application:
*** Error in `/usr/bin/java': free(): invalid pointer: 0x00007fe6580016c0 ***
There is no log message about this problem.
After further investigation with gdb, we noticed that free is called from the GC code in MDS.