background handler terminated by 0xdead10cc sometimes #1202

r10s · 2021-05-11T08:41:04Z

getting "Namespace RUNNINGBOARD, Code 0xdead10cc" crashes from time to time, i cannot reproduce that, but it was reported by @dignifiedquire; some logs are in App Store Connect, excerpt

Exception Type:  EXC_CRASH (SIGKILL)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Exception Note:  EXC_CORPSE_NOTIFY
Termination Reason: Namespace RUNNINGBOARD, Code 0xdead10cc
Triggered by Thread:  0

full log - so, there seems sth. wrong with background thread handling, searching for "Namespace RUNNINGBOARD, Code 0xdead10cc" brings up some advices, need to investigate further.

EDIT: to sum up, the issue with 0xdead10cc is that a lock is held on some files (probably sql-database) after the app ~~gets~~ signals to be suspended. reason might be dispatched jobs that are still running for a little moment, we tried to mitigate that by recent commits and to exclude this cause. however, might also be that "shared" usage of the database is the cause, that would be very hard to fix, see discussions wrt share-to-delta.

stats of recent versions:

compared with older stats, we had 700 installations and only 1 crash/7 days (at least after removing sqlx)

ftr, 0xdead10cc was also topic in #1057

The text was updated successfully, but these errors were encountered:

cyBerta · 2021-05-11T13:13:01Z

we should check the diff between 1.19.0 and 1.19.1

r10s · 2021-05-11T16:02:17Z

this is the diff between 1.19.0 and 1.19.1:
v1.19.0...v1.19.1
not sure, if the can be found in these few diff-lines. might be, the issue was also be there before and became somehow more visible. but the "fix a rare crash in chatlist" comment in changelog is suspicious at least 🧐

cyBerta · 2021-05-11T19:00:23Z

@r10s does maybeNetwork() v1.19.0...v1.19.1#diff-fc4377fbb4c60bfc9a11f53fd448de91bbf34787141509ddb85242515fea9e23R378 contain any database action?

There are two nice articles describing similar errors to what we see:
https://inessential.com/2020/02/13/how_we_fixed_the_dreaded_0xdead10cc_cras
https://blog.iconfactory.com/2019/08/the-curious-case-of-the-core-data-crash/

Somewhere we're keeping a file handle, probably the database while the app is about to be suspended.

Due to incorrectly placed task markers, the background task would be marked done before it had finished saving the database. When things happened in the wrong sequence, iOS would unceremoniously kill the process — making it look like it had crashed somewhere inside of Core Data.

This quote describes what might go wrong in our app, too:

we call maybe network from a background task and mark the background task some steps afterwards as completed: completionHandler(.newData). The async block is still executed in the meanwhile and iOS raises this error while suspending the app.

r10s · 2021-05-11T19:19:32Z

does maybeNetwork() contain any database action?

not sure, at a first glance, i did not find something, but might be, maybe some config or so. and even if not directly - it may still keep the handle open, if it takes longer for whatever reason.

maybe @link2xt knows better.

link2xt · 2021-05-11T20:15:06Z

dc_maybe_network returns immediately, you can verify it with repl, and I have also tested with kdeltachat. You can also add time measurements to iOS logs and verify that it is true for iOS.

@r10s said on irc:

we fixed a lot of anr by scheduling maybe_network in a bg thread

Not sure why this happens, all dc_maybe_network does is a try_send on the "interrupt" channels. If the channel is full, the operation should fail immediately.

From reading the links above, I get that the problem is unclosed database "connections" (file descriptors in case of SQLite) at the time of suspension. I see that you are trying to stop working by calling dc_stop_io before the deadline. But it does not close SQL connections. Core only closes connections during import/export and when the context is dropped (by calling dc_context_unref). dc_stop_io only prevents future network operations, but you can still use the database: read contact list, chatlist, messages, create new message sending jobs and so on. If you really need to close SQLite file descriptors, you need a separate call for this in the API, let's call it dc_context_suspend, followed by dc_context_unsuspend to reopen the database.

r10s · 2021-05-11T20:38:32Z

thanks for the insights, @link2xt !

@cyBerta we are sharing the database between extensions, or? we did not really have a choice iirc, however, is is trally what is discouraged in the links above? i thought about that always more as a "lib" doing core stuff, that is used by main-app and needed extensions.

also, we did not always get the error - or maybe it was just covered by other ones?

EDIT: i checked some older crash logs - it was also present there - but of course much rare as the background fetch was called less often. so, indeed, there are some hints that it is not really related to maybeNetwork() ...
and yes, maybe a dedicated close() or suspend() or so is the way to go.

r10s · 2021-05-12T07:52:21Z

thinking it over - even if maybeNetwork() returns immediately, it might be that it is started much later as expected.

we schedule maybeNetwork() call in a dispatch queue with a very low priority, so, might be, it is called even after we call the completionHandler(). may not happen always, but sometimes - but the issue is also sometimes only.

so, maybe instead of #1206 that schedules everything, just call maybeNetwork in the given thread as it was done before 1.19.1. i would not expect that the issue will be gone completely, but maybe it will be much less, on the level as before.

even if it does not help, we would use maybeNetwork() on-purpose, and will notice if that really somehow results in more anr. and, in this case, we can think over a dc_suspend() api as suggested by @link2xt

nb: i am wondering if the crash is actually user-visible when not using testflight or a debugger. so, maybe, some mitigating is good enough here.

r10s · 2021-05-12T21:11:37Z

some more deep information: https://developer.apple.com/forums/thread/126438, https://developer.apple.com/forums/thread/655225

EDIT: and here is the official apple doc, took me a moment to find it: https://developer.apple.com/documentation/xcode/understanding-the-exception-types-in-a-crash-report#EXC_CRASH-(SIGKILL)

r10s · 2021-05-13T10:35:54Z

the issue is not fixed by #1208, maybe mitigated.

for getting the error popup: you have to install from testflight.

i never got the exception-popup when installing from xcode. instead i am getting the launch screen maybe once a day on starting, that might be how 0xdead10cc gets visible without testflight.

i am wondering, if the issue is user visible when installing from regular app store; if not, maybe it is okay to release with the issue for now, unless we find an easy way to fix it.

cyBerta · 2021-05-13T11:04:52Z

some more deep information: https://developer.apple.com/forums/thread/126438

also some explanation how GRDB mitigates the issue on the user API level: https://github.com/groue/GRDB.swift/blob/master/Documentation/SharingADatabase.md#how-to-limit-the-0xdead10cc-exception

cyBerta · 2021-05-13T11:30:56Z

If I understand the issue correctly, our problem is related to the fact that we share the database between the share extension and main app. In this case database locks are created for syncing. Both share extension and main app run as separate processes.

Another way to mitigate the issue is described here: https://inessential.com/2020/02/13/how_we_fixed_the_dreaded_0xdead10cc_cras

move the database back to the main app
use a (plist) file in the shared container the extension writes which the app later reads if the app resumes

I think this will be very cumbersome or impossible for us to implement though, because the extension needs data from the database to show the chat list, contacts etc. and to send messages.

r10s · 2021-05-13T16:33:42Z

our problem is related to the fact that we share the database between the share extension

yip, some reports point in that direction, while others do not. what i do not get: if it is really related to the database moved to shared spaces: (a) i would expect the issue to appear far more often, not only once a day or so (b) what would the shared space be good for then? :)

r10s · 2021-05-13T19:09:43Z

with latest testflight, there may be a new issue, where 0xdead10cc happens, on startup, reported by @dignifiedquire - and sth. similar happens also to me when running testflight.

the new issue may be related to the fix-markseen-forground-fix, #1182 - this is the log, that points to that direction, https://gist.github.com/r10s/f93220a734a8a3e26705fed996356283, line 48, startTimer. ~~i do not really get that, the app is coming to forground, why do we get a 0xdead10cc here. in any case, that is new.~~ EDIT: it is the fired timer, that causes the issue, that, indeed makes some sense if it is not stopped for whatever reason. but i still do not get it completely.

r10s · 2021-05-15T13:49:52Z

0xdead10cc is still there in 1.20.3, here is a recent log with a different stack trace: https://gist.github.com/r10s/ffd99677f5b6e8f1d62ae4d82238311b, we'll see if at least the new coming-to-foreground-issue is fixed and if the number of exceptions are lowered by 1.20.3 in general (i personally had none so far)

r10s · 2021-05-17T10:22:51Z

another idea: may it be that we have to do beginBackgroundTask/endBackgroundTask also in performFetchWithCompletionHandler and didReceiveRemoteNotification? as we return from the function and call the completionHandler a bit later?

https://stackoverflow.com/questions/37321498/background-task-on-remote-notification-suspended-after-a-short-while points in this direction, but i did not find anything "official" yet. maybe check some other open source apps.

cyBerta · 2021-05-17T10:32:28Z

another idea: may it be that we have to do beginBackgroundTask/endBackgroundTask also in performFetchWithCompletionHandler and didReceiveRemoteNotification? as we return from the function and call the completionHandler a bit later?

https://github.com/deltachat/deltachat-ios/issues?utf8=✓&q=is%3Aissue+sort%3Aupdated-desc points in this direction, but i did not find anything "official" yet. maybe check some other open source apps.

the link leads to the issue overview. Is that where you wanted to point to?

r10s · 2021-05-17T10:56:08Z

i updated the link :)

cyBerta · 2021-05-17T12:01:08Z

Regarding our questions on the last meeting:

you can call UIApplication.shared.beginBackgroundTask multiple times, but you need to keep track of them and correctly call UIApplication.shared.endBackgroundTask for each returned UIBackgroundTaskIdentifier instance.

from https://developer.apple.com/documentation/uikit/uiapplication/1623031-beginbackgroundtask:

You can call this method at any point in your app’s execution. You may also call this method multiple times to mark the beginning of several background tasks that run in parallel. However, each task must be ended separately. You identify a given task using the value returned by this method.

To assist with debugging, this method generates a name for the task, based on the name of the calling method or function. If you want to specify a custom name, use the beginBackgroundTask(withName:expirationHandler:) method instead.

This method can be safely called on a non-main thread.

cyBerta · 2021-05-17T12:32:23Z

How many background tasks can be called:

from https://developer.apple.com/forums/thread/85066

Do not create too many background tasks. How many is too many? It’s absolutely fine to create tens of background tasks but creating thousands is not a good idea.

IMPORTANT iOS 11 introduced a hard limit on the number of background task assertions a process can have (currently about 1000, but the specific value may change in the future). If you see a crash report with the exception code 0xbada5e47, you’ve hit that limit.

Note The practical limit that you’re most likely to see here is the time taken to call your expiry handlers. The watchdog has a strict limit (a few seconds) on the total amount of time taken to run background task expiry handlers. If you have thousands of handlers, you may well run into this limit.

r10s · 2022-04-14T17:06:37Z

0xdead10cc crashes in 1.28.1 are at typically at the following line: https://github.com/deltachat/deltachat-ios/blob/v1.28.1/deltachat-ios/AppDelegate.swift#L460

meanwhile, however, the code is different, also, these crashes seem to be replaced by a "force suspend" when being not in testflight, see above.

but we are still in the same code are and i still do not full get the too long times from #1461 (we meanwhile just stop the last 16 wakeups - all of them should be around 11 seconds, however, some are 10 minutes or longer, so there is a wakeup in between):

we are writing the times at https://github.com/deltachat/deltachat-ios/blob/master/deltachat-ios/AppDelegate.swift#L474 before calling the completion handler - so we are put to suspended-mode before
@link2xt may it happen that stopIo() takes more than 20 seconds sometimes? that would explain the suspension before being able to log the time.

also @link2xt : your suggestion from above, a dc_accounts_suspend() - how much effort would that be? not sure if that would totally fix the issue, however, it still seems the right way to go.

r10s · 2022-04-16T22:05:02Z

ftr: a maybe somehow related crosslink: deltachat/deltachat-core-rust#2955 (comment)

link2xt · 2022-04-16T23:48:33Z

With current API it is possible to introduce dc_context_close with minimal changes. This is making things worse wrt reducing statefullness of context though, see deltachat/deltachat-core-rust#3063).

r10s · 2022-04-17T14:12:27Z

thanks for the information, @link2xt! we are currently on checking/fixing some other parts and bugs (#1541 #1542), let's see after that if dc_context_close is needed at the end. maybe things shape nicely even without :)

r10s · 2022-05-07T15:45:10Z

reopening, the issue is still there on testflight (not on debug-build, not on regular release).

i am running 1.30.1 from testflight on an iphone7 that got these errors before. well, up to no, 30+ hours later, i did not encountered any crashes. however, after all, i stay skeptical 🧐

link2xt · 2023-02-17T22:07:49Z

The database is going to be closed more reliably with deltachat/deltachat-core-rust#4053

r10s · 2023-02-17T22:57:23Z

nice, looking forward to try that out and if it helps on testflight builds

dignifiedquire · 2023-02-18T15:30:03Z

lets hope for the best, currently getting about 10 crashes per day..

r10s · 2023-02-18T15:54:04Z

on testflight? or otherwise debug?

dignifiedquire · 2023-02-18T16:54:45Z

on testflight

r10s · 2023-08-09T08:49:27Z

the issue is not completely fixed, however, still seems to appear on testflight only.

we probably need a way to really suspend the core, the theory is still that core accesses some files a moment after we say "okay, apple, ready to suspend".

link2xt · 2023-08-11T21:28:54Z

Currently the core maintains an SQL connection pool for each Context that is never closed. Context contains Sql and Sql has a .close() method, but it is private and is never called except when .open() fails and in the tests.

If we want to close Sql and all the connections in the connection pool, we need an API to close the Context that will in turn close the underlying database. Alternatively, to avoid making the Context stateful, we can drop the Context and reopen it from scratch in an account manager.

It is also technically possible to destroy async tokio Runtime which is currently lazily created by the deltachat-ffi, but it is probably not needed and would take additional effort, closing the database is the first thing to do.

Simon-Laux · 2023-11-15T02:40:04Z

I think this is related to encrypted accounts and to the new lock on the account manager.

r10s · 2023-12-06T15:37:28Z

I think this is related to encrypted accounts and to the new lock on the account manager.

+1, this is tracked by #2010 and #2009 now

if we find out more reasons for 0xdead10cc, we should file more on-point issues

r10s added the bug label May 11, 2021

cyBerta self-assigned this May 11, 2021

cyBerta mentioned this issue May 11, 2021

[WIP] fix background fetch crash #1206

Closed

r10s added the notify label May 11, 2021

r10s mentioned this issue May 11, 2021

[WIP] close database on suspend #1207

Closed

r10s mentioned this issue May 12, 2021

Revert "just never call maybeNetwork() from ui thread" #1208

Merged

r10s changed the title ~~background crashes from time to time~~ background handler terminated by 0xdead10cc from time to time May 12, 2021

r10s changed the title ~~background handler terminated by 0xdead10cc from time to time~~ background handler terminated by 0xdead10cc sometimes May 12, 2021

r10s mentioned this issue May 13, 2021

hack around 0xdead10cc #1210

Merged

cyBerta mentioned this issue May 14, 2021

Always set timer to nil after being invalidated #1213

Merged

r10s mentioned this issue May 14, 2021

prepare 1.20.3 #1214

Merged

r10s reopened this Mar 27, 2022

r10s removed the notify label Mar 27, 2022

r10s unassigned cyBerta May 6, 2022

r10s closed this as completed Jun 20, 2022

link2xt mentioned this issue Feb 19, 2023

Release 1.109.0 deltachat/deltachat-core-rust#4060

Merged

This was referenced Feb 24, 2023

prep 1.35.0 testflight ✈️ #1816

Merged

Draft: add basic ChatView UIViewController #1810

Closed

link2xt mentioned this issue Mar 18, 2023

Compress mime_headers column with HTML emails stored in database (#4077) deltachat/deltachat-core-rust#4129

Merged

r10s reopened this Aug 9, 2023

This was referenced Nov 15, 2023

Notifications: Crash in background activity -> File Locks #1979

Closed

try to fix notifications #1969

Closed

Simon-Laux added the notify label Nov 16, 2023

r10s mentioned this issue Dec 4, 2023

feat!: Add lockfile to account manager (#4310) deltachat/deltachat-core-rust#4314

Merged

r10s closed this as completed Dec 6, 2023

r10s closed this as not planned Won't fix, can't repro, duplicate, stale Dec 6, 2023

r10s mentioned this issue Dec 23, 2023

shorten notification setup, fix potential crashes #2029

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

background handler terminated by 0xdead10cc sometimes #1202

background handler terminated by 0xdead10cc sometimes #1202

r10s commented May 11, 2021 •

edited

Loading

cyBerta commented May 11, 2021

r10s commented May 11, 2021 •

edited

Loading

cyBerta commented May 11, 2021 •

edited

Loading

r10s commented May 11, 2021 •

edited

Loading

link2xt commented May 11, 2021

r10s commented May 11, 2021 •

edited

Loading

r10s commented May 12, 2021 •

edited

Loading

r10s commented May 12, 2021 •

edited

Loading

r10s commented May 13, 2021 •

edited

Loading

cyBerta commented May 13, 2021

cyBerta commented May 13, 2021

r10s commented May 13, 2021 •

edited

Loading

r10s commented May 13, 2021 •

edited

Loading

r10s commented May 15, 2021

r10s commented May 17, 2021 •

edited

Loading

cyBerta commented May 17, 2021

r10s commented May 17, 2021

cyBerta commented May 17, 2021 •

edited

Loading

cyBerta commented May 17, 2021

r10s commented Apr 14, 2022

r10s commented Apr 16, 2022

link2xt commented Apr 16, 2022

r10s commented Apr 17, 2022 •

edited

Loading

r10s commented May 7, 2022

link2xt commented Feb 17, 2023

r10s commented Feb 17, 2023

dignifiedquire commented Feb 18, 2023

r10s commented Feb 18, 2023

dignifiedquire commented Feb 18, 2023

r10s commented Aug 9, 2023

link2xt commented Aug 11, 2023

Simon-Laux commented Nov 15, 2023

r10s commented Dec 6, 2023 •

edited

Loading

background handler terminated by 0xdead10cc sometimes #1202

background handler terminated by 0xdead10cc sometimes #1202

Comments

r10s commented May 11, 2021 • edited Loading

cyBerta commented May 11, 2021

r10s commented May 11, 2021 • edited Loading

cyBerta commented May 11, 2021 • edited Loading

r10s commented May 11, 2021 • edited Loading

link2xt commented May 11, 2021

r10s commented May 11, 2021 • edited Loading

r10s commented May 12, 2021 • edited Loading

r10s commented May 12, 2021 • edited Loading

r10s commented May 13, 2021 • edited Loading

cyBerta commented May 13, 2021

cyBerta commented May 13, 2021

r10s commented May 13, 2021 • edited Loading

r10s commented May 13, 2021 • edited Loading

r10s commented May 15, 2021

r10s commented May 17, 2021 • edited Loading

cyBerta commented May 17, 2021

r10s commented May 17, 2021

cyBerta commented May 17, 2021 • edited Loading

cyBerta commented May 17, 2021

r10s commented Apr 14, 2022

r10s commented Apr 16, 2022

link2xt commented Apr 16, 2022

r10s commented Apr 17, 2022 • edited Loading

r10s commented May 7, 2022

link2xt commented Feb 17, 2023

r10s commented Feb 17, 2023

dignifiedquire commented Feb 18, 2023

r10s commented Feb 18, 2023

dignifiedquire commented Feb 18, 2023

r10s commented Aug 9, 2023

link2xt commented Aug 11, 2023

Simon-Laux commented Nov 15, 2023

r10s commented Dec 6, 2023 • edited Loading

r10s commented May 11, 2021 •

edited

Loading

r10s commented May 11, 2021 •

edited

Loading

cyBerta commented May 11, 2021 •

edited

Loading

r10s commented May 11, 2021 •

edited

Loading

r10s commented May 11, 2021 •

edited

Loading

r10s commented May 12, 2021 •

edited

Loading

r10s commented May 12, 2021 •

edited

Loading

r10s commented May 13, 2021 •

edited

Loading

r10s commented May 13, 2021 •

edited

Loading

r10s commented May 13, 2021 •

edited

Loading

r10s commented May 17, 2021 •

edited

Loading

cyBerta commented May 17, 2021 •

edited

Loading

r10s commented Apr 17, 2022 •

edited

Loading

r10s commented Dec 6, 2023 •

edited

Loading