Android ANR from Mono GC #9857

Open
bcasari-orbit opened this issue Feb 27, 2025 · 0 comments
Labels
Area: App Runtime Issues in `libmonodroid.so`.

bcasari-orbit commented Feb 27, 2025

Android framework version

net8.0-android

Affected platform version

VS 2022 17.12.5, Android Target SDK 34, Target Framework net8.0-android

Description

After upgrading from Xamarin.Android to .NET 8 for Android, we have seen a significant increase in Android ANRs. The ANRs occur at random, and Crashlytics shows that they originate in the Mono garbage collector. Here is a sample stack trace:

pp.android.prod (native):tid=26481 systid=26481 
#00 pc 0x79a7bc libart.so (__aarch64_cas4_rel + 44) (BuildId: 629e0ffca501d809c29dbbeef2f512d3)
#01 pc 0x2037c4 libart.so (art::gc::collector::ConcurrentCopying::Copy(art::Thread*, art::mirror::Object*, art::mirror::Object*, art::MemberOffset) + 296) (BuildId: 629e0ffca501d809c29dbbeef2f512d3)
#02 pc 0x20018c libart.so (void art::mirror::Object::VisitReferences<true, (art::VerifyObjectFlags)0, (art::ReadBarrierOption)1, art::gc::collector::ConcurrentCopying::RefFieldsVisitor<false>, art::gc::collector::ConcurrentCopying::RefFieldsVisitor<false> >(art::gc::collector::ConcurrentCopying::RefFieldsVisitor<false> const&, art::gc::collector::ConcurrentCopying::RefFieldsVisitor<false> const&) + 396) (BuildId: 629e0ffca501d809c29dbbeef2f512d3)
#03 pc 0x64f938 libart.so (art::gc::collector::ConcurrentCopying::ProcessMarkStack() + 2156) (BuildId: 629e0ffca501d809c29dbbeef2f512d3)
#04 pc 0x5629bc libart.so (art::gc::collector::ConcurrentCopying::CopyingPhase() + 1836) (BuildId: 629e0ffca501d809c29dbbeef2f512d3)
#05 pc 0x5c0cb4 libart.so (art::gc::collector::ConcurrentCopying::RunPhases() + 2608) (BuildId: 629e0ffca501d809c29dbbeef2f512d3)
#06 pc 0x587d90 libart.so (art::gc::collector::GarbageCollector::Run(art::gc::GcCause, bool) + 324) (BuildId: 629e0ffca501d809c29dbbeef2f512d3)
#07 pc 0x437560 libart.so (art::gc::Heap::CollectGarbageInternal(art::gc::collector::GcType, art::gc::GcCause, bool, unsigned int) + 564) (BuildId: 629e0ffca501d809c29dbbeef2f512d3)
#08 pc 0x5453e0 boot.oat (art_jni_trampoline + 112)
#09 pc 0x713aa8 boot.oat (java.lang.Runtime.gc + 168)
#10 pc 0x368774 libart.so (art_quick_invoke_stub + 612) (BuildId: 629e0ffca501d809c29dbbeef2f512d3)
#11 pc 0x367148 libart.so (art::JValue art::InvokeVirtualOrInterfaceWithVarArgs<art::ArtMethod*>(art::ScopedObjectAccessAlreadyRunnable const&, _jobject*, art::ArtMethod*, std::__va_list) + 812) (BuildId: 629e0ffca501d809c29dbbeef2f512d3)
#12 pc 0x72b24c libart.so (art::JNI<false>::CallVoidMethodV(_JNIEnv*, _jobject*, _jmethodID*, std::__va_list) + 192) (BuildId: 629e0ffca501d809c29dbbeef2f512d3)
#13 pc 0x31a78 libmonodroid.so (_JNIEnv::CallVoidMethod(_jobject*, _jmethodID*, ...) + 116) (BuildId: 25124644814034151c3278bff793a560d1119b0b)
#14 pc 0x30164 libmonodroid.so (xamarin::android::internal::OSBridge::gc_cross_references(int, MonoGCBridgeSCC**, int, MonoGCBridgeXRef*) + 224) (BuildId: 25124644814034151c3278bff793a560d1119b0b)
#15 pc 0x29e188 libmonosgen-2.0.so (BuildId: c7db5d00fcd3ecbca8fb4a5d7087a9c5b3c4f43f)
#16 pc 0x2c5798 libmonosgen-2.0.so (BuildId: c7db5d00fcd3ecbca8fb4a5d7087a9c5b3c4f43f)
#17 pc 0x2c2844 libmonosgen-2.0.so (BuildId: c7db5d00fcd3ecbca8fb4a5d7087a9c5b3c4f43f)
#18 pc 0x2c36fc libmonosgen-2.0.so (BuildId: c7db5d00fcd3ecbca8fb4a5d7087a9c5b3c4f43f)
#19 pc 0x2a929c libmonosgen-2.0.so (mono_gc_collect + 44) (BuildId: c7db5d00fcd3ecbca8fb4a5d7087a9c5b3c4f43f)
#20 pc 0xe57c__

We can reproduce the problem, but only after the app has been used extensively, so we built an automated stress test that repeats actions in the UI until the issue occurs.

The symptom the user sees in the UI is that the app freezes (randomly) and then crashes.

We have made several changes to the app to try to fix this, including calling GC.Collect() regularly, but nothing has helped. We also tried various MONO_GC_PARAMS configurations, without success. For example:

MONO_GC_PARAMS=bridge-implementation=tarjan,nursery-size=64m,soft-heap-limit=512m,major=marksweep-conc-par,minor=simple-par,concurrent-sweep,mode=pause:150
MONO_THREADS_SUSPEND=preemptive
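
When several MONO_GC_PARAMS variants are being tried, it helps to record exactly which options differ between runs. The following is a minimal Python sketch for that bookkeeping, assuming the value is a comma-separated list of `key=value` options and bare flags, as in the example above. The helper names `parse_gc_params` and `diff_gc_params` are hypothetical and not part of any Mono tooling.

```python
def parse_gc_params(value):
    """Split a MONO_GC_PARAMS value into an option -> setting mapping."""
    options = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, setting = item.partition("=")
        # Bare flags such as "concurrent-sweep" carry no "=value" part.
        options[key] = setting if sep else True
    return options


def diff_gc_params(a, b):
    """Return {option: (setting_in_a, setting_in_b)} for options that differ."""
    pa, pb = parse_gc_params(a), parse_gc_params(b)
    return {
        key: (pa.get(key), pb.get(key))
        for key in sorted(set(pa) | set(pb))
        if pa.get(key) != pb.get(key)
    }
```

For instance, `diff_gc_params` applied to two runs that differ only in `bridge-implementation` returns just that one entry, which makes it easier to attach the exact configuration to each test result.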

We tried the tarjan, new, and old values for bridge-implementation. None of them made a difference.

Logcat logs the following error when the problem occurs:

46080 outstanding GREFs. Performing a full GC!
46081 outstanding GREFs. Performing a full GC!
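
To see whether the forced collections ever bring the GREF count down, the counts can be pulled out of a saved logcat capture. This is a minimal Python sketch, assuming the message format shown above. The function names are hypothetical.

```python
import re

# Matches the monodroid-gc message quoted above and captures the GREF count.
GREF_LINE = re.compile(r"(\d+) outstanding GREFs\. Performing a full GC!")


def gref_counts(log_lines):
    """Return the GREF counts, in order, from lines that report a forced GC."""
    counts = []
    for line in log_lines:
        match = GREF_LINE.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


def gcs_without_relief(counts):
    """Count forced GCs that were followed by an equal or higher GREF count."""
    return sum(1 for prev, cur in zip(counts, counts[1:]) if cur >= prev)
```

A run in which `gcs_without_relief` keeps growing would suggest that the full GCs are not releasing global references, which matches the symptom described here.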

Then the app freezes and crashes.

We have profiled the app and noticed something interesting: the Mono GC appears to collect the .NET objects but not their Java counterparts. For example, profiling the app with dotnet-gcdump shows the following:

Managed Memory | Count
MonitorOverviewView (Fragment) | 1

In Android Studio, when we capture a native (Java) heap dump, we see the following:

Class | Allocations
MonitorOverviewView (Fragment) | 4
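
The comparison between the two dumps can be automated once the per-type counts have been exported. This is a minimal Python sketch, assuming the counts have already been copied by hand into two dictionaries keyed by type name. The function name `java_surplus` is hypothetical, and the sketch does not read either dump format directly.

```python
def java_surplus(managed_counts, java_counts):
    """Return {type_name: extra} for types with more Java instances than managed ones."""
    surplus = {}
    for type_name, java_n in java_counts.items():
        # A type missing from the managed dump counts as zero live managed instances.
        extra = java_n - managed_counts.get(type_name, 0)
        if extra > 0:
            surplus[type_name] = extra
    return surplus


managed = {"MonitorOverviewView": 1}  # from the managed dump above
java = {"MonitorOverviewView": 4}     # from the Android Studio dump above
print(java_surplus(managed, java))
```

With the numbers above, the result reports three more Java instances of MonitorOverviewView than managed ones. Types that show a growing surplus across runs are the candidates for Java-side objects that outlive their managed counterparts.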

We also compared two dotnet-gcdump captures taken at different points in a run, between which the number of active Fragments in the app should not have increased:

[Image: comparison of the two gcdump captures]

It appears that the .NET object is garbage collected, but the native (Java) object is not. How can this be?

We also tried updating the app to .NET 9, and the problem persists.

Steps to Reproduce

  1. Run an app that recreates Fragments repeatedly.
  2. Eventually the Mono GC kicks in and the app freezes.

The issue occurs randomly.

Did you find any workaround?

No workaround found.

Relevant log output

monodroid-gc            46080 outstanding GREFs. Performing a full GC!
monodroid-gc            android  I  46081 outstanding GREFs. Performing a full GC!
viceapp.android         android  I  Explicit concurrent copying GC freed 242KB AllocSpace bytes, 2(264KB) LOS objects, 12% free, 165MB/189MB, paused 208us,141us total 1.539s
location.nsflp2         com.sec.location.nsflp2              W  Reducing the number of considered missed Gc histogram windows from 250 to 100
viceapp.android         android  I  Explicit concurrent copying GC freed 56KB AllocSpace bytes, 0(0B) LOS objects, 12% free, 165MB/189MB, paused 204us,172us total 1.539s
monodroid-gc            android  I  46080 outstanding GREFs. Performing a full GC!
roid.da.daagent         com.samsung.android.da.daagent       I  Using CollectorTypeCC GC.
system_server           system_server                        I  Background concurrent copying GC freed 21MB AllocSpace bytes, 120(3428KB) LOS objects, 18% free, 104MB/128MB, paused 1.274ms,612us total 524.076ms
msung.klmsagent         com.samsung.klmsagent                I  Using CollectorTypeCC GC.
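
The ART lines in this log also report how long each collection took, so the time spent in explicit Java GCs can be totalled from a capture. This is a minimal Python sketch, assuming the "... GC freed ... total <time>" format shown above. The function names are hypothetical. The "Explicit" entries presumably correspond to the java.lang.Runtime.gc call visible in the stack trace, but that link is an assumption.

```python
import re

# Captures the GC kind and the trailing "total <number><unit>" field of an ART GC line.
ART_GC = re.compile(
    r"(Explicit|Background) concurrent copying GC freed .* total ([\d.]+)(us|ms|s)\b"
)
UNIT_SECONDS = {"us": 1e-6, "ms": 1e-3, "s": 1.0}


def art_gc_durations(log_lines):
    """Return (kind, seconds) for every ART concurrent copying GC line."""
    results = []
    for line in log_lines:
        match = ART_GC.search(line)
        if match:
            kind, number, unit = match.groups()
            results.append((kind, float(number) * UNIT_SECONDS[unit]))
    return results


def explicit_gc_seconds(log_lines):
    """Total time, in seconds, reported by explicit collections only."""
    return sum(sec for kind, sec in art_gc_durations(log_lines) if kind == "Explicit")
```

Applied to the excerpt above, the two explicit collections each report 1.539s, so back-to-back forced GCs add up to several seconds quickly.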
bcasari-orbit added the Area: App Runtime (Issues in `libmonodroid.so`) and needs-triage (Issues that need to be assigned) labels Feb 27, 2025
jpobst removed the needs-triage label Feb 27, 2025
3 participants