Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(Android,Fabric,bridge-mode): patch crash with context detached from activity #2276

Merged
merged 8 commits into from
Aug 5, 2024

Conversation

kkafar
Copy link
Member

@kkafar kkafar commented Jul 31, 2024

Description

When running on Fabric + "bridgefull" combination, react packages are initialised very early in application lifetime (much earlier than in Fabric + bridgeless), when there is no activity injected to ReactContext yet. Thus when creating packages they do not have access to the activity and we relied on this access up to this moment to setup ScreenDummyLayoutHelper object, throwing exception if activity was not available.

Changes

This diff delays initialisation of the dummy layout in case there is no access to activity when creating the object up to the point of invocation of onHostResume callbacks of ReactContext (this is the very moment activity is injected into the context), avoiding the crash. There is also fallback mechanism added in computeDummyLayout method, so that when accessed from C++ layer it has another chance of initialising the dummy layout, before performing computations. If it fails (see below for reasons why it might fail) 0f is returned as header height causing content jump, but not crashing the application.

Important

Most likely there is a race condition in this code. onHostResume is called from UI thread at the execution point, when JS thread & first native render & thus commit & layout is already in progress on background thread. In case JS thread proceeds to layout & computeDummyLayout is called from our Screen component descriptor before onHostResume is called on UI thread & the dummy layout is initialised, we hit the case when computing header height will not be possible due to uninitialised dummy layout.
I failed to trigger this behaviour even once for ~30 min of testing with trying to put UI thread to sleep etc., however I've also failed to find a proof that it won't happen because of some synchronisation / execution order.

What's also important is that there is no good way to synchronise these threads, because it is not the matter of exclusive access to some critical section, but rather a order of access between UI & background thread. Some barrier mechanism would be required here, however we do not have thread references accessible from our code.

Note

One possible solution would be to synchronise access to maybeInitDummyLayoutWithHeader method & in case the background thread hits it before UI, we could wait for UI in some kind of loop ("slow spinlock"). This could guarantee correctness, however it is crazy bad, because we would impede whole render process for possibly long time. Despite the flaws of this approach this might be something to consider for the future.

Caution

One more thing to note is that I rely on JVM atomicity guarantees when reading / writing isDummyLayoutInitialized variable. There is danger that both threads hit the layout initialisation code at the same time and thus leading to corruption. TODO: try to handle this case more gracefully. A lock for initialisation code should be enough.

Test code and steps to reproduce

Run FabricExample with load(bridgelessEnabled = false) set in MainApplication and see that it now works.

Checklist

  • Ensured that CI passes

@kkafar kkafar requested review from WoLewicki, alduzy and tboba August 2, 2024 14:05
@WoLewicki
Copy link
Member

this might be something to consider for the future.

The near (hopefully) future is only bridgeless mode available 😓

Copy link
Member

@WoLewicki WoLewicki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kkafar kkafar merged commit 1713e8f into main Aug 5, 2024
5 checks passed
@kkafar kkafar deleted the @kkafar/fix-fabric-bridge-mode branch August 5, 2024 10:29
github-merge-queue bot referenced this pull request in valora-inc/wallet Sep 10, 2024
This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
|
[react-native-screens](https://redirect.github.com/software-mansion/react-native-screens)
| [`^3.33.0` ->
`^3.34.0`](https://renovatebot.com/diffs/npm/react-native-screens/3.33.0/3.34.0)
|
[![age](https://developer.mend.io/api/mc/badges/age/npm/react-native-screens/3.34.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/npm/react-native-screens/3.34.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/npm/react-native-screens/3.33.0/3.34.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/npm/react-native-screens/3.33.0/3.34.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|

---

### Release Notes

<details>
<summary>software-mansion/react-native-screens
(react-native-screens)</summary>

###
[`v3.34.0`](https://redirect.github.com/software-mansion/react-native-screens/releases/tag/3.34.0)

[Compare
Source](https://redirect.github.com/software-mansion/react-native-screens/compare/3.33.0...3.34.0)

Recently released
[3.33.0](https://redirect.github.com/software-mansion/react-native-screens/releases/tag/3.33.0)
introduced a crash **when running on** Android + Fabric + "bridgefull"
combination of platform / architecture. This version introduces a fix
for that crash with changes in native code, thus bumping minor version.

#### What's Changed

#### 🐛 Bug fixes

- Android, Fabric, bridge-mode: patch crash with context detached from
activity by [@&#8203;kkafar](https://redirect.github.com/kkafar) in
[https://github.com/software-mansion/react-native-screens/pull/2276](https://redirect.github.com/software-mansion/react-native-screens/pull/2276)

#### 🔢 Miscellaneous

- Extend logging in architecture-integrity scripts & add NativeProxy.kt
to blacklist by [@&#8203;kkafar](https://redirect.github.com/kkafar) in
[https://github.com/software-mansion/react-native-screens/pull/2281](https://redirect.github.com/software-mansion/react-native-screens/pull/2281)
- Aggregate updates from dependabot by
[@&#8203;kkafar](https://redirect.github.com/kkafar) in
[https://github.com/software-mansion/react-native-screens/pull/2267](https://redirect.github.com/software-mansion/react-native-screens/pull/2267)

**Full Changelog**:
software-mansion/react-native-screens@3.33.0...3.34.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "after 5pm,every weekend" in timezone
America/Los_Angeles, Automerge - "after 5pm,every weekend" in timezone
America/Los_Angeles.

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/valora-inc/wallet).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOC41OS4yIiwidXBkYXRlZEluVmVyIjoiMzguNTkuMiIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsibnBtIiwicmVub3ZhdGUiXX0=-->

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: valora-bot <[email protected]>
Co-authored-by: Jean Regisser <[email protected]>
github-merge-queue bot referenced this pull request in valora-inc/wallet Sep 10, 2024
This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
|
[react-native-screens](https://redirect.github.com/software-mansion/react-native-screens)
| [`^3.33.0` ->
`^3.34.0`](https://renovatebot.com/diffs/npm/react-native-screens/3.33.0/3.34.0)
|
[![age](https://developer.mend.io/api/mc/badges/age/npm/react-native-screens/3.34.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/npm/react-native-screens/3.34.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/npm/react-native-screens/3.33.0/3.34.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/npm/react-native-screens/3.33.0/3.34.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|

---

### Release Notes

<details>
<summary>software-mansion/react-native-screens
(react-native-screens)</summary>

###
[`v3.34.0`](https://redirect.github.com/software-mansion/react-native-screens/releases/tag/3.34.0)

[Compare
Source](https://redirect.github.com/software-mansion/react-native-screens/compare/3.33.0...3.34.0)

Recently released
[3.33.0](https://redirect.github.com/software-mansion/react-native-screens/releases/tag/3.33.0)
introduced a crash **when running on** Android + Fabric + "bridgefull"
combination of platform / architecture. This version introduces a fix
for that crash with changes in native code, thus bumping minor version.

#### What's Changed

#### 🐛 Bug fixes

- Android, Fabric, bridge-mode: patch crash with context detached from
activity by [@&#8203;kkafar](https://redirect.github.com/kkafar) in
[https://github.com/software-mansion/react-native-screens/pull/2276](https://redirect.github.com/software-mansion/react-native-screens/pull/2276)

#### 🔢 Miscellaneous

- Extend logging in architecture-integrity scripts & add NativeProxy.kt
to blacklist by [@&#8203;kkafar](https://redirect.github.com/kkafar) in
[https://github.com/software-mansion/react-native-screens/pull/2281](https://redirect.github.com/software-mansion/react-native-screens/pull/2281)
- Aggregate updates from dependabot by
[@&#8203;kkafar](https://redirect.github.com/kkafar) in
[https://github.com/software-mansion/react-native-screens/pull/2267](https://redirect.github.com/software-mansion/react-native-screens/pull/2267)

**Full Changelog**:
software-mansion/react-native-screens@3.33.0...3.34.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "after 5pm,every weekend" in timezone
America/Los_Angeles, Automerge - "after 5pm,every weekend" in timezone
America/Los_Angeles.

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/valora-inc/wallet).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOC41OS4yIiwidXBkYXRlZEluVmVyIjoiMzguNTkuMiIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsibnBtIiwicmVub3ZhdGUiXX0=-->

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: valora-bot <[email protected]>
Co-authored-by: Jean Regisser <[email protected]>
ja1ns pushed a commit to WiseOwlTech/react-native-screens that referenced this pull request Oct 9, 2024
…om activity (software-mansion#2276)

## Description

When running on Fabric + "bridgefull" combination, react packages are
initialised very early in application lifetime (much earlier than in
Fabric + bridgeless), when there is no activity injected to
`ReactContext` yet. Thus when creating packages they do not have access
to the activity and we relied on this access up to this moment to setup
`ScreenDummyLayoutHelper` object, throwing exception if activity was not
available.

## Changes

This diff delays initialisation of the dummy layout in case there is no
access to activity when creating the object up to the point of
invocation of `onHostResume` callbacks of `ReactContext` (this is the
very moment activity is injected into the context), avoiding the crash.
There is also fallback mechanism added in `computeDummyLayout` method,
so that when accessed from C++ layer it has another chance of
initialising the dummy layout, before performing computations. If it
fails (see below for reasons why it might fail) `0f` is returned as
header height causing content jump, but not crashing the application.

> [!important]
**Most likely** there is a race condition in this code. `onHostResume`
is called from UI thread at the execution point, when JS thread & first
native render & thus commit & layout is already in progress on
background thread. In case JS thread proceeds to layout &
`computeDummyLayout` is called from our Screen component descriptor
before `onHostResume` is called on UI thread & the dummy layout is
initialised, we hit the case when computing header height will not be
possible due to uninitialised dummy layout.
I failed to trigger this behaviour even once for ~30 min of testing with
trying to put UI thread to sleep etc., however I've also failed to find
a proof that it won't happen because of some synchronisation / execution
order.

What's also important is that there is no good way to synchronise these
threads, because it is not the matter of exclusive access to some
critical section, but rather a order of access between UI & background
thread. Some barrier mechanism would be required here, however we do not
have thread references accessible from our code.

> [!note] 
One possible solution would be to synchronise access to
`maybeInitDummyLayoutWithHeader` method & in case the background thread
hits it before UI, we could wait for UI in some kind of loop ("slow
spinlock"). This could guarantee correctness, however it is crazy bad,
because we would impede whole render process for possibly long time.
Despite the flaws of this approach this might be something to consider
for the future.

> [!caution]
One more thing to note is that I rely on [JVM atomicity
guarantees](https://docs.oracle.com/javase/tutorial/essential/concurrency/atomic.html)
when reading / writing `isDummyLayoutInitialized` variable. There is
danger that both threads hit the layout initialisation code at the same
time and thus leading to corruption. TODO: try to handle this case more
gracefully. A lock for initialisation code should be enough.

## Test code and steps to reproduce

Run `FabricExample` with `load(bridgelessEnabled = false)` set in
`MainApplication` and see that it now works.

## Checklist

- [x] Ensured that CI passes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants