[1.x] PC not closing server side on normal hangup #3430

Closed
adnanel opened this issue Sep 19, 2024 · 5 comments
Labels
multistream Related to Janus 1.x

Comments

Contributor

adnanel commented Sep 19, 2024

What version of Janus is this happening on?
Newest master, e.g. 504daf5aef333d6f37e41c30b00be24cfb6c83bf

Have you tested a more recent version of Janus too?
Yes, master branch is affected.

Was this working before?
Yes, this was broken with the change in this commit:
0f32c32

Additional context
Given a session with the Janus SIP plugin:

  • We send "hangup" request, but don't close the PC on client side.
  • Janus sets the session as closing
  • The SIP Relay thread loop terminates, and calls janus_sip_media_cleanup
  • Inside this method, has_audio and has_video are both set to FALSE
  • After this, janus_sip_sofia_callback is invoked with an nua_i_state event, but the PC doesn't close because has_audio and has_video are both FALSE

Result: the PC remains open; after a while we receive a DTLS alert, which causes the PC to close.
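The order of events described above can be sketched with a small model. This is only an illustration under stated assumptions: the `model_*` names and the `pc_open` field are hypothetical stand-ins, not the plugin's actual structures, and the real code paths involve threads and the core that are not modelled here.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified model of a SIP plugin session. */
typedef struct {
    bool has_audio;
    bool has_video;
    bool established;
    bool pc_open;   /* stands in for the PeerConnection held by the core */
} model_session;

/* Models the relay thread exiting: the media flags are reset. */
static void model_media_cleanup(model_session *s) {
    s->has_audio = false;
    s->has_video = false;
}

/* Models the pre-fix branch that runs when the call-state event arrives. */
static void model_on_call_terminated_old(model_session *s) {
    if (!s->established)
        return;
    if (s->has_audio || s->has_video) {
        s->pc_open = false;   /* stands in for gateway->close_pc() */
    }
    /* else: only plugin-local cleanup; the PeerConnection is not touched */
}

/* Replays the reported order of events and reports whether the PC is still open. */
bool model_replay_hangup(void) {
    model_session s = {
        .has_audio = true, .has_video = false,
        .established = true, .pc_open = true
    };
    model_media_cleanup(&s);            /* relay thread terminates first */
    model_on_call_terminated_old(&s);   /* state event arrives afterwards */
    return s.pc_open;
}
```

In this model `model_replay_hangup()` reports that the PC is still open, because the flags were already cleared by the time the branch that would close it was evaluated.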

@adnanel adnanel added the multistream Related to Janus 1.x label Sep 19, 2024
@lminiero
Member

  • We send "hangup" request, but don't close the PC on client side

Yours is a good analysis, but why aren't you closing the PC on the client side too when sending the "hangup"? Our SIP demo does, and I would have expected everyone to do the same.

Contributor Author

adnanel commented Sep 19, 2024

That's an option, I guess. If you think this behaviour is fine, we can close the peer connections on the client side sooner than before.

Currently we do cleanup once we receive the hangup event back from Janus (the "janus" : "hangup" event, not the SIP plugin hangup). I agree that there is no need for this to be done sequentially, but I'm not a big fan of relying on the client side for this flow to finish as expected.

Member

lminiero commented Sep 19, 2024

Makes sense. Starting from the assumption that I'm not going to revert the PR/commit you mentioned (which had a much more serious impact on the status of sessions), I think the main issue here is timing and the order in which things happen: in this specific case they lead to an internal cleanup in the plugin (janus_sip_hangup_media_internal) but not to the cleanup in the core (close_pc) that is also needed here.

Rather than overcomplicate things, maybe there's an easier fix: in your last bullet point, always call both close_pc and janus_sip_hangup_media_internal. In fact, in cases where a PC was available, close_pc will schedule a call to hangup_media on the plugin, which in turn will call janus_sip_hangup_media_internal: if there was no PC, it won't, and so we have to do it ourselves (main reason why we made that patch you mentioned). Considering that janus_sip_hangup_media only calls janus_sip_hangup_media_internal protected by a mutex, calling it twice shouldn't be an issue: the first call (whether it's our own internal call, or the one scheduled by close_pc) will clean up things internally, and the second will do nothing since the state will have been changed by the call before.
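The idempotency argument above can be illustrated with a minimal sketch. It assumes a state flag checked under a mutex; the `model_*` names, the `media_active` flag and the `cleanups_done` counter are hypothetical and only stand in for whatever state the plugin actually tracks.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model: one flag records whether plugin-side media state exists. */
typedef struct {
    pthread_mutex_t mutex;
    bool media_active;    /* true while there is something left to clean up */
    int cleanups_done;    /* counts how many times real cleanup work ran */
} model_session;

/* Idempotent cleanup: does nothing if a previous call already ran.
 * The caller is expected to hold the session mutex. */
static void model_hangup_media_internal(model_session *s) {
    if (!s->media_active)
        return;
    s->media_active = false;
    s->cleanups_done++;
}

/* Locked wrapper, analogous to the callback the core would schedule. */
void model_hangup_media(model_session *s) {
    pthread_mutex_lock(&s->mutex);
    model_hangup_media_internal(s);
    pthread_mutex_unlock(&s->mutex);
}

/* Calls the cleanup twice and returns how often real work was performed. */
int model_double_cleanup(void) {
    model_session s = {
        .mutex = PTHREAD_MUTEX_INITIALIZER,
        .media_active = true,
        .cleanups_done = 0
    };
    model_hangup_media(&s);   /* first call cleans up */
    model_hangup_media(&s);   /* second call finds nothing to do */
    return s.cleanups_done;
}
```

Whichever caller gets there first flips the flag, so the second call returns early; that is the property the proposed change relies on.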

Can you try changing this block here:

if(g_atomic_int_get(&session->establishing) || g_atomic_int_get(&session->established)) {
	if(session->media.has_audio || session->media.has_video) {
		/* Get rid of the PeerConnection in the core */
		gateway->close_pc(session->handle);
	} else {
		/* No SDP was exchanged, just clean up locally */
		janus_sip_hangup_media_internal(session->handle);
	}
}

to something like this instead:

if(g_atomic_int_get(&session->establishing) || g_atomic_int_get(&session->established)) {
	/* Get rid of the PeerConnection in the core */
	gateway->close_pc(session->handle);
	/* Also clean up locally, in case there was no PC */
	janus_sip_hangup_media_internal(session->handle);
}

and let me know how that works for you? If I'm right, it should address your issue and at the same time not introduce any regression (due to the idempotent nature of janus_sip_hangup_media_internal), but it's a good idea to check if I'm missing anything off the top of my head.
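One way to reason about possible regressions is to compare the two branches over the combinations of "media flags still set" and "PC still open". The sketch below is a toy model, not plugin code: all `model_*` names are hypothetical, and the core's scheduled callback into the plugin is collapsed into a direct, synchronous call.

```c
#include <stdbool.h>

/* Hypothetical state seen by the branch when the call-state event arrives. */
typedef struct {
    bool has_media_flags;  /* has_audio || has_video at that moment */
    bool pc_open;          /* whether the core still holds a PeerConnection */
    bool plugin_clean;     /* whether plugin-side cleanup has run */
} model_state;

/* Plugin-local cleanup; safe to call more than once. */
static void model_local_cleanup(model_state *s) {
    s->plugin_clean = true;
}

/* Models close_pc: only if a PC exists does the core call back into the plugin.
 * (Assumption: the real callback is scheduled, here it runs synchronously.) */
static void model_close_pc(model_state *s) {
    if (s->pc_open) {
        s->pc_open = false;
        model_local_cleanup(s);
    }
}

/* Branch before the change: pick one path based on the media flags. */
void model_old_branch(model_state *s) {
    if (s->has_media_flags)
        model_close_pc(s);
    else
        model_local_cleanup(s);
}

/* Branch after the change: always do both. */
void model_new_branch(model_state *s) {
    model_close_pc(s);
    model_local_cleanup(s);
}

/* Returns true if both the PC and the plugin state end up torn down. */
bool model_fully_torn_down(bool flags, bool pc_open, bool use_new) {
    model_state s = { flags, pc_open, false };
    if (use_new)
        model_new_branch(&s);
    else
        model_old_branch(&s);
    return !s.pc_open && s.plugin_clean;
}
```

In this model the only combination the old branch leaves half-finished is "flags cleared, PC still open", which matches the reported case; the new branch tears everything down in every combination.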

Contributor Author

adnanel commented Sep 19, 2024

That seems to have fixed our problem. I did a few tests myself and kept a single Janus instance running automated tests for the past ~8 hours, and no other problems were observed.

Do you want to commit that to master directly or should I create a PR?

@lminiero
Member

No need, I'll push the commit myself to both master and 0.x shortly.
Thanks for the feedback!

natikaltura pushed a commit to natikaltura/janus-gateway that referenced this issue Nov 7, 2024
mwalbeck pushed a commit to mwalbeck/docker-janus-gateway that referenced this issue Nov 28, 2024
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [meetecho/janus-gateway](https://github.com/meetecho/janus-gateway) | minor | `v1.2.4` -> `v1.3.0` |

---

### Release Notes

<details>
<summary>meetecho/janus-gateway (meetecho/janus-gateway)</summary>

### [`v1.3.0`](https://github.com/meetecho/janus-gateway/blob/HEAD/CHANGELOG.md#v130---2024-11-25)

[Compare Source](meetecho/janus-gateway@v1.2.4...v1.3.0)

-   Refactored logging internals \[[PR-3428](meetecho/janus-gateway#3428)]
-   Use strtok to parse SDPs \[[PR-3424](meetecho/janus-gateway#3424)]
-   Fixed rare condition that could lead to a deadlock in the VideoRoom \[[PR-3446](meetecho/janus-gateway#3446)]
-   Fixed broken switch when using remote publishers in VideoRoom \[[PR-3447](meetecho/janus-gateway#3447)]
-   Added SRTP support to VideoRoom remote publishers (thanks [@spscream](https://github.com/spscream)!) \[[PR-3449](meetecho/janus-gateway#3449)]
-   Added support for generic JSON metadata to VideoRoom publishers (thanks [@spscream](https://github.com/spscream)!) \[[PR-3467](meetecho/janus-gateway#3467)]
-   Fixed deadlock in VideoRoom when failing to open a socket for a new RTP forwarder (thanks [@spscream](https://github.com/spscream)!) \[[PR-3468](meetecho/janus-gateway#3468)]
-   Fixed deadlock in VideoRoom caused by reverse ordering of mutex locks \[[PR-3474](meetecho/janus-gateway#3474)]
-   Fixed memory leaks when using remote publishers in VideoRoom \[[PR-3475](meetecho/janus-gateway#3475)]
-   Diluted frequency of PLI in the VideoRoom (thanks [@natikaltura](https://github.com/natikaltura)!) \[[PR-3423](meetecho/janus-gateway#3423)]
-   Better cleanup after failed mountpoint creations in Streaming plugin \[[PR-3465](meetecho/janus-gateway#3465)]
-   Fixed compilation of AudioBridge in case libogg isn't available (thanks [@tmatth](https://github.com/tmatth)!) \[[PR-3438](meetecho/janus-gateway#3438)]
-   Better management of call cleanup in SIP plugin \[[Issue-3430](meetecho/janus-gateway#3430)]
-   Change the way call-IDs are tracked in the SIP plugin (thanks WebTrit!) \[[PR-3443](meetecho/janus-gateway#3443)]
-   Increased maximum size of custom SIP headers \[[Issue-3459](meetecho/janus-gateway#3459)]
-   Other smaller fixes and improvements (thanks to all who contributed pull requests and reported issues!)

</details>

Reviewed-on: https://git.walbeck.it/walbeck-it/docker-janus-gateway/pulls/157
Co-authored-by: renovate-bot <[email protected]>
Co-committed-by: renovate-bot <[email protected]>