Improve traffic shaping to reduce packet loss and video stuttering #3
base: master
Conversation
Are these changes from the newest build, or are they even more recent changes?
I know you've described it before, but could you explain the "magic numbers" a bit in the PR description or comments? :)
Sorry, I shouldn't have added y'all before adding a proper description; I've added one now. Each commit has a detailed description as well. I anticipate tightening up the language and adding examples of both the issue and the bandwidth graphs to the PR when I open it against OBS.
Force-pushed from c7e9148 to 2ed3db4
This is based on WebRTC, which chose an MTU of 1200. This should improve streaming over tunnels that encapsulate packets and push them past the size routers can handle (often 1500). https://groups.google.com/g/discuss-webrtc/c/gH5ysR3SoZI/m/zrnVHqtUAwAJ This reduces the number of bytes that fit in the NACK buffer by about 14%; to account for this and keep behavior the same, the NACK buffer has been increased to the next largest size, doubling it.
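For context, the rough size of that reduction can be computed as follows (1392 is my assumption for the previous maximum packet size; the exact constant in ftl-sdk may differ):

```c
/* Hypothetical constants for illustration; the real values live in ftl-sdk. */
#define OLD_MAX_PACKET 1392 /* assumed previous max RTP packet size */
#define NEW_MAX_PACKET 1200 /* conservative WebRTC-style MTU */

/* Percent reduction in bytes that fit in a fixed-slot NACK buffer
   when every slot shrinks from old_size to new_size (integer percent). */
static int nack_capacity_loss_pct(int old_size, int new_size) {
    return (100 * (old_size - new_size)) / old_size;
}
```

With these assumed sizes the loss is roughly 13-14%, which is why doubling the slot count more than restores the old capacity.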
Keyframes cause a burst of traffic that may exceed the streamer's connection speed and thus cause excessive buffering by their router or another hop along the path to their streaming service. To counteract this, FTL does traffic shaping/smoothing for video packets. However, the implementation is a bit broken: it allows a peak kbps rate to be set but entirely ignores it, limiting outgoing bandwidth to exactly the video bitrate, calculated over a running 100ms window. This has very poor behavior on mostly static streams with large keyframes. For example, take a 5000kbps stream that has 200KB keyframes every two seconds. Because the send rate is limited to 62.5KB per 100ms window, it will take 320ms to send the keyframe. But because the stream is mostly static, non-keyframes will be very small and send without being delayed. That is a huge amount of jitter for the low-latency streaming FTL is trying to support, and it happens even if the user's connection can support sending at a faster rate. Using the measured speed solves this. Looking at the speedtest and peak bandwidth calculation code, I can only conclude that using the peak kbps value is what the FTL devs meant to do.
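The arithmetic above can be sketched as follows (illustrative only, not code from the FTL tree):

```c
/* Send credit per millisecond for a given video bitrate in kbps:
   5000 kbps -> 625 bytes/ms -> 62.5 KB per 100 ms window. */
static int bytes_per_millisecond(int kbps) {
    return kbps * 1000 / 8 / 1000;
}

/* Milliseconds needed to drain a frame when sending is capped at the
   bitrate-derived rate, ignoring the initial bucket burst for simplicity. */
static int frame_delay_ms(int frame_bytes, int kbps) {
    return frame_bytes / bytes_per_millisecond(kbps);
}
```

A 200KB (200,000 byte) keyframe at 5000kbps drains at 625 bytes/ms, giving the 320ms figure quoted above.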
This matches WebRTC, which uses 5ms as described in https://tools.ietf.org/html/draft-ietf-rmcat-gcc-02#section-5. Now, instead of sending a large burst and then waiting 100ms before starting to trickle out packets, we will send a very small burst and start to trickle packets after 5ms. This may result in medium-sized video frames taking longer to send, but overall traffic will be smoother, and raising the allowed peak send bandwidth should more than make up for the smoothing delay on networks that can handle it. This new behavior seems to perform better in all situations: if you have a fast connection, your peak kbps will be much higher than the video kbps and only minimal smoothing will be applied; if you are streaming at close to the speed of your connection, significant smoothing will be applied to keep within your network's capability. Doing this smoothing at the application level is preferable with RTP over trusting the user's router and connection to handle large bursts of packets.
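The effect of shrinking the window can be seen by comparing the worst-case burst each window allows (a sketch using the example 5000kbps bitrate from this PR):

```c
/* Worst-case burst a token bucket permits: one full window of credit.
   kbps * 1000 / 8 gives bytes per second; dividing by 1000 gives
   bytes per millisecond, then scale by the window length. */
static int max_burst_bytes(int kbps, int window_ms) {
    return kbps * 1000 / 8 / 1000 * window_ms;
}
```

At 5000kbps, a 100ms window permits 62,500-byte bursts while a 5ms window caps bursts at 3,125 bytes, which is why the shorter window produces much smoother traffic on the wire.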
Force-pushed from 2ed3db4 to 51d5dcc
This distills the changes I've been testing that improve how outgoing video packets are paced and also brings in some tweaked values from the battle-tested WebRTC project. This does not fix the root cause of video stuttering, see Glimesh/janus-ftl-plugin#101 for more details on that issue.
Changes:

- Use the `peak_kbps` value as was originally intended in this code

The original goal of this change was to solve issues where some users reported "stuttering": on every keyframe the stream would pause for a fraction of a second before skipping forward. In extreme cases the stutter can last for more than a second, or video can drop out entirely.
I've since investigated further and confirmed the actual root cause of stutters is the behavior of the WebRTC stack on the client side; see Glimesh/janus-ftl-plugin#101 for details. I'd still like to come back to this change once that issue is addressed. I still think we can make some small changes here that reduce packet loss for users with bad networks while also reducing latency for users with good networks.
See the individual commits for additional detail, but as an overview, this centers on how FTL uses a token bucket to shape outgoing video packets. They call this the `transmit_level`. It fills at a rate of `bytes_per_ms`, which is currently based on `video->kbps`, and the bucket holds up to `bytes_per_ms * 100ms`. In practice this means that for a 5000kbps target video bitrate you can send up to 62.5KB in a 100ms period before the video send thread will start sleeping to smooth out traffic.

The main issue with this is that keyframes can be much larger than 62.5KB, depending on your encoder settings and the type of content you are streaming. I've observed keyframes of up to 250KB at a 5000kbps bitrate, which with the current smoothing will take 400ms to fully send. That is a huge jump in latency for a streaming protocol like FTL that aims for sub-second latency.
In theory this should just show up as added delay, since buffering is done on the viewer side to absorb large packet-delivery jitter. However, we've seen, especially on mostly static streams where only one small region changes (such as a countdown timer), that the video will pause or "stutter" a very noticeable amount on nearly every keyframe. I've confirmed that what is happening is the user's WebRTC stack is not expanding the jitter buffer enough to handle the keyframe delay; see Glimesh/janus-ftl-plugin#101 for details.
As a final note, this codebase is honestly not of the best quality, and I think OBS is right to try to drop support for it as soon as an alternative low-latency protocol exists. In my view the best longer-term option remains getting WebRTC or another modern low-latency protocol supported in OBS.