Improve audio codec selection with multi-channel audio codecs #1013

Closed
avelad opened this issue Sep 7, 2017 · 26 comments
Assignees
Labels
status: archived Archived and locked; will not be updated type: enhancement New feature or request
Milestone

Comments

@avelad
Member

avelad commented Sep 7, 2017

From: https://groups.google.com/forum/#!topic/shaka-player-users/hwfrJ5TiK78

Currently, when multiple codecs are presented in the manifest and are supported by the browser, Shaka Player chooses the "most efficient" codec by bitrate. This makes good sense for video, I think, but may make less sense for audio, where multi-channel streams may be preferable on platforms with multi-channel output capabilities, in spite of their increased bitrate.

I propose the following to solve the problem:

  1. Add a preferredAudioNumChannelLanguage into configuration
  2. Improve selectAudioLanguage function to allow language, role and numChannels
  3. Change the current heuristic for audio codec selection.
  • Choose the codec that has a track with the number of channels closest to the preference (preferredAudioNumChannelLanguage).
  • If there are several that meet this requirement, choose the one with the lowest bandwidth.
  • If no preferredAudioNumChannelLanguage is configured, follow the current logic of choosing the "most efficient" codec by bitrate.

Example with ec-3, ac-3, and aac: https://tungsten.aaplimg.com/VOD/bipbop_adv_example_hevc/master.m3u8 (from https://developer.apple.com/streaming/examples/)

Audio tracks:

  • AAC-LC - stereo @ 160 kbps
  • AC-3 - 5.1 @ 384 kbps
  • EC-3 - 5.1 @ 192 kbps

Safari, Edge, and Tizen 2017 support all of the codecs above, so on those platforms:
case 1) preferredAudioNumChannelLanguage = 2, the player should choose AAC-LC - stereo @ 160 kbps
case 2) preferredAudioNumChannelLanguage = 6, the player should choose EC-3 - 5.1 @ 192 kbps
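
The proposed heuristic can be sketched roughly as follows. This is only an illustration of the proposal, not Shaka Player code: the function name `chooseAudioTrack` and the track object shape (`name`, `channels`, `bandwidth`) are hypothetical, and the no-preference fallback is simplified to "lowest-bandwidth track".

```javascript
// Sketch of the proposal: pick the track whose channel count is closest to
// the preference, breaking ties by lowest bandwidth.
// Assumes `tracks` is a non-empty array; all names here are hypothetical.
function chooseAudioTrack(tracks, preferredChannels) {
  if (preferredChannels == null) {
    // No preference configured: simplified stand-in for the current
    // "most efficient by bitrate" behaviour.
    return tracks.reduce((best, t) => (t.bandwidth < best.bandwidth ? t : best));
  }
  return tracks.reduce((best, t) => {
    const bestDistance = Math.abs(best.channels - preferredChannels);
    const distance = Math.abs(t.channels - preferredChannels);
    if (distance !== bestDistance) {
      return distance < bestDistance ? t : best;
    }
    // Same distance from the preference: prefer the lower bandwidth.
    return t.bandwidth < best.bandwidth ? t : best;
  });
}

// The three audio tracks from the Apple example above.
const exampleTracks = [
  {name: 'AAC-LC', channels: 2, bandwidth: 160000},
  {name: 'AC-3', channels: 6, bandwidth: 384000},
  {name: 'EC-3', channels: 6, bandwidth: 192000},
];

console.log(chooseAudioTrack(exampleTracks, 2).name);
console.log(chooseAudioTrack(exampleTracks, 6).name);
```

With a preference of 6, both 5.1 tracks are equally close, so the bandwidth tie-break decides between them.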

@vaage vaage added needs triage type: enhancement New feature or request and removed needs triage labels Sep 7, 2017
@vaage
Contributor

vaage commented Sep 7, 2017

@joeyparrish Could you take a look at this request and schedule it for a milestone if it is something we want to take on?

@avelad
Member Author

avelad commented Oct 3, 2017

@joeyparrish, can you take a look at this?

@joeyparrish
Member

Sorry, I missed this. Let me take a look.

@joeyparrish
Member

I'd like to offer one refinement to your proposal:

  • Choose the codec that has a track with the number of channels closest to the preference (preferredAudioNumChannelLanguage).

I'd like to change this to:

  • Choose the codec with the largest number of audio channels less than or equal to the configured number of output channels. If this is not possible, choose the smallest number of channels.

For a 6-channel system, we should use 6-channel audio tracks if possible. For a 2-channel system, we should never use 6-channel audio unless there's nothing else.

Further, the default for configured output channels should be 2. If applications do not explicitly tell us that there are more output channels, we should assume 2-channel output capabilities.
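
A minimal sketch of this refined rule, assuming the available channel counts are already known. The function and constant names are hypothetical and this is not the player's actual implementation.

```javascript
// Assume stereo output unless the application says otherwise,
// as suggested in the comment above.
const DEFAULT_OUTPUT_CHANNELS = 2;

// Returns the largest available channel count that does not exceed the
// configured output channels; if none fits, returns the smallest available.
// Assumes `availableCounts` is a non-empty array of positive integers.
function pickChannelCount(availableCounts, outputChannels = DEFAULT_OUTPUT_CHANNELS) {
  const fitting = availableCounts.filter((count) => count <= outputChannels);
  if (fitting.length > 0) {
    return Math.max(...fitting);
  }
  return Math.min(...availableCounts);
}

console.log(pickChannelCount([2, 6], 6)); // 6-channel system uses 6-channel audio
console.log(pickChannelCount([2, 6]));    // default stereo system uses 2-channel audio
console.log(pickChannelCount([6]));       // nothing else available, so 6
```

Unlike a "closest to the preference" rule, this never picks more channels than the output supports unless there is no alternative.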

Some day, if the Media Capabilities API can give us a hint about the actual output capabilities of the device, this could be used as the default configuration instead of "2".

How does that sound?

Since v2.3 is getting heavy, I'm going to put this in the backlog for now. I expect we'll start planning v2.4 soon, though, and this seems like a good candidate for v2.4.

@joeyparrish joeyparrish added this to the Backlog milestone Oct 3, 2017
@avelad
Member Author

avelad commented Oct 4, 2017

@joeyparrish, I like your refinement! There is already a v2.4.0 milestone; can you schedule it there?

@joeyparrish joeyparrish modified the milestones: Backlog, v2.4.0 Oct 4, 2017
@joeyparrish
Member

Done.

@avelad
Member Author

avelad commented Apr 9, 2018

Hi @michellezhuogg, @joeyparrish. I just tested in Safari and on Tizen (both support aac and ac-3), and the change is not working properly.

The problem appears with multi-codec content, e.g. Apple's sample at https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8 (it does not have CORS enabled, so use the browser's developer settings to disable CORS checks when testing).

In the player load function (https://github.com/google/shaka-player/blob/master/lib/player.js#L676), chooseCodecsAndFilterManifest_ (https://github.com/google/shaka-player/blob/master/lib/player.js#L773) is called before filterVariantsByConfig (https://github.com/google/shaka-player/blob/master/lib/player.js#L797), so the codec is chosen before the channel preference is applied.

Even if preferredAudioChannelCount = 6 is used, it is not possible to use the 5.1 variant with 6 channels.
Can you review it?
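
For reference, a sketch of how the setting under discussion would typically be applied before loading. The wrapper name `loadWithChannelPreference` is hypothetical, and `player` is assumed to be a `shaka.Player` instance. This only shows how the preference is set; it does not work around the ordering problem described above.

```javascript
// Hypothetical wrapper: `player` is assumed to be a shaka.Player instance.
async function loadWithChannelPreference(player, manifestUri, channelCount) {
  // Set the preference first so that it is already in place when load()
  // makes its initial selection.
  player.configure({preferredAudioChannelCount: channelCount});
  await player.load(manifestUri);
}
```

A call such as `loadWithChannelPreference(player, manifestUri, 6)` corresponds to the `preferredAudioChannelCount = 6` case described in this comment.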

@joeyparrish joeyparrish reopened this Apr 9, 2018
@joeyparrish
Member

Thanks for letting us know. We'll take another look.

@joeyparrish
Member

@avelad, the fact that these are all using different codecs is confusing this issue. MediaSource doesn't support changing codecs during playback (yet). So we have to pick one codec at the beginning and stick with it. This means regardless of how we make audio channel decisions, this particular clip will never have more than one channel configuration to choose from during playback. Once playback begins, we're locked into a codec.

I will try to produce a sample of our own that will work better for testing this feature, where all channel counts are available in the same set of audio codecs.

@joeyparrish
Member

I found a copy of the free movie "Tears of Steel" in surround sound: http://ftp.nluug.nl/pub/graphics/blender/demo/movies/ToS/

I will package and publish this.

@joeyparrish
Member

For content where codec choice does not restrict choice of channel count, there are some issues with the fix we already made:

  1. The channel preference is not correctly used for the initial content selection if the config was changed between construction and load().
  2. The AbrManager's options are not updated if the user manually chooses a track with a different number of channels. When Abr is re-enabled, the channel count reverts to its configured setting.
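
One way to picture the second issue: the set of variants handed to ABR should follow the channel count of the track that is currently active, not only the configured preference. The sketch below is an assumption about the general shape of such a filter, not the actual fix. The helper name is hypothetical, `channelsCount` is the Track field name mentioned elsewhere in this thread, and falling back to all variants when nothing matches is an illustrative choice.

```javascript
// Hypothetical helper: keep only the variants whose channel count matches
// the track the user last selected, so that re-enabling ABR does not
// revert to a different channel count.
function variantsForAbr(variants, activeChannelCount) {
  const matching = variants.filter(
      (variant) => variant.channelsCount === activeChannelCount);
  // Illustrative fallback (an assumption): if nothing matches, offer
  // everything rather than leaving ABR with no choices.
  return matching.length > 0 ? matching : variants;
}
```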

@joeyparrish joeyparrish self-assigned this Apr 9, 2018
shaka-bot pushed a commit that referenced this issue Apr 10, 2018
Issue #1013

Change-Id: I428192b300f8d7d6175ad02bae0f01eb70d8e195
shaka-bot pushed a commit that referenced this issue Apr 10, 2018
In the initial fix for #1013, we changed the name of the channelsCount
field in both the Track and Stream structures.  This would break
compatibility for applications.  So even though the new name was in
some ways preferable, we must revert the name to avoid more breaking
changes in v2.4.

Issue #1013

Change-Id: Ie8f3d211c42c8046039a3db9f0926c68ad1315d9
@joeyparrish
Member

Third issue: if tracks change due to a key status change, the current channel count and config get ignored, and AbrManager gets to choose from all tracks.

I should have fixes out for all of these issues soon.

shaka-bot pushed a commit that referenced this issue Apr 13, 2018
Fake tracks now have multiple audio channels and a more realistic
variant layout.  The tests that use these tracks have been updated to
be less brittle by not relying on specific array indices or ID values.

This is a prelude to adding new tests for the issues found in #1013.

Issue #1013

Change-Id: If091674ba7a2de29c77c81556415a855f85b18af
joeyparrish added a commit that referenced this issue Apr 13, 2018
Fake tracks now have multiple audio channels and a more realistic
variant layout.  The tests that use these tracks have been updated to
be less brittle by not relying on specific array indices or ID values.

This is a prelude to adding new tests for the issues found in #1013.

Issue #1013

Change-Id: If091674ba7a2de29c77c81556415a855f85b18af
joeyparrish added a commit that referenced this issue Apr 13, 2018
1. The channel preference was not correctly used for the initial
   content selection if the config was changed between construction
   and load().
2. The AbrManager's options were not updated if the user manually
   chose a track with a different number of channels. When Abr was
   re-enabled, the channel count reverted to its configured setting.
3. If tracks changed due to a key status change, the current channel
   count and config were ignored, and AbrManager got to choose from
   all tracks again.

Tests were added/updated for all three issues.

Closes #1013

Change-Id: Iec49361aa6e8c7193a572ad7914dfd853454791d
@avelad
Member Author

avelad commented Apr 16, 2018

@joeyparrish, is it possible to add a preferredAudioChannelCount setting to the Demo App?

@joeyparrish
Member

Sure, no problem.

shaka-bot pushed a commit that referenced this issue Apr 16, 2018
Issue #1013

Change-Id: I55cf86bd7b41d98155a4b4346277869cb4baa15c
@avelad
Member Author

avelad commented Apr 17, 2018

@joeyparrish, the configuration in the Demo app is failing:

[screenshot: error]

@avelad
Member Author

avelad commented Apr 17, 2018

I'm reviewing the latest code and testing it, and I think preferredAudioChannelCount is not working the way I expected.

I have the following MPD:

<?xml version="1.0" encoding="utf-8"?>
<MPD
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns="urn:mpeg:dash:schema:mpd:2011"
  xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd"
  type="static"
  mediaPresentationDuration="PT3M28.981333S"
  maxSegmentDuration="PT3S"
  minBufferTime="PT0S"
  profiles="urn:mpeg:dash:profile:isoff-live:2011">
  <Period
    duration="PT3M28.981333S">
    <BaseURL>dash/</BaseURL>
    <AdaptationSet
      group="1"
      contentType="audio"
      lang="en"
      segmentAlignment="true"
      audioSamplingRate="48000"
      mimeType="audio/mp4"
      codecs="ac-3">
      <AudioChannelConfiguration
        schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011"
        value="6">
      </AudioChannelConfiguration>
      <SegmentTemplate
        timescale="48000"
        initialization="60f_Encompass1080p_5-1-$RepresentationID$-init.mp4"
        media="60f_Encompass1080p_5-1-$RepresentationID$-$Time$.mp4">
        <SegmentTimeline>
          ...
        </SegmentTimeline>
      </SegmentTemplate>
      <Representation
        id="audio_eng_1=256000"
        bandwidth="256000">
      </Representation>
    </AdaptationSet>
    <AdaptationSet
      group="1"
      contentType="audio"
      lang="en"
      segmentAlignment="true"
      audioSamplingRate="24000"
      mimeType="audio/mp4"
      codecs="mp4a.40.2">
      <AudioChannelConfiguration
        schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011"
        value="2">
      </AudioChannelConfiguration>
      <SegmentTemplate
        timescale="24000"
        initialization="60f_Encompass1080p_5-1-$RepresentationID$-init.mp4"
        media="60f_Encompass1080p_5-1-$RepresentationID$-$Time$.mp4">
        <SegmentTimeline>
          ...
        </SegmentTimeline>
      </SegmentTemplate>
      <Representation
        id="audio_eng=94726"
        bandwidth="94726">
      </Representation>
    </AdaptationSet>
    <AdaptationSet
      id="3"
      group="2"
      contentType="video"
      lang="en"
      par="200:113"
      minBandwidth="111000"
      maxBandwidth="6562000"
      minWidth="400"
      maxWidth="1920"
      minHeight="226"
      maxHeight="1080"
      segmentAlignment="true"
      frameRate="60000/1001"
      mimeType="video/mp4"
      startWithSAP="1">
      <SegmentTemplate
        timescale="60000"
        initialization="60f_Encompass1080p_5-1-$RepresentationID$-init.mp4"
        media="60f_Encompass1080p_5-1-$RepresentationID$-$Time$.mp4">
        <SegmentTimeline>
          <S t="0" d="60060" r="207" />
          <S d="40040" />
        </SegmentTimeline>
      </SegmentTemplate>
      <Representation
        id="video_eng=111000"
        bandwidth="111000"
        width="400"
        height="226"
        codecs="avc1.42001F"
        scanType="progressive">
      </Representation>
      <Representation
        id="video_eng=438000"
        bandwidth="438000"
        width="400"
        height="226"
        sar="226:225"
        codecs="avc1.42001F"
        scanType="progressive">
      </Representation>
      <Representation
        id="video_eng=1211000"
        bandwidth="1211000"
        width="640"
        height="360"
        codecs="avc1.4D001F"
        scanType="progressive">
      </Representation>
      <Representation
        id="video_eng=2114000"
        bandwidth="2114000"
        width="852"
        height="480"
        codecs="avc1.4D0029"
        scanType="progressive">
      </Representation>
      <Representation
        id="video_eng=6562000"
        bandwidth="6562000"
        width="1920"
        height="1080"
        codecs="avc1.64002A"
        scanType="progressive">
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>

The stream has 2 audio tracks:

  • 6 channels with codec ac-3
  • 2 channels with codec mp4a.40.2

Tested in Safari and on a Tizen TV (2017).

If I configure preferredAudioChannelCount = 6, I expect the player to choose the 6-channel ac-3 track. However, shaka.Player.prototype.chooseCodecsAndFilterManifest_ does not take the preferred channel count into account, so the player chooses the codec with the lowest bandwidth and ignores the preferredAudioChannelCount preference.

@joeyparrish
Member

We can't choose between 6-channel AC-3 and 2-channel AAC at runtime because MediaSource does not yet support switching codecs during playback. This is something browser vendors are working on, but for now, the player must choose a single codec when we load the manifest.

But you are right that we should account for the preference when making the initial codec choice. I will reopen and investigate. Thank you for bringing this to our attention!

@joeyparrish joeyparrish reopened this Apr 17, 2018
@avelad
Member Author

avelad commented Apr 17, 2018

That's OK for me. If my preference is 6 channels, I don't mind if the user cannot choose 2 channels.

@joeyparrish
Member

Understood. Today, the user can choose between 2 and 6 channels if the codec is the same. If the codec differs, we should take the channel preference into account when choosing codecs. In the future, when MediaSource supports codec switching, the user will be able to choose between 2 and 6 channels during playback even if the codec differs.
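
A rough sketch of what "taking the channel preference into account when choosing codecs" could look like: settle on a target channel count first, then compare bandwidth only among codecs that offer it. This is an assumption-laden illustration rather than the player's real logic. The function name and the stream object shape (`codecs`, `channelsCount`, `bandwidth`) are hypothetical, and comparing average bandwidth per codec is an illustrative choice.

```javascript
// Hypothetical sketch: choose one audio codec up front, since the codec
// cannot be switched during playback. Assumes `streams` is non-empty.
function chooseAudioCodec(streams, preferredChannels = 2) {
  // Step 1: largest channel count not exceeding the preference,
  // otherwise the smallest available.
  const counts = streams.map((s) => s.channelsCount);
  const fitting = counts.filter((c) => c <= preferredChannels);
  const target = fitting.length > 0 ? Math.max(...fitting) : Math.min(...counts);

  // Step 2: collect bandwidths per codec, only for streams at the target count.
  const bandwidthsByCodec = new Map();
  for (const s of streams) {
    if (s.channelsCount !== target) continue;
    const list = bandwidthsByCodec.get(s.codecs) || [];
    list.push(s.bandwidth);
    bandwidthsByCodec.set(s.codecs, list);
  }

  // Step 3: among those codecs, pick the lowest average bandwidth.
  let bestCodec = null;
  let bestAverage = Infinity;
  for (const [codec, bandwidths] of bandwidthsByCodec) {
    const average = bandwidths.reduce((a, b) => a + b, 0) / bandwidths.length;
    if (average < bestAverage) {
      bestCodec = codec;
      bestAverage = average;
    }
  }
  return bestCodec;
}

// The two audio AdaptationSets from the MPD earlier in this thread.
const mpdAudio = [
  {codecs: 'ac-3', channelsCount: 6, bandwidth: 256000},
  {codecs: 'mp4a.40.2', channelsCount: 2, bandwidth: 94726},
];
console.log(chooseAudioCodec(mpdAudio, 6));
console.log(chooseAudioCodec(mpdAudio, 2));
```

Choosing purely by bandwidth would always pick mp4a.40.2 for this MPD. Fixing the channel count first is what lets a preference of 6 reach the ac-3 track.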

joeyparrish added a commit that referenced this issue Apr 17, 2018
Issue #1013

Change-Id: I55cf86bd7b41d98155a4b4346277869cb4baa15c
@avelad
Member Author

avelad commented Apr 20, 2018

@joeyparrish, I see the same error as in my previous comment #1013 (comment)

Selection of ac-3 in some cases (aac 2.0 & ac-3 5.1) is working perfectly

shaka-bot pushed a commit that referenced this issue Apr 20, 2018
This also disallows valueAsNumber, which is unsupported on IE11 and
Edge.

Issue #1013

Change-Id: I82868b31e0add4f1cac80ea9ddeaccad6f9ffdb7
@joeyparrish
Member

@avelad, we believe we found the error you are referring to in the demo app and fixed it. Please let us know if this is working for you.
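
The referenced commit notes that `valueAsNumber` is unsupported on IE11 and Edge. A minimal sketch of reading a numeric input without it, assuming an element-like object with a string `value` property. The helper name is hypothetical and this is not the demo app's actual code.

```javascript
// Hypothetical helper: parse a number from an input's string `value`
// instead of relying on `valueAsNumber`.
// Returns null for empty or non-numeric input.
function readNumberInput(input) {
  const text = String(input.value).trim();
  if (text === '') {
    return null;
  }
  const parsed = Number(text);
  return Number.isNaN(parsed) ? null : parsed;
}
```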

@avelad
Member Author

avelad commented Apr 23, 2018

Now it works! Thanks.

@joeyparrish
Member

Glad to hear it! Thank you for being patient with us while we got this right.

@avelad
Member Author

avelad commented Apr 23, 2018

Do you know when version 2.4 will come out?

@joeyparrish
Member

If anybody knew, it would be me... :-)

We're working to wrap up v2.4, but we are blocked on #1248, to restore backward compatibility for our offline storage system. We hope to be done with that by the end of the month.

joeyparrish added a commit that referenced this issue Apr 23, 2018
Previously, we would choose codecs based on bandwidth alone.  This
would lead to 2-channel audio being preferred over 6-channel audio
when the two used different codecs.  (Content with different channel
counts using the same codec are not affected.)

Now, we consider the channel preference before choosing a codec.  This
fixes app selection of 6-channel codecs over 2-channel codecs when
they differ.

Closes #1013

Backported to v2.4.x

Change-Id: Iee6058b2df04b8b8036f59909fd82df26b1173ae
joeyparrish pushed a commit that referenced this issue Apr 23, 2018
This also disallows valueAsNumber, which is unsupported on IE11 and
Edge.

Issue #1013

Backported to v2.4.x

Change-Id: I82868b31e0add4f1cac80ea9ddeaccad6f9ffdb7
@joeyparrish
Member

This fix will be out in v2.4 this week. Thanks for your patience!

@shaka-project shaka-project locked and limited conversation to collaborators Jun 18, 2018
@shaka-bot shaka-bot added the status: archived Archived and locked; will not be updated label Apr 15, 2021
5 participants