avcodec: add remove_dovi and remove_hdr10plus option to hevc,av1_metadata bsf #480
Are these changes still relevant when we bump to FFmpeg 7.1+? The newly added Regarding dropping HDR10+ from HEVC, does using If we are not in a rush to use this in the upcoming JF 10.10, we can wait for FFmpeg 7.1.x. Otherwise we will have to keep adding bsf option detection code to take advantage of these two custom options.
That one needs slightly more tweaking because it only removes the RPU and keeps the EL, which may still cause compatibility issues. If we are going to upgrade to that version soon, we can modify that one instead.
This could be too aggressive as
We have to do that anyway because the upstream filter does not fit our needs. Even if we upstreamed our changes, we would still need the checks, because a release containing our bsf code is unlikely to land before 10.11. The more problematic part of using these is how we handle range, on both the probe side and the playback side. HDR10+ metadata is unavailable until you actually decode a frame to get the side data, but ffprobe has no clean way to stop at the desired location. Our current data structure also cannot handle the case where Dolby Vision and HDR10+ metadata coexist in the same file. The clients cannot accurately report their actual capabilities either: for example, whether a client can correctly handle Dolby Vision profile 7 fallback, whether it allows out-of-spec profile 8.6 playback, whether it can handle DoVi and HDR10+ coexisting, and so on. So we do have a lot more work to do before we can use these bsf filters.
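To illustrate the coexistence problem, here is a minimal Python sketch of a flag-style representation in which Dolby Vision and HDR10+ can both be set for one file, and a client's supported set can be compared against it. The names `DynamicHdr` and `metadata_to_strip` are hypothetical and are not taken from the Jellyfin code base; this only shows the shape of the idea, not how the server models video range.

```python
from enum import Flag, auto


class DynamicHdr(Flag):
    """Hypothetical bit flags for dynamic HDR metadata found in a stream."""
    NONE = 0
    DOLBY_VISION = auto()
    HDR10_PLUS = auto()


def metadata_to_strip(present: DynamicHdr, client_supports: DynamicHdr) -> DynamicHdr:
    """Return the metadata kinds that are in the file but not supported by the client."""
    return present & ~client_supports


# A file carrying both kinds, played on a client that only reports HDR10+ support.
in_file = DynamicHdr.DOLBY_VISION | DynamicHdr.HDR10_PLUS
to_strip = metadata_to_strip(in_file, DynamicHdr.HDR10_PLUS)
```

A single-valued range field cannot express the `in_file` value above, which is the limitation described in the comment.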
https://ffmpeg.org//pipermail/ffmpeg-devel/2024-October/334998.html I submitted a patch upstream that also removes the EL without changing the CLI interface. Hopefully it gets accepted so that we have one less bsf to maintain.
Found a method to extract the HDR10+ metadata info from the very first packet/frame.
Result:
{
"frames": [
{
"side_data_list": [
{
"side_data_type": "Mastering display metadata",
"red_x": "34000/50000",
"red_y": "16000/50000",
"green_x": "13250/50000",
"green_y": "34500/50000",
"blue_x": "7500/50000",
"blue_y": "3000/50000",
"white_point_x": "15635/50000",
"white_point_y": "16450/50000",
"min_luminance": "1/10000",
"max_luminance": "10000000/10000"
},
{
"side_data_type": "Content light level metadata",
"max_content": 1000,
"max_average": 168
},
{
"side_data_type": "HDR Dynamic Metadata SMPTE2094-40 (HDR10+)",
"application version": 1,
"num_windows": 1,
"targeted_system_display_maximum_luminance": "400/1",
"maxscl": "0/100000",
"maxscl": "0/100000",
"maxscl": "0/100000",
"average_maxrgb": "0/100000",
"num_distribution_maxrgb_percentiles": 9,
"distribution_maxrgb_percentage": 1,
"distribution_maxrgb_percentile": "0/100000",
"distribution_maxrgb_percentage": 5,
"distribution_maxrgb_percentile": "0/100000",
"distribution_maxrgb_percentage": 10,
"distribution_maxrgb_percentile": "100/100000",
"distribution_maxrgb_percentage": 25,
"distribution_maxrgb_percentile": "0/100000",
"distribution_maxrgb_percentage": 50,
"distribution_maxrgb_percentile": "0/100000",
"distribution_maxrgb_percentage": 75,
"distribution_maxrgb_percentile": "0/100000",
"distribution_maxrgb_percentage": 90,
"distribution_maxrgb_percentile": "0/100000",
"distribution_maxrgb_percentage": 95,
"distribution_maxrgb_percentile": "0/100000",
"distribution_maxrgb_percentage": 99,
"distribution_maxrgb_percentile": "0/100000",
"fraction_bright_pixels": "0/1000",
"knee_point_x": "0/4095",
"knee_point_y": "0/4095",
"num_bezier_curve_anchors": 9,
"bezier_curve_anchors": "102/1023",
"bezier_curve_anchors": "205/1023",
"bezier_curve_anchors": "307/1023",
"bezier_curve_anchors": "410/1023",
"bezier_curve_anchors": "512/1023",
"bezier_curve_anchors": "614/1023",
"bezier_curve_anchors": "717/1023",
"bezier_curve_anchors": "819/1023",
"bezier_curve_anchors": "922/1023"
},
{
"side_data_type": "Dolby Vision RPU Data"
},
{
"side_data_type": "Dolby Vision Metadata",
"rpu_type": 2,
"rpu_format": 18,
"vdr_rpu_profile": 1,
"vdr_rpu_level": 0,
"chroma_resampling_explicit_filter_flag": 0,
"coef_data_type": 0,
"coef_log2_denom": 23,
"vdr_rpu_normalized_idc": 1,
"bl_video_full_range_flag": 0,
"bl_bit_depth": 10,
"el_bit_depth": 10,
"vdr_bit_depth": 12,
"spatial_resampling_filter_flag": 0,
"el_spatial_resampling_filter_flag": 1,
"disable_residual_flag": 0,
"vdr_rpu_id": 0,
"mapping_color_space": 0,
"mapping_chroma_format_idc": 0,
"nlq_method_idc": 0,
"nlq_method_idc_name": "linear_dz",
"num_x_partitions": 2047,
"num_y_partitions": 1,
"components": [
{
"pivots": "0 128 256 384 512 640 768 896 1023",
"pieces": [
{
"mapping_idc": 0,
"mapping_idc_name": "polynomial",
"poly_order": 1,
"poly_coef": "4948 8388608"
},
{
"mapping_idc": 0,
"mapping_idc_name": "polynomial",
"poly_order": 1,
"poly_coef": "9896 8349022"
},
{
"mapping_idc": 0,
"mapping_idc_name": "polynomial",
"poly_order": 1,
"poly_coef": "0 8388608"
},
{
"mapping_idc": 0,
"mapping_idc_name": "polynomial",
"poly_order": 1,
"poly_coef": "0 8388608"
},
{
"mapping_idc": 0,
"mapping_idc_name": "polynomial",
"poly_order": 1,
"poly_coef": "0 8388608"
},
{
"mapping_idc": 0,
"mapping_idc_name": "polynomial",
"poly_order": 1,
"poly_coef": "0 8388608"
},
{
"mapping_idc": 0,
"mapping_idc_name": "polynomial",
"poly_order": 1,
"poly_coef": "0 8388608"
},
{
"mapping_idc": 0,
"mapping_idc_name": "polynomial",
"poly_order": 1,
"poly_coef": "0 8388608"
}
],
"nlq_offset": 512,
"vdr_in_max": 1048576,
"linear_deadzone_slope": 2048,
"linear_deadzone_threshold": 0
},
{
"pivots": "0 1023",
"pieces": [
{
"mapping_idc": 1,
"mapping_idc_name": "mmr",
"mmr_order": 3,
"mmr_constant": 0,
"mmr_coef": "0 6391320 0 0 0 0 0 0 3195660 0 0 0 0 0 0 1597830 0 0 0 0 0"
}
],
"nlq_offset": 512,
"vdr_in_max": 1048576,
"linear_deadzone_slope": 2048,
"linear_deadzone_threshold": 0
},
{
"pivots": "0 1023",
"pieces": [
{
"mapping_idc": 1,
"mapping_idc_name": "mmr",
"mmr_order": 3,
"mmr_constant": 418,
"mmr_coef": "53 427 6391320 26 26 213 13 3 213 3195660 0 0 53 0 0 106 1597830 -210 -210 13 0"
}
],
"nlq_offset": 512,
"vdr_in_max": 1048576,
"linear_deadzone_slope": 2048,
"linear_deadzone_threshold": 0
}
],
"dm_metadata_id": 0,
"scene_refresh_flag": 1,
"ycc_to_rgb_matrix": "9574/8192 0/8192 13802/8192 9574/8192 -1540/8192 -5348/8192 9574/8192 17610/8192 0/8192",
"ycc_to_rgb_offset": "16777216/268435456 134217728/268435456 134217728/268435456",
"rgb_to_lms_matrix": "7222/16384 8771/16384 390/16384 2654/16384 12430/16384 1300/16384 0/16384 422/16384 15962/16384",
"signal_eotf": 65535,
"signal_eotf_param0": 0,
"signal_eotf_param1": 0,
"signal_eotf_param2": 0,
"signal_bit_depth": 12,
"signal_color_space": 0,
"signal_chroma_format": 0,
"signal_full_range_flag": 1,
"source_min_pq": 7,
"source_max_pq": 3079,
"source_diagonal": 42
}
]
}
]
}
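As a rough sketch of how a result like the one above could be consumed, the Python below checks the side data of the first frame for the HDR10+ and Dolby Vision entries. The `side_data_type` strings are copied from the output above; the function name `detect_dynamic_metadata` and the returned dictionary shape are assumptions made for illustration. Note that the ffprobe JSON repeats keys such as `maxscl`, and `json.loads` keeps only the last value for a repeated key, so this approach is suitable for detecting presence, not for reading every value.

```python
import json

# side_data_type strings as they appear in the ffprobe output above.
HDR10_PLUS_TYPE = "HDR Dynamic Metadata SMPTE2094-40 (HDR10+)"
DOVI_TYPES = {"Dolby Vision RPU Data", "Dolby Vision Metadata"}


def detect_dynamic_metadata(ffprobe_json: str) -> dict:
    """Report which dynamic HDR metadata kinds appear on the first probed frame."""
    data = json.loads(ffprobe_json)
    frames = data.get("frames", [])
    found = set()
    if frames:
        for side_data in frames[0].get("side_data_list", []):
            found.add(side_data.get("side_data_type"))
    return {
        "hdr10plus": HDR10_PLUS_TYPE in found,
        "dovi": bool(found & DOVI_TYPES),
    }
```

Called on the result above, both entries would be true, which is exactly the coexisting case discussed earlier in the thread.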
This is still not ideal because now we need to run ffprobe one more time to probe for audio and extra data. I thought about something similar as well, but I don't like the idea of running ffprobe multiple times.
You don't need to run it again,
with
You may be right, but I don't have such a file (one with an audio packet muxed before the first video packet, so that reading only one packet yields no valid information) to verify.
For DoVi we can use the upstream I found an official sample of AV1/HDR10+ from AOMediaCodec, maybe it's worth adding something similar to
Both
This adds two options, remove_dovi and remove_hdr10plus, to the hevc_metadata bitstream filter. These options are useful in cases where certain types of dynamic metadata trigger player bugs, such as on some smart TVs and set-top boxes. With these options, clients can have the video remuxed with the selected dynamic metadata removed.
For example:
One thing to note is that the current FFmpeg Matroska implementation has a bug: it always attempts to carry over Dolby Vision metadata when the input has it, and if that metadata has been removed by this bsf, it complains and exits. Fixing the MKV handling is out of scope for this PR. The workaround is to use MP4 as the target container when stripping DoVi metadata.
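A minimal Python sketch of how a caller might assemble such a remux command, including the MP4 workaround for DoVi removal. The option names remove_dovi and remove_hdr10plus come from this PR; passing them as `=1` is an assumption that they are boolean options, and the helper name `build_remux_command` and the file names are hypothetical. The function only builds the argument list and does not run ffmpeg.

```python
def build_remux_command(src, dst, remove_dovi=False, remove_hdr10plus=False):
    """Build an ffmpeg argument list that stream-copies src to dst,
    optionally stripping dynamic metadata via the hevc_metadata bsf."""
    if remove_dovi and dst.lower().endswith(".mkv"):
        # Workaround from the PR description: avoid Matroska when stripping DoVi.
        raise ValueError("use an MP4 target when removing Dolby Vision metadata")

    bsf_opts = []
    if remove_dovi:
        bsf_opts.append("remove_dovi=1")       # assumed boolean option syntax
    if remove_hdr10plus:
        bsf_opts.append("remove_hdr10plus=1")  # assumed boolean option syntax

    cmd = ["ffmpeg", "-i", src, "-c", "copy"]
    if bsf_opts:
        # Bitstream filter options are joined with ':' after the filter name.
        cmd += ["-bsf:v", "hevc_metadata=" + ":".join(bsf_opts)]
    cmd.append(dst)
    return cmd


cmd = build_remux_command("input.mkv", "output.mp4", remove_dovi=True, remove_hdr10plus=True)
```

Rejecting a Matroska target up front is one way to surface the limitation before ffmpeg exits with an error; a caller could equally switch the container automatically.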
Upstream has overhauled the DoVi RPU-related code, so this will no longer be useful once we upgrade to that version. A backport might be too heavy, as that is a big refactor.