[FORMAT] Parse ITU-T T.35 data & SCTE-128 DTVCC Captions in H.264 #533
Comments
Hey, thanks for the very detailed and well-formatted issue!
Thanks for all the praise! Yes, very on topic: fq was and is designed and used to do things just like this :)
Yes, that is the main idea: try to only decorate and present without much interpretation. There are some formats that do require interpretation to be more useful, like demuxing or traversing sample tables. Not sure if it makes sense in this case, but some formats in fq have additional fq functions to convert the detailed decode structure into something closer to the intended form. Another way is to provide various useful jq snippets in the documentation. I've created an initial PR at #534 so it's easier to discuss and collaborate. Lots in the PR is ugly and probably uses the wrong naming and terminology; I will need your help to sort that out. I think we can start off by focusing on the things you want for now and then see how it goes. It would be great to get some dump of the SEI data. If the data is not sensitive, I think something like this should write the payload to a file:
Alternatively or additionally, do you know if some tools like ffmpeg can produce mp4 with this kind of metadata? It's very valuable to have a somewhat diverse set of sample blobs of a format to write a good decoder. What platform are you on, and are you able to build your own version of fq? You should be able to build the PR branch by doing something like:
With that and some luck you should be able to decode some of it. This is how far I get with your dump from above; does it look sane? I get that it fails because it's truncated.
Closing, since it is covered under a separate branch & PR.
Firstly, can I state that fq absolutely rocks for the analysis of media files, and is the best parser for H.264 headers.
There are a very limited number of open source tools that go to the effort of allowing H.264 SEI headers to be easily decoded. On behalf of your user-base, many thanks for all your work!
In fact, I like fq so much, I would like to ask for the AVC & SEI parser to be enhanced so that it can decode DTVCC / SCTE-128 closed caption headers in order to index the tracks contained within. My interest is specifically H.264 files, but a generic parser could also service other /formats/mpeg/* such as H.262 picture user data and H.265 SEI data.
I appreciate that fq started with a heavy focus on media files, and thus I hope this request will be interpreted as being on topic.
At present, fq successfully decodes as far as detecting "user_data_registered_itu_t_t35" in the SEI. This is cool, and beyond that of some basic parsers.
SCTE 128 Closed Captioning
Closed captioning is important so that audiences with accessibility requirements can enjoy and consume content with the same access as those without accessibility needs. Video engineers need to be able to debug closed captioning in order to serve those users.
There are many standards for Closed Captioning, some of which define how to transmit the data in an MPEG stream (MPEG2 SCTE-20, ATSC A/53, H.264 SCTE-128), some of which define the encoding and decoding of the characters in the payload (EIA-608/708). The standards for ATSC A/53 closed captioning in SEI side data for H.264 share commonality with H.262/mpeg2 closed captioning in picture user data. Higher-level data-structures such as EIA-608/708 are common to all of H.262/264/265 use cases.
My primary interest is DTVCC SCTE-128 Closed Captioning in H.264, since AVC is the most widely used video codec. While SCTE-128 is a North America-centric standard, it is now commonly used worldwide in HLS & DASH protocols for the carriage of EIA-608/708. SCTE-128 DTVCC (with EIA-608) is the most common method of delivering closed captions to end-user players and devices in HLS and DASH streams.
Everything in this format request refers to freely available documentation.
Current support in fq
Here is the fq command that I use to inspect a sample file containing DTVCC closed captioning.
At present, fq is capable of identifying that the SEI side data contains a T.35 Country Code.
I would like to use fq to dig deeper into the ITU-T T.35 header, specifically for the purpose of inspecting SCTE-128 DTVCC closed captions, which in media terms is a very common type of closed captioning. There are very few closed caption tools that provide raw debug analysis: most tools (such as ccextractor or caption inspector) go the extra step and perform a conversion, which can often change the measured result. I like fq because it attempts to decode rather than interpret or convert data.
The closed caption rawdata starts as user_data_registered_itu_t_t35, which fq already knows about. Within the user_data_registered_itu_t_t35 there is rawdata....
SCTE-128 DTVCC Format (Summary)
From: https://www.scte.org/documents/373/ANSI_SCTE-128-1-2020-1586877225672.pdf
itu_t_35_country_code
The ITU publishes a full lookup table of the country codes used within user_data_registered_itu_t_t35. Here is a link with all the current T.35 country codes. The standard is freely and publicly available and is maintained by the ITU.
If a kind developer were to add an ITU T.35 Country Code Parser, I have extracted the values from the ITU documentation and formatted in what would be a useful way, in the hope of making a developer's task easier.
t35_country_codes.txt
The T.35 country code for DTVCC closed captioning is always B5 (since it is a standard derived from ATSC in the United States). A more comprehensive, generic T.35 parser in fq could include all country codes from the above pre-formatted file.
Where itu_t_35_country_code is of value {0xb5: "United States"}
itu_t_35_provider_type
Each country's standards body gets to define and scope its provider_type. DTVCC closed captioning is a US standard, although DTVCC 608/708 closed captions are used around the world in protocols such as HLS.
The provider_type for SCTE-128 Closed Captions is always 0x0031.
Where itu_t_35_provider_type is {0x0031: SCTE-128 DTVCC}
scte_128_user_identifier
In SCTE-128, the following user identifiers are defined:
From a selfish perspective, I am interested in the ATSC data, however the same SCTE-128 document does also cover Active Format Description signalling which may prove valuable to others (in case it also piques your interest or was on your to-do-sometime-in-the-future list).
Where scte_128_user_identifier is {0x47413934: "ATSC data (GA94)"}
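As a rough illustration of the three fields described so far, here is a minimal Python sketch that splits the start of a user_data_registered_itu_t_t35 payload. It assumes the payload begins with a 1-byte country code, followed by a 2-byte big-endian provider type and a 4-byte user identifier; the function name `parse_t35_prefix` and the table names are hypothetical, and the identifier is derived from the ASCII bytes "GA94" rather than typed by hand. This is not how fq implements it.

```python
import struct

# Lookup tables limited to the values named in this request.
T35_COUNTRY_CODES = {0xB5: "United States"}
T35_PROVIDER_TYPES = {0x0031: "SCTE-128 DTVCC"}
# "GA94" read as a big-endian 32-bit integer (0x47413934).
SCTE_128_USER_IDENTIFIERS = {int.from_bytes(b"GA94", "big"): "ATSC data (GA94)"}


def parse_t35_prefix(payload: bytes) -> dict:
    """Split the first 7 bytes of the payload into the three header fields."""
    if len(payload) < 7:
        raise ValueError("need at least 7 bytes for country, provider and identifier")
    # ">BHI" = big-endian: 1-byte country, 2-byte provider, 4-byte identifier.
    country, provider, identifier = struct.unpack(">BHI", payload[:7])
    return {
        "itu_t_35_country_code": (country, T35_COUNTRY_CODES.get(country, "unknown")),
        "itu_t_35_provider_type": (provider, T35_PROVIDER_TYPES.get(provider, "unknown")),
        "scte_128_user_identifier": (
            identifier,
            SCTE_128_USER_IDENTIFIERS.get(identifier, "unknown"),
        ),
        # Everything after the identifier is left undecoded.
        "rest": payload[7:],
    }


if __name__ == "__main__":
    # Hand-built example bytes, not taken from a real stream.
    example = bytes.fromhex("b5 0031 47413934 03")
    for name, value in parse_t35_prefix(example).items():
        print(name, value)
```

Unknown values are reported as "unknown" rather than rejected, in keeping with the idea of decorating the data without interpreting it.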
ATSC1_data ("GA94")
Within the ATSC1_data header, there are registered codes. The one which is interesting from a Closed Caption perspective is 0x03 cc_data.
This information is also published and maintained by ATSC in the last tab of this freely accessible, public Code Point Registry spreadsheet...
Where atsc1_data is {0x03: "ccdata"}.
ccdata() = EIA-708
The ccdata() is defined in EIA-708, and while I would love to extend this format request to include EIA-708 headers, enabling fq to list headers that relate to the captions tracks (608 CC1/2/3/4) and (708 SERVICE1-6), I appreciate that this post is already long enough. In simple terms the ccdata() payload contains headers which list the CC1/2/3/4 tracks header and payload (aka 608 compatibility mode), followed by the headers and payload for full 708 data. The EIA-708 spec has traditionally been pay-to-play and not available for the general public, but has now been made available free-of-charge and "Available to Everyone" from the CTA (registration required).
https://shop.cta.tech/products/digital-television-dtv-closed-captioning
The EIA-708 standard defines the methods to decode ccdata() headers.
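To give a feel for what listing the ccdata() headers could involve, here is a hedged Python sketch. The bit layout is an assumption based on my reading of the cc_data() structure (a flags byte whose low 5 bits are cc_count, one reserved byte, then cc_count 3-byte constructs each carrying cc_valid, a 2-bit cc_type and two data bytes) and should be checked against the published standard. The function name `parse_cc_data` is hypothetical. It only labels each construct as 608 or 708 data; telling CC1 from CC2, or one 708 service from another, needs further decoding of the data bytes, which is not attempted here.

```python
# cc_type values as assumed from the cc_data() definition.
CC_TYPES = {
    0: "608 field 1",
    1: "608 field 2",
    2: "708 DTVCC packet data",
    3: "708 DTVCC packet start",
}


def parse_cc_data(data: bytes) -> list:
    """Return one dict per 3-byte caption construct, without decoding any text."""
    if len(data) < 2:
        raise ValueError("need at least the flags byte and the reserved byte")
    cc_count = data[0] & 0x1F  # low 5 bits of the first byte
    end = 2 + 3 * cc_count     # data[1] is skipped as reserved
    if len(data) < end:
        raise ValueError("cc_data() is truncated")
    constructs = []
    for offset in range(2, end, 3):
        header, byte1, byte2 = data[offset:offset + 3]
        cc_type = header & 0x03
        constructs.append({
            "cc_valid": bool(header & 0x04),
            "cc_type": cc_type,
            "cc_type_name": CC_TYPES[cc_type],
            "cc_data": bytes([byte1, byte2]),  # left as raw bytes
        })
    return constructs


if __name__ == "__main__":
    # Hand-built example: cc_count = 3, then three constructs and a marker byte.
    example = bytes.fromhex("43ff" "fc9420" "fd8080" "fa0000" "ff")
    for construct in parse_cc_data(example):
        print(construct)
```

Keeping the two data bytes raw matches the scope of this request: the constructs are indexed, but the caption text is not decoded.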
Please do not interpret this as a request for the decoding of closed captioning to human readable text. I consider fq to be a header inspector and thus indexing and listing the available closed captioning tracks in the SEI would be of great benefit to me and other video technicians, but fq does not attempt to be a video decoder, so it need not attempt to decode the actual encoded text in the payload.
While the above list may be long, even if fq were to implement only the lower-level headers without ccdata(), it would be very beneficial.
I am not a developer, and while I can just about understand what https://github.com/wader/fq/blob/master/format/mpeg/avc_annexb.go and https://github.com/wader/fq/blob/master/format/mpeg/avc_sei.go are doing, I am not capable of writing a format interpreter in go; that is beyond my skillset. But I hope that you will appreciate that I have tried to present data such as t35_country_codes.txt in a data structure which would be compatible with the current parsers.
I hope you find the above easy to read. I have put a lot of care and attention into making the GitHub markup clear and I have referenced any standards in the format request.
I do have a repository of some (copyrighted) media streams for the above from US broadcasters, which I could share privately in the interests of legitimate research. However, I have not attempted to attach these to your GitHub repo, so as to avoid you being flagged.
I am including my version of fq out of politeness since that is requested on all new tickets, although it is unlikely to be relevant for a format request.
Even if this format request is not of interest, thanks for developing such a great tool and taking the time to read this far!