-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Allow VTT files with erroneous linebreaks #2394
Conversation
The VTT spec does not allow more than one linebreak within the same cue. It's desirable to still allow most of the VTT file to work even if the file is technically out of spec, however. This commit allows such files by removing a thrown exception in the case of bad time codes, instead returning null, skipping the erroneous cue. The text_parser also needed to be patched, as it previously threw a perhaps unexpected exception when data_.length is accessed on a null or undefined data object.
test/text/vtt_text_parser_unit.js
Outdated
], | ||
'WEBVTT\n\n' + | ||
'00:00:20.000 --> 00:00:40.000\n' + | ||
'\nTest\n\nExtra line\n\n' + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you really want this text ignored instead of represented in the output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd love to display it, but after a lot of noodling I can't see how to do so without using cowboy hijinks counter to other implementations I've seen. For instance, dash.js gives:
[9751][VTTParser] Skipping cue due to empty/malformed cue text
The current parser has good adherence to the spec in the implemented areas. I'd like to change that as little as possible while also giving a best effort towards displaying the good parts of a VTT file.
Extra newlines before the cue body and within the cue body are both VTT errors which can be ignored.
Previously an exception was thrown here, which caused the entire VTT stream to fail parsing. Some notification of failure is still warranted, so we issue a warning. This is in line with the docstring for warnings: "[If] we work around unusual or bad content".
@bloomtom can you rebase the PR? Thanks! |
Closing due to inactivity. If you need to reopen this issue, just put @shaka-bot reopen in a comment. Thanks! |
Going to work on this again. |
I'm reopening this PR as requested, but I am still not convinced of the value of it. I don't think errors should be ignored or malformed inputs casually skipped. But if you still want to make a case for it, I'm listening. |
My case is that a single error within a VTT file shouldn't prevent the user from seeing any of the caption data. It doesn't help anyone to do so when you can indicate a partial failure with a warning and move on. If you try to play bad VTT on the current day demo page, you still get the lovely "this.data_ is undefined" error, which is the worst of both worlds. |
bc30ecd
to
051ee99
Compare
051ee99
to
9811b42
Compare
Incremental code coverage: 100.00% |
@joeyparrish can you review it? Thanks! |
lib/text/vtt_text_parser.js
Outdated
shaka.util.Error.Category.TEXT, | ||
shaka.util.Error.Code.INVALID_TEXT_CUE, | ||
'Could not parse cue time range in WebVTT'); | ||
shaka.log.warning( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make this shaka.log.alwaysWarn
, so the warning is kept in production builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
Bad linebreaks will now cause cues to be skipped (with a warning), rather than throwing an error.
Closes #2358