-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SubStation Alpha (SSA) support #3060
Conversation
Before merging this, it is necessary to previously accept #3036 so that I can add a part that is the same for the conversion to WebVTT in src=. |
Merge ready, it is now possible to review this PR. |
@joeyparrish , do you think you could find a time this week to review this? |
/** @type {!Array.<!shaka.extern.Cue>} */ | ||
const cues = []; | ||
const parts = str.split(/\r?\n\s*\r?\n/); | ||
for (const part of parts) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be easier to understand if you split the raw parsing and the name handling into separate parts. For example, have a parse method that produces a nested object map:
{
"Script Info": {
"Title": "Foo",
"Original Script": ""
},
"V4+ Styles": {
"Format": "..."
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to think of a way to make it easier, and hope to come back tomorrow with a different implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was suggesting a more generic parser that didn't check for tag names. For example, something like:
const ret = {};
let section = null;
for (const line of lines) {
const match = /\[([^\]]+)\]/.exec(line);
if (match) {
section = match[1];
ret[section] = {};
} else if (/\s*;/.test(line)) {
// Skip comments.
} else {
const parts = line.split(':');
ret[section][parts[0].trim()] = parts.slice(1).join(':').trim();
// You'll need to special-case Dialogue to support multiple cues.
}
}
I would also suggest putting this in another function to convert the data buffer to the object. Then you can just use data['V4 Styles']
or data['Events']
later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that it is a popular format, what is the advantage of doing what you propose? At this time, what is supported are known tags and those that we add to sections. Adding unknown sections I don't think has any advantage ... Sorry if I didn't understand you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your code "adds" unknown sections too by ignoring them in the if statements below. I was suggesting we separate the text parsing from the handling of the sections. Like you would split XML parsing from the reading of the elements. By splitting them up, you can focus on reading the tag names without seeing all the regex and string parsing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I still don't see it :(. Right now with the last changes the events and styles are already separated, and also the parsing of them is made independent. Adding a generic parse here does not add much since it is a known format.In your example above, script information is added to the object that is not really useful.
On the other hand, even if I do a generic parse, I still have to know the tags.
The execution time of my proposal and yours should be similar.
@TheModMaker I have applied all your comments. Can you review again? |
Ready for review! |
@@ -33,16 +33,16 @@ shaka.text.LrcTextParser = class { | |||
* @export | |||
*/ | |||
parseMedia(data, time) { | |||
return shaka.text.LrcTextParser.getCues_(data, time.segmentEnd); | |||
return shaka.text.LrcTextParser.getCues(data, time.segmentEnd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this just forwards, I suggest just using parseMedia
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot do this because getCues is static and is necessary in the player.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of my generic suggestion was to avoid checking if (mimeType == 'foo')
and instead use the parser interface directly. Doing something like this:
const factory = TextEngine.findParser(mimeType);
if (factory) {
const obj = factory();
cues = obj.parseMedia(data);
}
But you don't have to do that now. You could do it in another PR or I could do it instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to address it in another PR, I keep it on my radar. Thanks!
/** @type {!Array.<!shaka.extern.Cue>} */ | ||
const cues = []; | ||
const parts = str.split(/\r?\n\s*\r?\n/); | ||
for (const part of parts) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was suggesting a more generic parser that didn't check for tag names. For example, something like:
const ret = {};
let section = null;
for (const line of lines) {
const match = /\[([^\]]+)\]/.exec(line);
if (match) {
section = match[1];
ret[section] = {};
} else if (/\s*;/.test(line)) {
// Skip comments.
} else {
const parts = line.split(':');
ret[section][parts[0].trim()] = parts.slice(1).join(':').trim();
// You'll need to special-case Dialogue to support multiple cues.
}
}
I would also suggest putting this in another function to convert the data buffer to the object. Then you can just use data['V4 Styles']
or data['Events']
later.
/** @type {!Array.<!shaka.extern.Cue>} */ | ||
const cues = []; | ||
const parts = str.split(/\r?\n\s*\r?\n/); | ||
for (const part of parts) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your code "adds" unknown sections too by ignoring them in the if statements below. I was suggesting we separate the text parsing from the handling of the sections. Like you would split XML parsing from the reading of the elements. By splitting them up, you can focus on reading the tag names without seeing all the regex and string parsing.
@@ -33,16 +33,16 @@ shaka.text.LrcTextParser = class { | |||
* @export | |||
*/ | |||
parseMedia(data, time) { | |||
return shaka.text.LrcTextParser.getCues_(data, time.segmentEnd); | |||
return shaka.text.LrcTextParser.getCues(data, time.segmentEnd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of my generic suggestion was to avoid checking if (mimeType == 'foo')
and instead use the parser interface directly. Doing something like this:
const factory = TextEngine.findParser(mimeType);
if (factory) {
const obj = factory();
cues = obj.parseMedia(data);
}
But you don't have to do that now. You could do it in another PR or I could do it instead.
Test Failure:
|
Fixed lint error. Sorry.... |
All tests passed! |
1. Added hdr as a property in stream when constructing. Fixes build failure from commit 7137286 . PR #3116 Issue #2813 2. Fixed the test error from commit d3640d1 . PR #3044 Issue #3029 3. Fixed the new line with no other arguments from commit 0845843 . PR #3060 Change-Id: I5833e49c1a95172742c4ec820960c9c5a7bf0cca
Add SubStation Alpha (SSA) support
Format info: https://en.wikipedia.org/wiki/SubStation_Alpha
Note: SSA and ASS are supported