-
-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable media playback in JQuery mode #441
Conversation
www/js/app.js
Outdated
// It first searches for <img, <script, or <link, then scans forward to find, on a word boundary, either src=["'] | ||
// OR href=["'] (ignoring any extra whitespace), and it then tests everything up to the next ["'] against a pattern that | ||
// matches ZIM URLs with namespaces [-I] ("-" = metadata or "I" = image). Finally it removes the relative or absolute path. | ||
// DEV: If you want to support more namespaces, add them to the END of the character set [-I] (not to the beginning) | ||
var regexpTagsWithZimUrl = /(<(?:img|script|link)\s+[^>]*?\b)(?:src|href)(\s*=\s*["']\s*)(?:\.\.\/|\/)+([-I]\/[^"']*)/ig; | ||
var regexpTagsWithZimUrl = /(<(?:img|script|link|video|audio|source)\s+[^>]*?\b)(?:src|href)(\s*=\s*["']\s*)(?:\.\.\/|\/)+([-I]\/[^"']*)/ig; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a quick explanation here: all video and audio I've encountered have been in the I
namespace. The spec doesn't mention video or other media, and no specific namespace seems to exist for it. There are two image namespaces, with the second one being J
, but the detailed specs are missing for image handling, so it's not clear when if ever J
is used. I have never encountered it, and it seems to be reserved for image text (?).
Thanks a lot, it looks great! I'll try to have a closer look, but I really lack some time those days. |
Hmm, I've just finished downloading Dirtybiology, the latest ZIM dated 2018-06, and I'm getting in both Edge and Firefox, "Video format or MIME type is not supported". The type of videos in this ZIM is "video/mp4" rather than webm. Now mp4 is definitely supported in Windows, though I guess the specific encoding could be a problem with my system, though I doubt it. I'll investigate... |
I was using an old ZIM of dirtybiology (2015-11) : I'll try later with 2018-06 |
OK, the videos on the sub-pages of that ZIM work, it's the video on the front page that doesn't. Looking at it in detail, it doesn't appear to have a source. It looks like it might be loaded dynamically in JavaScript. But mp4 videos on other pages in the ZIM seem to work OK on my system. To summarize, I think this is a "false alarm" based on the fact that the front page of the ZIM probably populates the video block with dynamic info. Maybe a random video, or "video du jour" type of thing... |
I completely forgot about subtitles / closed captions! I've now added them to the PR. I've tested on Edge with the latest dirtybiology ZIM and with I wondered if I needed to decode the files to utf8, but because they are being presented to the browser as URLs of uint8 arrays, the browser seems to handle that. It seems to have no problem with Arabic, Greek, etc. |
I'm experimenting with not using the DataView method of converting the uint8array to blob, even though it's theoretically faster. The reason is that a) it blocks Internet Explorer needlessly (I'm not sure why, because DataView is supposed to be fully supported from IE10); and b) I'm not actually convinced it offers a more efficient access to the underlying data in our case. In fact, I can't see any of the theoretical advantages from my quick tests -- the blob is loaded just as fast with either direct use of the uint8array or a DataView of its buffer. IE11 is able to play the videos in the recent TEDx ZIM |
There is a problem when there are too many subtitle tracks for a video (e.g. in TEDx ZIMs). The screen grab below is from latest global issues TEDx ZIM in Firefox, after pressing the captions button in the player. Only the subtitles from Danish to Russian can be selected, and the others are impossible to click. In Edge, only the top five subtitles are shown, and the others are also impossible to reach. Furthermore, subtitle files are highly compressed and it takes a very long time to decompress all of these (I've monitored them -- they carry on decompressing for at least a minute after loading the page). I've been thinking that all this decompression is a bit useless, when at most the viewer will want one of the subtitle languages. @mossroy @kelson42 , what would you advise in this situation? I could add a little box beneath the video,, or in a bootstrap modal at the bottom fo the screen, allowing the user to select which subtitle file to decompress, and insert just that one as the top track only when the user asks for it. It would make the requested language accessible and would save a lot of unnecessary decompression. But is this intervening in a problem that should be fixed at ZIM level? Should I make an issue somewhere for this? |
6a5a47a
to
32f2836
Compare
There is what I believe is a specific problem with accessing subtitles in Should we try to work around this, or put it down to a bad encode of this ZIM? Subsequent TEDx ZIMs I've tested don't have this problem, but I've only tested the global issues ZIM.
|
Thanks @kelson42 . The issue with decompression of lots of vtt files is just the usual one: decompressing text-based assets (html, js, css, svg, vtt) is "costly" and slow for our app, whereas extracting assets that are stored in the ZIM with no additional compression (jpeg, png, webm, mp4, etc.) is fast. So, when we load a video page from TEDx that has 30+ subtitle assets, it takes a lot of CPU to decompress them all. You can really feel a mobile device heating up, for example, and even on a high-spec laptop it can cause the fan to kick in. Since the user only needs one of the subtitle files, I was trying to think of ways to prevent unnecessary decompression. I think the display bug shown in screenshot above is a Firefox issue. |
@Jaifroid Yes, I rephrase my question: why do we have to unpack 30+ vtt files? I don't see any reason why this should be necessary! Only one vtt file decompression should be necessary to get on language. We have a bug somewhere IMO. |
@kelson42 The issue is that in jQuery mode, we decompress all assets with ZIM URLs on a page after we inject the page into the iframe's DOM. If I can find something like an I'll keep looking... Otherwise, I'm thinking a custom |
Some more info on events associated with text tracks here: https://stackoverflow.com/questions/50778257/texttracklist-onchange-event-not-working-in-ie-and-edge The suggested workaround doesn't work (I tested) because we don't have a valid subtitle URL (until we extract the file from the ZIM). EDIT: The track's |
So I've pushed a custom text track selection dropdown to this PR for testing. In Edge, IE11 and Chromium, the subtitles will show immediately on selection, even if the video is already playing. On Firefox, currently, it's necessary to press the CC button in the player after selecting the subtitiles. This can probably be automated. This is what it looks like visually: |
Did you try to listen to the addTrack event of textTracks ? |
@mossroy I assumed that would only fire when a track is added programmatically to the video block. The tracks are already in the video DOM on page load/render, so they're not being added. However, I'll test to see, in case my assumption was wrong. |
I would make the same assumption, but am not sure |
Unfortunately, the |
OK, we might be able to work around with a combination of an Clearly, Edge doesn't attempt to load the asset until the user selects it, and then reports an error instead of a change. Firefox and Chromium both report the error on DOM load, which suggests they might be trying to load the assets up-front. Difference in loading strategy seems to account for the difference in events here. The difficulty in working out a combination of onerror and onchange is that we'd need some browser detection, to prevent FF and Chromium from reacting to the onerror event, since it fires for each track on DOM load. And IE would be left with no support for subtitles. |
If I sum up the situation (please correct me if I'm wrong) :
Maybe this could be split into 2 different PRs : a first one which enables videos and subtitles (without optimization), and another one to test some optimizations. Regarding optimizations, my first reflex would be to work on #116 instead (compile libzim with emscripten). I know it will take more time to implement that, but it should improve performance and features of our ZIM files reading. I feel that our time would be better invested to work on that. Anyway, if you prefer to work on this optimization (which I would respect), I would prefer the second way (trying to use the events sent by the browser), because it looks more generic, and can not break the layout of the page. But I think we must not make some code that depends on the browser brand. Because we can not know how these browsers will evolve in the future. And it's not unusual that this kind of detection fails to do what we want : for example how should Firefox for iOS be detected? (It's a browser made by Mozilla, but that does not use the Gecko engine). See https://css-tricks.com/browser-detection-is-bad/ |
That's a very clear and correct summary of the situation @mossroy . I would quite like to be able to release a version of Kiwix JS (and Kiwix JS Windows) that can claim to support TEDx ZIMs, so it might be best to revert this PR to the version that extracts all the subtitles, and then (after any further tests) merge it. That will then provide base support on all browsers. I would also like to continue working on a solution (under a new issue / PR) for preventing extraction of all subtitles, because I think it's very close, and if I stop now, I'll probably forget what has already been tested. Besides, working on #116 just throws more power at the problem, and doesn't prevent the problematic behaviour of unnecessarily extracting 30 subtitle files when the user only needs one (which is actually a "bug"). From what I can tell, in SW mode the I completely agree that browser detection is bad. I'll think of how we could use capability detection instead. That actually could be quite easy with IE (as it seems to lack certain methods of the textTracks object). Thank you, this exchange has been very helpful and I can see a way forward. |
@mossroy This PR is now ready for your review (no rush, when you have time). I've reverted to the version that extracts all subtitle files after video load, and have cleaned up the code a little. It's tested and working in Edge, IE11, FF ESR and Chromium with |
I did not manage to choose subtitles on my Firefox OS device. But, as you said, it's probably because it might not be supported by the platform. But we don't care if subtitles do not work in Firefox OS, it's not worth working on it IMHO. It's already a miracle that videos work on these devices, even with only 512MB of RAM. On Firefox 63, there is an error message in the console, probably because it does not find the video source when the page loads (and before the media has been injected) :
But that's very minor. Except that, videos seem to work very fine on the platforms I tested (including subtitles) : Firefox 63, Firefox 63 as an extension, Chromium 70. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good too : nice work!
It would be very easy to have an option on the Configuration page: Set default subtitle language: xxxx. Then we could mark one of the languages as default and it might play on clients that don't have an interface for choosing subtitles.
I've seen the equivalent message in English. It's because we've changed
Yes, these are dynamic videos that are supposed to be filled with some (possibly random, or possibly "vidéo du jour") content. We could, in principle, put a random video in there, but the code would not be generic and would have to be checked on each ZIM update. I guess we'd be better off trying to get #152 [JavaScript in the ZIM] working acceptably. |
Thanks for these answers. So I would have left this PR as it is : don't do more work to comply with Firefox OS, or to avoid these error logs, or anything specific to some ZIM files. |
1555f15
to
91b290d
Compare
The evolution of this code is tracked in #172 . I believe the issues I identified there have been solved.
The code is generalized to work with
<video>
and<audio>
blocks, together with any<source>
tags contained within them. Hence I call it "media playback" rather than just "video". I have tested with video and audio provided in different ZIMs, but note that I did not find any ZIM using pureaudio
blocks -- the scrapers seem to put any audio in video blocks (and it plays fine like that). Details of some pages to test with in #172.Playback works in browsers without needing JavaScript-in-the-ZIM to be enabled, although sometimes getting to the content is a little challenging without JS (but basically our standard search usually works). I have tested the code with FFOS (works in simulator, actual device will depend on available codecs), Edge, Chromium, Firefox ESR, IE11 (no webm support), and UWP (works in Windows 10, no webm codec in Windows 10 Moible). For the browsers which don't support webm, it seems to degrade gracefully, indicating that there is no support.
One issue with TEDx ZIMs is that they contain a proprietary UI that uses extensive JS, including having links of the type
<a href="javascript:doSomeProprietaryFunction();">Go to top</a>
. Or else<body onload="raw javascript">
. These cause an exception when clicked, or in the worst case, cause an app crash on loading the page, and they won't work even in SW mode in most extensions. But this is a different issue. I just thought it should be signalled with regard to testing this PR.