Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add VP6 video decoding by utilizing an external crate that bundles parts of FFmpeg #3004

Closed
wants to merge 3 commits into from

Conversation

torokati44
Copy link
Member

@torokati44 torokati44 commented Jan 31, 2021

This is very much a WIP POC RFC OMG PR.

(With #3117 merged, there aren't any "extra" commits in here anymore.)

Because this is (I feel like) a fairly heavy change set (both in technical and legal aspects), I'd like to get some feedback about it, before fully polishing it, and hopefully together an acceptable state for merging can be achieved.
While rough around the edges, it's already complete enough that it can be seen working on z0r.de: 7055, 7340, 7617, 7872, and many others.
(I'm hoping that deploying "unofficial forks" like this in the public is not problematic?)

Most functionality is in an external crate. The rest of the description applies to the contents of that separate crate.
At the moment, that crate lives in this repository: https://github.com/torokati44/vp6-dec-rs
In the future, this will be migrated under the Ruffle organization.

The C code is built by a fairly straightforward build.rs using the cc crate, which automatically handles native compilation or targeting WASM for the extension/selfhosted variant, using a static config.
For the WASM target, instead of relying on Emscripten or something similar, a (handmade) tiny fraction of a libc is included as well, just enough to make the decoder compile and work.
This same libc stub is also utilized on Windows - it was easier this way than supporting compilation with MSVC as well. - EDIT: This is no longer true.

Binding declarations are based on those generated by bindgen, but in the end, only a tiny portion of those were kept - and some other parts didn't work well then targeting WASM.
This adds clang as an additional dev dependency, which hopefully isn't too big an issue. Version 7 is for sure too old, 11 is confirmed recent enough - but at least not a whole Emscripten SDK is required.
I opted not to use any of the already available ffmpeg binding crates, because they all seemed to have issues with cross-compilation, and doing this without them wasn't that bad after all.

Some things that should be fairly straightforward to add, but aren't done yet:

  • Reporting which frames are and aren't keyframes, for artifact-free seeking.
    • EDIT: This turned out to be really easy to do.
  • Support for videos with transparency - ffmpeg has VP6A variant as well; or the two streams can be easily merged by hand.
    • EDIT: I have wired this up, it was fortunately only a matter of selecting a different codec, no manual merging needed. The only file I found containing VP6WithAlpha is the (slightly NSFW?) z0r.de/3926, but this one doesn't look like it is actually transparent anywhere at all. And one can build a test file from this example from Adobe as well.
  • Trimming the "padding pixels" off from the right and bottom edges, they are needed only for compression.
    • EDIT: When noting this, I mistakenly looked at the FLV tag spec instead of the SWF one. In the SWF tag header, the HorizontalAdjustment and VerticalAdjustment fields are not present. Also, I've checked, and even if the size in the header is not an integer multiple of 8 or 16 (the block sizes), the results already match that of Flash Player nicely. Most likely, the actual frame size is also part of the frame data fed to the decoder in FFmpeg, which then handles it properly.
    • UPDATE: I was wrong with the edit above, and it was only a coincidence that all the videos I looked at were fullscreen, so they were in fact cropped by the player window itself. Correctly draw videos that have different bounds than the size of their actual frame data #4154 exposed this issue, and made the garbage pixels visible. These are now cropped manually, for reasons explained in 2ac4acf3f18d0699e8596f847b8922e8804aa667.
  • Supporting the different post-processing (deblocking/deringing/smoothing) options
    • EDIT: Judging by the header parsing code, I believe that at least the deblocking filter is also configured by the encoded data itself, and FFmpeg applies it as requested.
  • Triple-checking that nothing in the countless unsafe blocks can crash or leak or do anything nasty.

@torokati44 torokati44 force-pushed the video-vp6 branch 2 times, most recently from e2fc8d9 to af20eac Compare February 1, 2021 07:47
@torokati44
Copy link
Member Author

Apart from getting the CI to pass on Windows, I might start to split this into a separate repo already, and instead of bundling libav code, refer to it as a git submodule.
Should be cleaner that way.

@QualityLow
Copy link

Great work! Having video support in Ruffle would do a lot!

@torokati44
Copy link
Member Author

Well, thank you, I sure am also looking forward to it myself, but there are certain standards that are better met by the code that gets accepted, and this isn't quite there yet... :)

@torokati44 torokati44 force-pushed the video-vp6 branch 8 times, most recently from 2c5984f to 44c44cf Compare February 4, 2021 23:43
@torokati44 torokati44 changed the title Add VP6 video decoding by bundling parts of libav Add VP6 video decoding by utilizing an external crate that bundles parts of libav Feb 4, 2021
@torokati44 torokati44 force-pushed the video-vp6 branch 2 times, most recently from 9a1c7a9 to ad23ba0 Compare February 5, 2021 00:09
@cyanreg
Copy link

cyanreg commented Feb 7, 2021

Hi, FFmpeg developer here.
This seems rather odd. The code in the MR/repo is Apache licensed, and it just looks like a Rust rewrite of the LGPL 2.0 licensed Libav code, but no attempt has been made to contact any of the original authors to relicense it, and no authorship credit is given either, which is rude to the people who've worked on it..
Furthermore, this is taken from the old dormant Libav project repository, which doesn't get any security updates, nor commits at all. All active Libav developers switched to working on the FFmpeg project quite a long time ago. and we extensively fuzz, maintain and optimize all our decoders. Although Rust guarantees memory safety, we also take care of timeout files which take too long to decode and freeze playback, consuming 100% CPU. This code carries over no optimizations or fixes ever made to the FFmpeg's decoder (since at least about 2012!), which could make us look somewhat bad as if we don't do any maintenance or improvements.
Finally, we have had support for alpha channel decoding in our tree for quite a long time now.

The nellymouser decoder was also based on our code, and although credit was given, it was relicensed without permission. Since it's not a lot of code and parts were taken from nelly2pcm I chose to not make an issue out of it, but this is a huge decoder.

EDIT: I noticed the new version of the PR links to Libav instead of oxidizing the C code. Linking to Libav is still going to result in a license violation, since it's LGPL licensed, and even then, why Libav and not something that's actually maintained?

@Herschel
Copy link
Member

Herschel commented Feb 7, 2021

Hi @cyanreg,

Certainly our intention is to obey all license requirements as required and give proper credit (simply none of this would be possible without the great work from FFmpeg and other projects). This specific PR has not been reviewed yet and is still a WIP; in the meantime I will take the public repo private until we can obey the proper licensing requirements. (@torokati44, can you look into resolving the issues mentioned above?)

As for Nellymoser, the original code is MIT licensed, and our Rust port is also MIT licensed. It includes the original copyright notice from the original authors. Our intention is to not relicense it, but keep it under the MIT license. I see we do not explicitly mention that this component is only MIT licensed in this Ruffle repo, where we link to the library; if we explicitly note that this component is only MIT licensed in this repo, and additionally copy over the MIT copyright notice into our LICENSE, is that satisfactory?

Thanks for the help and patience, I really do appreciate it; I'm trying my best to obey the licenses properly and make sure everything is correct.

@torokati44
Copy link
Member Author

torokati44 commented Feb 7, 2021

Hi!

it just looks like a Rust rewrite of the LGPL 2.0 licensed Libav code

I'm not sure what exactly are you referring to here. All libav code is (since about two days ago) referenced verbatim, as a git submodule - it's not even directly included in the repo itself. I thought doing this was somewhat of a common practice. Of course I'm happy to give all credit and attribution for the decoder to the actual authors. In what form do you think this would be done best?

Also, no rewriting of any C code into Rust was done here, only binding declarations and other glue added.

All active Libav developers switched to working on the FFmpeg project quite a long time ago.

Huh, I was still under the impression that the libav fork went on being the more active. I guess we should switch over to that project then! Unless you have any objections, of course.

This code carries over no optimizations or fixes ever made to the FFmpeg's decoder (since at least about 2012!), which could make us look somewhat bad as if we don't do any maintenance or improvements.

The currently referenced commit is from April 2019. And indeed, it seemed somewhat old to me, but not that old, and I just thought it was because it is "done" - I was wrong...

Finally, we have had support for alpha channel decoding in our tree for quite a long time now.

Yes, this I even mentioned in the original description! It just wasn't wired up yet.

The nellymouser decoder was also based on our code, and although credit was given, it was relicensed without permission.

That was entirely @relrelb's work, I know ~nothing about it.

@torokati44
Copy link
Member Author

torokati44 commented Feb 7, 2021

EDIT: I noticed the new version of the PR links to Libav instead of oxidizing the C code.

There wasn't any "oxidation" done by me to any libav code at any point in time. I simply started out with copying unmodified C source files from libav into my fork, compiling them with clang and linking them into Ruffle. The copied files are not here anymore, because:
Then I split out the entire thing into a separate repo, where I then switched over to the git submodule approach instead of copying. Since the repo under the Ruffle organization is made private, here is most of what went into that (purely for reference, so we all see what we're talking about): https://github.com/torokati44/vp6-dec-rs
EDIT: I also restored the reference in this PR to that external repo, for the time being, so at least it still works.

Linking to Libav is still going to result in a license violation, since it's LGPL licensed

So, is there even any way to include any libav (or ffmpeg!) decoders into Ruffle, with its current license?

why Libav and not something that's actually maintained?

See my previous comment, I had the wrong impression about the relative activity of these projects. Seeing your points, now I think that - if anything - ffmpeg should be used instead.

@Herschel
Copy link
Member

Herschel commented Feb 7, 2021

Linking to Libav is still going to result in a license violation, since it's LGPL licensed

So, is there even any way to include any libav (or ffmpeg!) decoders into Ruffle, with its current license?

If my understanding is correct, we should be able to link to an LGPL library in our MIT+Apache2 licensed project due to the linking exception provided by LGPL. The VP6 repo (oxidized or not) must be licensed under LGPL. When we link to it, we can specify that this dependency is LGPLv3 and include the LGPL+GPL license text in this repo's LICENSE.md.

See my previous comment, I had the wrong impression about the relative activity of these projects. Seeing your points, now I think that - if anything - ffmpeg should be used instead.

Yes, we should use FFmpeg's decoder instead.

@cyanreg
Copy link

cyanreg commented Feb 7, 2021

What an utter mess.

This PR contains the H263 Rust decoder (WHY?!). Since VP6 has some common parts with H263 I thought this PR implemented a Rust version of a VP6 decoder. I was not looking at the commits, only at the full diff. It didn't help the H263 code contains few references to H263 in its own code, so it's not easy to tell from just a random chunk of IDCT and coefficient parsing code.

In which case, since no code is taken, I'm fine with it. Just has to comply with the LGPL the wrapper is linking to.
As far as I know, the LGPL does require that the user is able to replace the library used. So as long as the wrapper itself satisfies this requirement (by being LGPL itself), then it's fine.
Would be nice if the wrapper could link to the system libraries when the project is compiled as a native binary. We keep the API stable and only change it on major bumps.

All active Libav developers switched to working on the FFmpeg project quite a long time ago.

Huh, I was still under the impression that the libav fork went on being the more active. I guess we should switch over to that project then! Unless you have any objections, of course.

This hasn't actually been the case at any point AFAIK.

This code carries over no optimizations or fixes ever made to the FFmpeg's decoder (since at least about 2012!), which could make us look somewhat bad as if we don't do any maintenance or improvements.

The currently referenced commit is from April 2019. And indeed, it seemed somewhat old to me, but not that old, and I just thought it was because it is "done" - I was wrong...

We have dozens of daily commits and thousands of monthly emails on the mailing list.
We also get regular license violations, which we keep track of but can't do anything about.

@torokati44
Copy link
Member Author

What an utter mess.

It is marked as DRAFT for a reason. :)

This PR contains the H263 Rust decoder (WHY?!).

This is explained in the original description. The only reason for this is that the base video infrastructure itself (that this PR depends on) was developed closely together with the H.263 decoder in #2173, and neither is merged yet.

Now that the non-decoder parts were separated into #3117, only that would need to be included in this PR. And after it is accepted, not even that.

By the way, AFAIK the H.263 decoder was written entirely from scratch, not using any ffmpeg/libav code, only the spec itself. But @kmeisthax can correct me if this is not the case.

I thought this PR implemented a Rust version of a VP6 decoder. I was not looking at the commits, only at the full diff.

Understandable, this really isn't that clean at the moment.

In which case, since no code is taken, I'm fine with it.

Great!

As far as I know, the LGPL does require that the user is able to replace the library used. So as long as the wrapper itself satisfies this requirement (by being LGPL itself), then it's fine.

At the moment the wrapper compiles the referenced libav/ffmpeg C sources (not all of them, just what's needed for VP6) into itself during build. So, essentially, links to them statically. This makes it impossible to swap the wrapped decoder out after it is built - I am, however, completely fine with the wrapper also being LGPL.

Would be nice if the wrapper could link to the system libraries when the project is compiled as a native binary. We keep the API stable and only change it on major bumps.

While I agree, and have also considered this myself, I wouldn't really expect ffmpeg to be installed and available on most systems (except for Linux), and since this would slightly complicate the build process, I don't consider this a priority. In the future, I might explore this option as well.

We have dozens of daily commits and thousands of monthly emails on the mailing list.

And we are grateful for your efforts, and certainly wouldn't want to diminish them!

We also get regular license violations, which we keep track of but can't do anything about.

I'm sorry about that. We'll try our best so Ruffle won't ever make it onto this list. :)


Another, slightly unrelated note: This wrapper will also include qsort.c from musl-libc, but as that is already MIT licensed, hopefully this should be only a matter of an additional copyright notice.

@cyanreg
Copy link

cyanreg commented Feb 7, 2021

As far as I know, the LGPL does require that the user is able to replace the library used. So as long as the wrapper itself satisfies this requirement (by being LGPL itself), then it's fine.

At the moment the wrapper compiles the referenced libav/ffmpeg C sources (not all of them, just what's needed for VP6) into itself during build. So, essentially, links to them statically. This makes it impossible to swap the wrapped decoder out after it is built - I am, however, completely fine with the wrapper also being LGPL.

If it's possible to swap the wrapper itself (which would contain the compiled decoder code), then it's fine. That's why I mentioned the wrapper ought to be LGPL.

Would be nice if the wrapper could link to the system libraries when the project is compiled as a native binary. We keep the API stable and only change it on major bumps.

While I agree, and have also considered this myself, I wouldn't really expect ffmpeg to be installed and available on most systems (except for Linux), and since this would slightly complicate the build process, I don't consider this a priority. In the future, I might explore this option as well.

Make it a configure-time option or just autodetect it? You're already using the proper external API, so it's just a matter of not building the internal code and instead linking to the system library.
That way the wrapper code could be MIT or Apache licensed when it's purely used as a wrapper that contains no compiled ffmpeg code.

@kmeisthax
Copy link
Member

I can confirm that the ruffle_codec_h263 crate in #2173 is 95% mine - the other 5% was bug fixes from @torokati44 that took it from sub-1fps and full of motion vector errors to nigh-flawless.

@torokati44
Copy link
Member Author

FYI: I rebased this PR on top of #3117, this time without squashing its commits.
The H.263 decoder is no longer in here.

@torokati44 torokati44 changed the title Add VP6 video decoding by utilizing an external crate that bundles parts of libav Add VP6 video decoding by utilizing an external crate that bundles parts of ffmpeg Feb 8, 2021
@torokati44 torokati44 force-pushed the video-vp6 branch 2 times, most recently from 9afb239 to 0acd3b6 Compare August 23, 2021 00:05
@torokati44
Copy link
Member Author

torokati44 commented Aug 23, 2021

So.

I have updated the FFmpeg submodule of the decoder crate to current master, and everything still seems to work just fine.
I have fixed the CI setup in the repository of the crate, it now passes in every OS/toolchain combination, both building natively and for web.

Once the web tests on macOS are re-enabled in the main Ruffle repo, some tricks will have to be taken over from there (installing clang, binutils, and llvm-tools from conda-forge, and setting AR=llvm-ar).

After rebasing, and adhering to the new way of doing this (such as impl VideoDecoder, and putting all of this behind a feature), I have simplified the commit history greatly. And the actual additions in here are not that big in size.

@adrian17 had concerns about the .wasm module containing unnecessary stuff from the C build. The list of files included in the custom build was constructed in an additive way, adding files containing the missing symbols one by one, so there shouldn't be too many files that are there, but aren't necessary. However unlikely this seems, for example, in the case of sha.c.
Keep in mind that the linkers for different platforms might allow unresolved symbols to varying degrees, with the one on Windows (link.exe) being among the pickier ones.
And honestly, I'd really rather leave dealing with this to tools whose job this actually is, like clang and/or wasm-opt.

I'd gladly hand over this repo (https://github.com/torokati44/vp6-dec-rs) to the organization, so https://github.com/ruffle-rs/vp6-decoder-libav can be deleted.

Instructions for installing some additional dev dependencies should be added to the Wiki after merging this.

I'm now removing the DRAFT marker from this PR.

@torokati44 torokati44 marked this pull request as ready for review August 23, 2021 00:35
@torokati44 torokati44 force-pushed the video-vp6 branch 2 times, most recently from 56eb4f6 to f67b16f Compare August 23, 2021 21:55
@torokati44
Copy link
Member Author

torokati44 commented Aug 25, 2021

Update: The .wasm size increases by about 700k as a result of this PR. About 470k of this comes from libswscale alone.
Since there's no way (that I found) to pick just the single conversion we need, I think it would make sense to simply drop libswscale, and use the same (in-house) colorspace conversion code as the h263 decoder - at least when targeting web.
This has performance implications, but that code could use some optimization anyway.
This would potentialy also allow us to skip the manual frame trimming in here, and ask the decoder itself to crop the YUV frame right away (libswscale needs some padding, our own converter doesn't - at least now).

@torokati44 torokati44 force-pushed the video-vp6 branch 2 times, most recently from 553bfdf to 9e7b954 Compare September 14, 2021 21:50
@torokati44
Copy link
Member Author

Do you happen to know, @cyanreg, that in case of VP6, which variant of 4:2:0 chroma subsampling is used? I'm asking specifically about the location of the chroma samples.

I suppose it's one of the bottom ones on the figure in this section:
https://docs.microsoft.com/en-us/windows/win32/medfound/recommended-8-bit-yuv-formats-for-video-rendering#yuv-sampling
And for many formats/codecs, this is clearly defined. I see no mention of it anywhere for VP6.

FFmpeg sets/leaves AVFrame::chroma_location to/at AVCHROMA_LOC_UNSPECIFIED on the frames it outputs.... Which one is assumed in this case when converting to RGB?

@cyanreg
Copy link

cyanreg commented Sep 16, 2021

VPX until VP9 came from the wild west era before web colorspace standardization where any coefficients that sort of made yuv were yuv. Including labelling YCgCo as YCbCr. They really didn't care much about chroma location, and who knows what primaries/colorspaces/transfer functions they actually used.

VP8 was the first to specify the colorspace that ought to be used, because it started to matter, and seeing as how it's closely following VP6, my guess is it's the same. VP8 used BT.601 primaries/colorspace/transfer function, which is the same as JPEG, but whilst JPEG defined centered chroma samples (FFmpeg has a neat illustration and description), centered chroma requires interpolation, which is expensive on CPUs, so I'm very sure they'd have used top-left chroma, seeing as that's what the popular codecs of the day used, and therefore video editors and devices, so all sources. As it was the wild west, I seriously doubt they'd process the chroma to JPEG-style centered samples.
Also, VP8 mostly carried limited-range BT.601, which is not like JPEG's full-range, and was used in MPEG2, so that supports my theory that they did no processing on chroma when transcoding.

@torokati44
Copy link
Member Author

torokati44 commented Sep 16, 2021

Okay, I see, thanks!

I also found that ASCII art comment in FFmpeg, and I too think it's really nice and useful. :)

And yeah, the SWF spec mentions that VP6 uses the same reduced range BT.601 colorspace as H.263, so we can use that. But then again, it also says that H.263 is 4:2:2 chroma subsampled, which it isn't... Oh well, it's not the most flawless spec, nor is this the most straightforward notation, if you ask me.

At least the colors look entirely the same to me when doing the conversion with the code made for H.263, as they do in Flash Player, or any media player using FFmpeg (including this branch here).

So, basically, my understanding is:

  • For simplicity's sake, we can tack on the same (in-house) colorspace converter after the VP6 frames that we made for H.263. It does bilinear interpolation of chroma, treating chroma samples as being in the center of the 2x2 groups around them. And since it's not specified for VP6 anywhere, some content might actually be encoded with the four chroma samples averaged, so doing this would be close to "correct" in these cases. Well, as much as a bilerp can be - but at least not shifting colors in any direction.
  • To save a whole bunch of CPU cycles, by sacrificing slightly on quality, we could also just duplicate the single chroma sample for all 4 corresponding pixels, without any interpolation, because it would be as good of a guess as any...
  • Or even do [bi]linear interpolation as if the chroma samples were in the [top]left, to regain some visual quality.

Is this correct?

@cyanreg
Copy link

cyanreg commented Sep 17, 2021

You should always do bilinear interpolation to upscale chroma to luma's resolution. Option 2 is a no-go, because nearest-neighbour upscaling on anything looks really really bad to anyone who notices.
All that chroma position does is that it offsets the points at which you sample from your chroma texture to upscale it to luma's resolution just before you convert it to RGB. E.g. for centerd chroma samples, instead of sampling from res = texture(chroma_plane, vec2(x, y)); (top-left chroma), you'd sample from res = texture(chroma_plane, vec2(x + 0.5, y + 0.5)); (centered). The difference is subtle at 1:1 resolution (and very jarring at high resolutions, where you can see the misalignment better), but more correct, so if possible, you'll want to do the former instead of the latter.

MPEG-4 (H.263) uses left-aligned chroma (so vec2(x, y + 0.5)) for 4:2:0 subsampling, so I'm not sure why your current code does center-sampled chroma, maybe you should check it out, because it doesn't sound correct. Like I said, when upscaling from the source resolutions of vp6 in flash to modern resolutions, minor chroma misalignment can cause weird halo effects around edges and such.

@torokati44
Copy link
Member Author

MPEG-4 (H.263) uses left-aligned chroma (so vec2(x, y + 0.5)) for 4:2:0 subsampling

Um, I don't think this is true for H.263. The FFmpeg snippet you linked has this line in it: AVCHROMA_LOC_CENTER = 2, ///< MPEG-1 4:2:0, JPEG 4:2:0, H.263 4:2:0.
And the H.263 spec shows:

The difference is subtle at 1:1 resolution (and very jarring at high resolutions, where you can see the misalignment better) ...
Like I said, when upscaling from the source resolutions of vp6 in flash to modern resolutions, minor chroma misalignment can cause weird halo effects around edges and such.

Our colorspace converter doesn't do any scaling, so perhaps we can avoid the really bad cases even if we don't do it entirely correctly. That will be handled later, by the renderer, which is given an RGBA texture in source resolution, and transforms it however it likes.
Although there were ideas to give the renderer the YUV frames directly, as this could speed up the conversion considerably on certain backends - especially when done in a shader, of course. But right now, this is not how it's done, so a 1:1 software converter is what we need.

@cyanreg
Copy link

cyanreg commented Sep 17, 2021

Um, I don't think this is true for H.263. The FFmpeg snippet you linked has this line in it: AVCHROMA_LOC_CENTER = 2, ///< MPEG-1 4:2:0, JPEG 4:2:0, H.263 4:2:0.
And the H.263 spec shows:

Err, right, I got it wrong.

Our colorspace converter doesn't do any scaling, so perhaps we can avoid the really bad cases even if we don't do it entirely correctly. That will be handled later, by the renderer, which is given an RGBA texture in source resolution, and transforms it however it likes.

No, that's by far the worst - if you're blowing up a pre-converted image with misaligned chroma, it'll look worse than blowing up all planes to display resolution before conversion, and converting them to RGB. In the first case, your chroma offset error of 0.71 pixels (sqrt(0.5**2 + 0.5**2)) will be multiplied by display_resolution/video_resolution (for example, 8 pixels for 240p video on a 4k output), whilst in the second case it'll be subtle and hard to notice a shift of 0.71. The third, the correct and cheap way is to just upscale the chroma planes to the luma resolution and then upscale the RGBA texture. Bilinear^2 isn't that much worse than a single bilinear scale, though doing the yuv->rgb conversion at display resolution correctly would obviously be the best.

Although there were ideas to give the renderer the YUV frames directly, as this could speed up the conversion considerably on certain backends - especially when done in a shader, of course. But right now, this is not how it's done, so a 1:1 software converter is what we need.

Maybe you should look into changing the software converter? It shouldn't be that much work to add a configurable chroma offset and change the sampling weights (if it's using something like a kernel, though it probably isn't for bilinear).

@torokati44
Copy link
Member Author

torokati44 commented Sep 17, 2021

The third, the correct and cheap way is to just upscale the chroma planes to the luma resolution and then upscale the RGBA texture. Bilinear^2 isn't that much worse than a single bilinear scale ...

Yep, this is exactly what I'm aiming for. Although Flash has a property that can switch the RGBA scaling between nearest and bilinear (AFAIK), but I'm fairly certain that that's done after the colorspace conversion, so not really in scope here.

Maybe you should look into changing the software converter? It shouldn't be that much work to add a configurable chroma offset and change the sampling weights (if it's using something like a kernel, though it probably isn't for bilinear).

Originally, the converter for H.263 was based on a naive and generic float lerp(float a, float b, float t), but since then, in the name of speed on WASM, it got special-cased to a fixed t = 0.25, and just a couple integer additions and bit shifts.
And I took great care to do the interpolation of chroma to the luma locations correctly, to avoid ugly chroma shifting.

Now I'm trying to reuse/adapt that code to do the conversion for frames decoded from VP6 (with the appropriate interpolation), and this brings us back to my original question again: what is the correct way here?
This is exactly why I really tried hard to find/figure out the chroma sample locations of VP6 in the first place, so I can do the interpolation right, without introducing any chroma shifting.

Do I have to resort to tediously observing the rendering of manually crafted test videos, and comparing the different methods, to see which one matches Flash Player? :/

@ghost
Copy link

ghost commented Sep 19, 2021

https://code.videolan.org/mahanstreamer/nihav/-/blob/master/nihav-duck/src/codecs/vp6.rs might be helpful.

Even written in rust to top off the cake

@torokati44
Copy link
Member Author

torokati44 commented Sep 19, 2021

Yeah, this looks wonderful! Thanks!
The AGPL license is problematic though. I see mention of possible relicensing upon request in the README.
Do you think that the VP6 decoder (and any other infrastructural code needed to run it) could be relicensed to at least LGPL?

@torokati44
Copy link
Member Author

And about the chroma sample positions: It looks to me now that Flash Player just uses the same chroma sample for the whole 2x2 pixel group, without any interpolation, for both H.263 and VP6 videos, so maybe I shouldn't worry about this at all, and do the same thing. It will be a lot faster, and similar to Flash Player, even if not as pretty...
Might be worth keeping the interpolation (at least for H.263) as an option.

@ghost
Copy link

ghost commented Sep 19, 2021

Yeah, this looks wonderful! Thanks!
The AGPL license is problematic though. I see mention of possible relicensing upon request in the README.
Do you think that the VP6 decoder (and any other infrastructural code needed to run it) could be relicensed to at least LGPL?

Yes, my friend has reached out to the author, waiting for a response.
If anything goes awry, he has relicensed his the code to rust-av under N-clause BSD (https://nihav.org/intro.html), so you can always use it if its upstreamed there.

@ghost
Copy link

ghost commented Sep 19, 2021

Apparently he expressed implied approval, but i would ask him at https://codecs.multimedia.cx/2021/09/vp6-simple-interframe-encoder-done/#comments to double make sure.

…ndency

With the "allow-lgpl" feature enabled on it.
With a new VideoDecoder trait implementation, including
VP6WithAlpha decoding, proper FrameDependency reporting, and
cropping the unwanted encoded pixels on the right/bottom manually.
By adding MSYS to $PATH (for clang), and setting up the MSVC
development environment in the shell (for link.exe).
@torokati44
Copy link
Member Author

An alternative, #5369 was merged instead of this.

@torokati44 torokati44 closed this Oct 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants