split yuv into its own pseudo codec #328

totaam · 2013-04-30T15:23:44Z

Issue migrated from trac ticket # 328

component: core | priority: minor | resolution: fixed

2013-04-30 15:23:44: totaam created the issue

Why?:

cleaner, the code in the 264/vpx codecs is getting messy

we don't necessarily need yuv when using x264/vpx with the opengl client (though we would need to tweak the window backing logic to always choose opengl windows if yuv2rgb is not available)

make easier to tune the colourspace conversion step (downsampling/scaling etc)

Attached is a test patch (does not even compile) which shows how to get started. I don't have time to take this further right now..

totaam · 2013-04-30T15:24:09Z

2013-04-30 15:24:09: totaam uploaded file `yuv-codec.patch` (15.5 KiB)

work in progress yuv codec

totaam · 2013-04-30T15:39:31Z

2013-04-30 15:39:31: ahuillet commented

I think the "co" should also be split from the "dec".

totaam · 2013-05-04T14:57:20Z

2013-05-04 14:57:20: antoine changed status from new to assigned

totaam · 2013-05-04T14:57:20Z

2013-05-04 14:57:20: antoine changed owner from ** to antoine

totaam · 2013-05-04T14:57:20Z

2013-05-04 14:57:20: antoine commented

splitting encoder from decoders is done (that part was very easy) in r3265

totaam · 2013-05-17T14:36:24Z

2013-05-17 14:36:24: antoine uploaded file `yuv-codec-v2.patch` (9.6 KiB)

better patch - no longer using a yuvlib, all in cython

totaam · 2013-05-26T10:40:44Z

2013-05-26 10:40:44: ahuillet commented

The plan is: separate video backends into modules -

csc_* for colorspace conversion (csc_swscale, csc_nvcuda coming one day)
enc_* for encoding (enc_x264)
dec_* for decoding (dec_avcodec)

The patches on the C parts are done, with an improved API, and kept at https://github.com/ahuillet/video_backends.

Now we need to re-do the .pyx interfaces to make this work again.

totaam · 2013-05-26T10:58:59Z

2013-05-26 10:58:59: ahuillet commented

Quirks:

csc_ is able to support any colorspace, but enc_ and dec_ have not had that piece of functionality readded yet.

the logic for changing colorspace has been removed from enc_x264 because it's not its place, it will have to be reimplemented in python

Memory management:

you allocate a context with init_* functions like before, you clean up the context with free_* functions like before, but those functions do not free the context, you have to call free(context) manually

pass { NULL, NULL, NULL } to csc_image functions - they allocate their own memory, you free it by calling free_csc_image({})

enc_x264 allocate its own memory and you don't free it afterwards (it reuses the same buffer over and over again)

I am not sure about dec_avcodec, will check.

totaam · 2013-05-26T16:18:58Z

2013-05-26 16:18:58: totaam commented

We also need to deal with reads beyond the size of the buffer caused by swscale reading 8 (or more?) bytes at a time - valgrind output:

==8159## Thread 6:8159## Invalid read of size 88159##    at 0xF7580E1: ??? (in /usr/lib64/libswscale.so.2.1.101)8159##    by 0xF750038: ??? (in /usr/lib64/libswscale.so.2.1.101)8159##    by 0xF756465: sws_scale (in /usr/lib64/libswscale.so.2.1.101)8159##    by 0xFFFC441: csc_image_rgb2yuv (x264lib.c:502)8159##    by 0xFFF71B5: __pyx_pw_4xpra_6codecs_4x264_7encoder_7Encoder_25compress_image (encoder.c:3033)8159##    by 0x35B1EDD280: PyEval_EvalFrameEx (ceval.c:4098)8159##    by 0x35B1EDCEF0: PyEval_EvalFrameEx (ceval.c:4184)8159##    by 0x35B1EDDCBE: PyEval_EvalCodeEx (ceval.c:3330)8159##    by 0x35B1E6D925: function_call (funcobject.c:526)8159##    by 0x35B1E49C0D: PyObject_Call (abstract.c:2529)8159##    by 0x35B1EDA08A: PyEval_EvalFrameEx (ceval.c:4411)8159##    by 0x35B1EDDCBE: PyEval_EvalCodeEx (ceval.c:3330)8159##  Address 0x1acea03c is 630,732 bytes inside a block of size 630,736 alloc'd8159##    at 0x4A0887C: malloc (vg_replace_malloc.c:270)8159##    by 0x35B3A287BB: XGetImage (GetImage.c:82)8159##    by 0x158E52C6: __pyx_pw_4xpra_3x11_7gtk_x11_12gdk_bindings_13PixmapWrapper_3get_image (gdk_bindings.c:6555)8159##    by 0x35B1E49C0D: PyObject_Call (abstract.c:2529)8159##    by 0x35B1EDA08A: PyEval_EvalFrameEx (ceval.c:4411)8159##    by 0x35B1EDCEF0: PyEval_EvalFrameEx (ceval.c:4184)8159##    by 0x35B1EDDCBE: PyEval_EvalCodeEx (ceval.c:3330)8159##    by 0x35B1EDC2B2: PyEval_EvalFrameEx (ceval.c:4194)8159##    by 0x35B1EDDCBE: PyEval_EvalCodeEx (ceval.c:3330)8159##    by 0x35B1EDC2B2: PyEval_EvalFrameEx (ceval.c:4194)8159##    by 0x35B1EDDCBE: PyEval_EvalCodeEx (ceval.c:3330)8159==    by 0x35B1EDC2B2: PyEval_EvalFrameEx (ceval.c:4194)

totaam · 2013-05-27T12:31:28Z

2013-05-27 12:31:28: ahuillet commented

New backend infrastructure WIP at https://github.com/ahuillet/xpra/commits/

Things to do :

re-add nogil

fix xmemalign() definition and calls

colorspace other than 420 need to be de-hardcoded

OpenGL backend seems to go through CSC anyway

packaging, debs, rpms - patches directory

win32 build

vpx

feed RGB to x264 directly instead of YUV444

quality # 100> ultrafast in x264

totaam · 2013-05-28T10:09:59Z

2013-05-28 10:09:59: totaam uploaded file `yuv-codec-v3.patch` (93.7 KiB)

latest work in progress

totaam · 2013-05-31T12:28:41Z

2013-05-31 12:28:41: totaam commented

Please pull the latest changes I've sent, updated list of remaining issues (help needed - although not all of these issues need to be dealt with before we merge into trunk, most of them will need doing at some point):

edge resistance: take current setting into account, so the current one (or similar ones that re-use csc or video encoder) will get a score boost

fire pipeline changes from timer thread if new best option wins (set new quality/speed then re-evaluate pipeline)

move ImageWrapper to global location (no longer a server-only class)

free_csc is a simple free(),s o no real need for the currently very ugly code: merge with XImageWrapper code which does do a free()
Also, maybe we want/have to use separate memaligned buffers for each plane?

handle scaling and odd dimensions problems

expose dimensions mask (0xFFFE is for x264 only?)

BGRA does not work for vpx or x264!?: segfault! - disabled by #ifdef for now (may also need rowstride work for uneven sizes?) - and even if it does work, this also affects:

with other csc modes than YUV420P, YUV422P, YUV444P, say BGRA: what should we expose to the client? does it care about the mode that was used or about the actual pixel dimensions used during compression. maybe we should not expose the option "csc=YUV422P" but something like: "csc_divs=(1,1),(2,1),(2,1)" instead? and maybe this ties with scaling too

related to above: old clients without "encoding_client_options" option could still parse pixel data as long as the divs are the same? (not that we would want to care and handle this particular case - just a note)

ensure we zero copy pixels instead of copying from decoder/encoder, then:

restore nogil: verify that we are passing the same pointers via string wrappers and that we are allowed to manipulate them without the gil.

fix opengl (no idea why the output is blank - the yuv codepath works for the non-gl code..)

swscale: could be tuned using speed/quality too: use FAST_BILINEAR if quality=0 and speed=100, do re-init in set_quality/set_speed - which could be combined as: configure(quality, speed)

think about target bitrate mode - no real difficulty there?

vpx: use vpx_img_wrap to ensure we don't copy the pixels

client side: use similar code to choose the decoder path, and maybe choose window backing (acceleration apis generally require direct access to the backing window)

dec_avcodec should be able to handle vpx: make the decoder export the list of codec formats it handles (x264, vpx, etc)

max width and height for csc (ffmpeg ML related message), vpx and x264: find the actual limits from somewhere (API?) rather than hard-coding to 4k*4k

rename x264 to h264 as encoding option (and keep "x264" in encodings for backwards compatibility)

move memalign to lib with pxd for importing into python code when we create image buffers

encoder/csc init: probe and maybe test speed (also useful to test it does work!) - ie: could test csc a 1k*1k random/test buffer to figure out latency

vpx cannot do speed or quality for now: we should skip the calculations! (keep caps telling us what we can do for the current encoder)

maybe add command line options for enabling/disabling/prioritizing encoder backends

if window size has changed just a little, we could send the edges as per the one-pixel workaround (which may well need generalizing anyway when dealing with thins like scaling), at least a few times before doing the full re-init which sends a new (expensive) iframe. And by that point the window may have been resized some more - so we don't lose video delta benefits.

the encoder specs should be static: the pipeline class should have a single copy of all encoder specs for all window sources. Then:

the encoder specs can have guesstimates and for some enc/csc, and runtime values too: we return objects, so these can be updated continuously by the encoder/csc

setup cost should be dynamic: decrease as hardware encoding availability decreases (so we then become even less likely to use hardware for small windows)

rename rgb_format to pixel_format

totaam · 2013-05-31T14:13:20Z

2013-05-31 14:13:20: ahuillet commented

Pull - from where?

totaam · 2013-06-03T12:55:17Z

2013-06-03 12:55:17: totaam commented

Mostly done in:

r3540, r3541, r3542, r3543, r3544: preparatory work

r3545: the big one, lots of details in commit message

r3546, r3547 and r3548: build fixes

Not closing this ticket yet, what is left to do:

handle dimensions restrictions generically, and for scaled down dimensions too: expose it as an encoder spec property (so x264 can have its even window dimensions)

move ImageWrapper and maybe refactor it with the XImageWrapper

zero-copy pixels in and out (also look at vpx_img_wrap for vpx)

nogil option on non-python buffers

let client backing class tell the server if it can use scaling (currently just per client)

let client backing class tell the server which csc modes are allowed (currently just per client)

handle scaling in GL client code

handle csc client side for GL rendering (as GL cannot handle ARGB - but we may still want to use ARGB server side to reduce overall latency)

And maybe also:

swscale could tune the speed/flags at runtime by doing a full re-init (which is fast/cheap)

handle transient encoder failures: mark for retry/not

when trying out new pipelines, keep the old one around in case we need to fallback to it (the new one does not work out)

totaam · 2013-06-06T16:46:02Z

2013-06-06 16:46:02: totaam changed status from assigned to new

totaam · 2013-06-06T16:46:02Z

2013-06-06 16:46:02: totaam changed owner from antoine to ahuillet

totaam · 2013-06-06T16:46:02Z

2013-06-06 16:46:02: totaam commented

More improvements::

imagewrapper changes in r3552

nogil restored in r3572

less pixel copying in r3570 and r3568

handle size restrictions generically r3588
and lots of cleanups in between.
[[BR]]

Which only leaves needing fixing for this ticket:

the OpenGL backing (ideally, having the ability to use RGBP pixel data directly, handling scaling would also be nice)

dec_avcodec needs to let us manage the buffers as we see fit #350 smarter buffer allocation/free code

Other things we can deal with later (see also lists in previous comments):

finding real csc/encoder limits at runtime (would be worth at least testing what works and what doesn't to make sure the codec_spec is more or less accurate - finding out later when we try to use it is more expensive)

client_backing could tell the server which csc modes it will handle (alternative to fixing gl backing to handle RGBP)

totaam · 2013-06-10T13:58:07Z

2013-06-10 13:58:07: totaam commented

and... scaling brought back an old bug:
non-existing PPS 0 referenced
After spending a lot of time debugging and inspecting the server-side code (which looks fine), I managed to reproduce it with client side debugging switched on and found:
paint_with_video_decoder: window dimensions have changed from (421, 374) to (420, 374)
paint_with_video_decoder: new <type 'xpra.codecs.dec_avcodec.decoder.Decoder'>(420,374,YUV420P)
paint_with_video_decoder: info={'width': 420, 'colorspace': 'YUV420P', 'type': 'x264', 'height': 374}
[h264 @ 0x7f2b8836af00] non-existing PPS 0 referenced
[h264 @ 0x7f2b8836af00] decode_slice_header error
[h264 @ 0x7f2b8836af00] no frame!
Error while decoding frame
2013-06-10 19:04:51,472 draw error
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/client/ui_client_base.py", line 1004, in _do_draw
    window.draw_region(x, y, width, height, coding, data, rowstride, packet_sequence, options, [record_decode_time])
  File "/usr/lib64/python2.7/site-packages/xpra/client/client_window_base.py", line 258, in draw_region
    self._backing.draw_region(x, y, width, height, coding, img_data, rowstride, options, callbacks)
  File "/usr/lib64/python2.7/site-packages/xpra/client/window_backing_base.py", line 348, in draw_region
    self.paint_with_video_decoder(x264_Decoder, "x264", img_data, x, y, width, height, options, callbacks)
  File "/usr/lib64/python2.7/site-packages/xpra/client/window_backing_base.py", line 258, in paint_with_video_decoder
    coding, len(img_data), width, height, options))
Exception: paint_with_video_decoder: x264 decompression error on 1044 bytes of picture data for 420x374 pixels, \
    options={'quality': 29, 'frame': 2, 'speed': 100, 'csc': 'YUV420P'}
For the record, what is happening here is that we were sending the raw dimensions of the XImageWrapper we encode from as being the dimensions we can paint with, and these new dimensions could get clipped by x264's size restrictions (even dimensions only) to the same size as the current encoder pipeline (ie: when resizing a window to a size one pixel shorter) which means the current encoding pipeline was still valid, but this caused the client side code to erroneously think that it needed a new decoder pipeline to deal with the packet... (cleaning the old pipeline, causing the new video packet to reference a frame that has been discarded).
A somewhat ugly solution is to pass the real dimensions via client_options, see attached patch. A better solution will be committed.

totaam · 2013-06-10T13:58:44Z

2013-06-10 13:58:44: totaam uploaded file `actual-video-size.patch` (2.2 KiB)

use the correct video size when decoding (hackish solution)

totaam · 2013-06-17T13:22:12Z

2013-06-17 13:22:12: totaam changed status from new to closed

totaam · 2013-06-17T13:22:12Z

2013-06-17 13:22:12: totaam changed resolution from ** to fixed

totaam · 2013-06-17T13:22:12Z

2013-06-17 13:22:12: totaam commented

The correct solution for the client decoding error was merged in r3607
(note: this was not caused by scaling after all)

Also merged a workaround for GL window's lack of RGB support in r3612 (+ r3617 for OR windows)

As for finding the encoder limits, we'll deal with that when we find problems.
(note: this needs to go in the codec_spec rather than finding out too late when we try to instantiate a new pipeline)

This is enough for now, will follow up in #350, etc

totaam · 2013-07-18T06:28:24Z

2013-07-18 06:28:24: antoine commented

Note: all the csc/encoder info is now exported via xpra info:

window[2].csc=swscale
window[2].csc.dst_format=YUV444P
window[2].csc.dst_height=710
window[2].csc.dst_width=1364
window[2].csc.flags=BICUBLIN | SWS_ACCURATE_RND
window[2].csc.frames=6
window[2].csc.pixels_per_second=70503532
window[2].csc.src_format=BGRX
window[2].csc.src_height=710
window[2].csc.src_width=1364
window[2].csc.total_time_ms=82
window[2].encoder=x264
window[2].encoder.frames=6
window[2].encoder.height=710
window[2].encoder.pixels_per_second=23144142
window[2].encoder.preset=medium
window[2].encoder.profile=high444
window[2].encoder.src_format=YUV444P
window[2].encoder.total_time_ms=251
window[2].encoder.width=1364

[[BR]]
And we also have the scores for figuring out what other encoding/csc options were available and how they ranked:

window[3].pipeline_option[0].csc=codec_spec(xpra.codecs.csc_swscale.colorspace_converter.ColorspaceConverter)
window[3].pipeline_option[0].encoder=codec_spec(xpra.codecs.enc_x264.encoder.Encoder)
window[3].pipeline_option[0].format=YUV444P
window[3].pipeline_option[0].score=48
window[3].pipeline_option[1].csc=codec_spec(xpra.codecs.csc_swscale.colorspace_converter.ColorspaceConverter)
window[3].pipeline_option[1].encoder=codec_spec(xpra.codecs.enc_x264.encoder.Encoder)
window[3].pipeline_option[1].format=YUV422P
window[3].pipeline_option[1].score=44
window[3].pipeline_option[2].csc=codec_spec(xpra.codecs.csc_swscale.colorspace_converter.ColorspaceConverter)
window[3].pipeline_option[2].encoder=codec_spec(xpra.codecs.enc_x264.encoder.Encoder)
window[3].pipeline_option[2].format=YUV420P
window[3].pipeline_option[2].score=26

How they are ranked will depend on the current quality and speed settings which is also available:

window[2].quality.cur=62
window[2].speed.cur=34

totaam · 2014-05-19T13:38:54Z

2014-05-19 13:38:54: totaam commented

(setting correct milestone the work was completed in)

totaam closed this as completed Jun 17, 2013

totaam mentioned this issue Jan 22, 2021

cuda csc #384

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

split yuv into its own pseudo codec #328

split yuv into its own pseudo codec #328

totaam commented Apr 30, 2013

totaam commented Apr 30, 2013

totaam commented Apr 30, 2013

totaam commented May 4, 2013

totaam commented May 4, 2013

totaam commented May 4, 2013

totaam commented May 17, 2013

totaam commented May 26, 2013

totaam commented May 26, 2013

totaam commented May 26, 2013

totaam commented May 27, 2013

totaam commented May 28, 2013

totaam commented May 31, 2013

totaam commented May 31, 2013

totaam commented Jun 3, 2013

totaam commented Jun 6, 2013

totaam commented Jun 6, 2013

totaam commented Jun 6, 2013

totaam commented Jun 10, 2013

totaam commented Jun 10, 2013

totaam commented Jun 17, 2013

totaam commented Jun 17, 2013

totaam commented Jun 17, 2013

totaam commented Jul 18, 2013

totaam commented May 19, 2014

split yuv into its own pseudo codec #328

split yuv into its own pseudo codec #328

Comments

totaam commented Apr 30, 2013

2013-04-30 15:23:44: totaam created the issue

totaam commented Apr 30, 2013

2013-04-30 15:24:09: totaam uploaded file yuv-codec.patch (15.5 KiB)

totaam commented Apr 30, 2013

2013-04-30 15:39:31: ahuillet commented

totaam commented May 4, 2013

2013-05-04 14:57:20: antoine changed status from new to assigned

totaam commented May 4, 2013

2013-05-04 14:57:20: antoine changed owner from ** to antoine

totaam commented May 4, 2013

2013-05-04 14:57:20: antoine commented

totaam commented May 17, 2013

2013-05-17 14:36:24: antoine uploaded file yuv-codec-v2.patch (9.6 KiB)

totaam commented May 26, 2013

2013-05-26 10:40:44: ahuillet commented

totaam commented May 26, 2013

2013-05-26 10:58:59: ahuillet commented

totaam commented May 26, 2013

2013-05-26 16:18:58: totaam commented

totaam commented May 27, 2013

2013-05-27 12:31:28: ahuillet commented

totaam commented May 28, 2013

2013-05-28 10:09:59: totaam uploaded file yuv-codec-v3.patch (93.7 KiB)

totaam commented May 31, 2013

2013-05-31 12:28:41: totaam commented

totaam commented May 31, 2013

2013-05-31 14:13:20: ahuillet commented

totaam commented Jun 3, 2013

2013-06-03 12:55:17: totaam commented

totaam commented Jun 6, 2013

2013-06-06 16:46:02: totaam changed status from assigned to new

totaam commented Jun 6, 2013

2013-06-06 16:46:02: totaam changed owner from antoine to ahuillet

totaam commented Jun 6, 2013

2013-06-06 16:46:02: totaam commented

totaam commented Jun 10, 2013

2013-06-10 13:58:07: totaam commented

totaam commented Jun 10, 2013

2013-06-10 13:58:44: totaam uploaded file actual-video-size.patch (2.2 KiB)

totaam commented Jun 17, 2013

2013-06-17 13:22:12: totaam changed status from new to closed

totaam commented Jun 17, 2013

2013-06-17 13:22:12: totaam changed resolution from ** to fixed

totaam commented Jun 17, 2013

2013-06-17 13:22:12: totaam commented

totaam commented Jul 18, 2013

2013-07-18 06:28:24: antoine commented

totaam commented May 19, 2014

2014-05-19 13:38:54: totaam commented

2013-04-30 15:24:09: totaam uploaded file `yuv-codec.patch` (15.5 KiB)

2013-05-17 14:36:24: antoine uploaded file `yuv-codec-v2.patch` (9.6 KiB)

2013-05-28 10:09:59: totaam uploaded file `yuv-codec-v3.patch` (93.7 KiB)

2013-06-10 13:58:44: totaam uploaded file `actual-video-size.patch` (2.2 KiB)