Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move bits to native code? #670

Closed
xmo-odoo opened this issue Aug 14, 2018 · 5 comments
Closed

Move bits to native code? #670

xmo-odoo opened this issue Aug 14, 2018 · 5 comments
Labels
performance Too slow renderings

Comments

@xmo-odoo
Copy link

Colleagues have been looking at weasyprint as a possible replacement for wkhtml as it's definitely more approachable & debuggable (than a huge complex pile of C++ especially for people with limited knowledge of C++ but good knowledge of Python), seems well-built and can be both used as a library and shelled out. And is less likely to segfault and regress than wk.

The sticking point so far is performances, as known from various issues on the tracker. Now from a test report the only bit which stands out would be get_all_computed_style which is 40% according to cprofile or 2025% according to flamegraph (the report is not the most complex but is pretty large and has a lot of non-specific CSS to deal with) with the rest being spent in a large and slowly tapering crenelated tower of layout things (an other hotspot of sorts is iter_lines but that's likely in large part because it's called a lot and ultimately does quite a bit of work).

And so I've been wondering if maybe subsystems should be moved to native code entirely, and if that sounds fine which tech you'd prefer (e.g. cython, python extension, move more code on the other side of CFFI, and in the latter two cases whether the native bits should rather be in C or Rust or something else).

@Tontyna
Copy link
Contributor

Tontyna commented Aug 16, 2018

Only recently started learning how Python packages are deployed, I'm wondering what you mean by 'native code'? Library files, aka .pyd/.dll/.so? Would than mean that everybody who wants to contribute to WeasyPrint had to setup the appropriate SDK (s)?

Wondering also which parts of WeasyPrint could be made native and whether the nativeness would actually improve performance -- as you pointed out lot of time is spent with layout things (slowly tapering crenelated tower, wonderful metaphor ❤️ ). IMHO optimization of the layout code should be the first option before trying to rewrite (parts of) it in another language...

@liZe liZe added the performance Too slow renderings label Aug 16, 2018
@xmo-odoo
Copy link
Author

Only recently started learning how Python packages are deployed, I'm wondering what you mean by 'native code'? Library files, aka .pyd/.dll/.so?

Yes.

Would than mean that everybody who wants to contribute to WeasyPrint had to setup the appropriate SDK (s)?

Depends whether the Python code remains as fallback or not.

IMHO optimization of the layout code should be the first option

I agree but according to the PRs/issues there have already been a number of these (e.g. #384, #638), and while I admittedly have almost no knowledge of the codebase aside from get_all_computed_style nothing really sticks out from the profiles of my specific test:

cProfile graph

weasy

flame graph

weasy

(you may want to click the image to open them at their full size)

@Tontyna
Copy link
Contributor

Tontyna commented Aug 18, 2018

Native code ist faster. But...

I confess: I dont like the idea very much, only as a last resort after cleaning up and optimizing the Python code.

Of course it's a decision @liZe has to make, though I suppose he isn't keen on distributing platform-specific wheels instead of a single none-any.whl.

If there was a couple of contibutors, being able and willing to write and maintain native code, this was maybe an option ... if the queen had balls, she'd be king.

@liZe
Copy link
Member

liZe commented Sep 5, 2018

It's a complex question.

while I admittedly have almost no knowledge of the codebase aside from get_all_computed_style nothing really sticks out from the profiles of my specific test:

get_all_computed_style is slow, but profiling results are really different depending on the documents you're converting. Of course, increasing its speed will help everybody, but it is really important for your needs, not for everybody's needs.

And so I've been wondering if maybe subsystems should be moved to native code entirely, and if that sounds fine which tech you'd prefer

Using a fast language may look as a good idea, but downsides are really important:

  • platform-specific wheels need a lot of work at the beginning and some work for each release,
  • installation problems will appear on non-automatically tested platforms (yes, I think of Windows),
  • developers and maintainers will need to know 2 languages,
  • code is complex with a very high-level language, it will become even more complex (at least bigger) with C / Rust / whatever…

There's a lot of work that can be done to improve WeasyPrint's speed. Using a faster language is an option, but I think that there's a lot to do before.

A golden rule I've learned about optimization is: "Don't try to make a function execute faster, try not to call it". WeasyPrint is the perfect project to follow this rule, because its code is often naive and does too many things that could be avoided. You're right, iter_lines is for example a good starting point as lines are often rendered many times during the layout when a smart small cache could avoid this.

@xmo-odoo
Copy link
Author

xmo-odoo commented Sep 5, 2018

Fair enough.

@xmo-odoo xmo-odoo closed this as completed Sep 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Too slow renderings
Projects
None yet
Development

No branches or pull requests

3 participants