Improve GT.save() usability #499
Codecov Report
Coverage diff, main vs. #499:
            main      #499      +/-
Coverage    87.86%    89.38%    +1.52%
Files       42        45        +3
Lines       4852      5239      +387
Hits        4263      4683      +420
Misses      589       556       -33
Hello team, in my last commit I experimented with isolating the web driver preparation logic into [...]
great_tables/_utils_selenium.py
Outdated
    self.debug_port = debug_port
    self.wd_options = webdriver.ChromeOptions()
    super().__init__()
    self.driver = webdriver.Chrome(self.wd_options)
Could we consolidate the driver initialization across all these subclasses by adding cls_wd_options=ChromeOptions and cls_driver=Chrome as class attributes?
That way, the BaseClass could essentially require subclasses to specify those pieces. Then, the init would not have to be overridden by child classes.
This makes the child classes mostly about:
- specifying Driver and Option classes (as class attributes)
- customizing add_arguments method
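A rough sketch of that suggestion, not the code in this PR. The names cls_driver, cls_wd_options, add_arguments and debug_port come from the thread. BaseWebDriver, ChromeLike, FakeOptions and FakeDriver are hypothetical, and the two Fake classes stand in for Selenium's option and driver classes so the sketch runs without a browser.

```python
from __future__ import annotations

from typing import ClassVar


class FakeOptions:
    """Hypothetical stand-in for a Selenium options class such as ChromeOptions."""

    def __init__(self) -> None:
        self.arguments: list[str] = []

    def add_argument(self, arg: str) -> None:
        self.arguments.append(arg)


class FakeDriver:
    """Hypothetical stand-in for a Selenium driver class such as webdriver.Chrome."""

    def __init__(self, options: FakeOptions) -> None:
        self.options = options


class BaseWebDriver:
    # Subclasses are expected to set both class attributes.
    cls_driver: ClassVar[type]
    cls_wd_options: ClassVar[type]

    def __init__(self, debug_port: int | None = None) -> None:
        self.debug_port = debug_port
        # The class attribute holds the options *class*; each instance builds
        # its own options object, so two drivers never share option state.
        self.wd_options = self.cls_wd_options()
        self.add_arguments()
        self.driver = self.cls_driver(options=self.wd_options)

    def add_arguments(self) -> None:
        """Hook for subclasses to add browser-specific arguments."""


class ChromeLike(BaseWebDriver):
    cls_driver = FakeDriver
    cls_wd_options = FakeOptions

    def add_arguments(self) -> None:
        # Illustrative flags only; the PR may use a different set.
        self.wd_options.add_argument("--headless")
        if self.debug_port is not None:
            self.wd_options.add_argument(f"--remote-debugging-port={self.debug_port}")


first = ChromeLike(debug_port=9222)
second = ChromeLike()
print(first.wd_options.arguments)
print(second.wd_options.arguments)
```

Because the options object is created per instance inside __init__, two instances of the same subclass can end up with different options even though the options class is shared.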
@machow, thanks for the quick response.
Using cls_driver seems like a good idea, but I’m concerned that cls_wd_options might lead to issues. For example, what if we need to create two Chrome instances with different ChromeOptions in the future?
@machow, I think I understand now. Does the latest commit align with what you had in mind?
Yeah, thanks -- that looks really nice!
This is looking good! I had a look at the output images on all 4 webdrivers (at 3 different scale values) to make sure outputs are similar to [...]. I've found no reduction in quality with the changes here. Chrome is the only one that is problematic (it cuts off content at the bottom), but that's also on [...]
This PR is so helpful! Thanks for taking the time to work on .save(), which afaict is one of the most important parts of our API. My one request was that we remove **params so that PIL is an internal detail.
Excited to merge all these PRs, sorry for slowing your roll 😭
great_tables/_export.py
Outdated
_debug_dump
    Whether the saved image should be a big browser window, with key elements outlined. This is
    helpful for debugging this function's resizing, cropping heuristics. This is an internal
    parameter and subject to change.
**params
Is it okay if we remove the **params piece of this PR? I'm on board with everything else. Because **params only applies when we go from a .png -> something else, it exposes PIL as part of the user API.
If we leave it out, we can swap out PIL down the road. As an alternative, could we tell users that we're using Pillow? We could even tell them about PIL open and save if needed (so they can go from .png -> anything PIL supports on their own if more customization is needed).
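For reference, the do-it-yourself route described above might look roughly like this on the user side. It is a sketch that assumes Pillow is installed. convert_png and the file names are hypothetical, and the PNG is generated in place as a stand-in for a file written by GT.save().

```python
import tempfile
from pathlib import Path

from PIL import Image


def convert_png(png_path: Path, out_path: Path, **save_kwargs) -> Path:
    """Re-save a PNG in whatever format Pillow infers from out_path's suffix."""
    with Image.open(png_path) as img:
        # JPEG cannot store transparency, so drop the alpha channel first.
        if out_path.suffix.lower() in {".jpg", ".jpeg"} and img.mode != "RGB":
            img = img.convert("RGB")
        # Format-specific options (e.g. JPEG quality) are passed straight to Pillow.
        img.save(out_path, **save_kwargs)
    return out_path


with tempfile.TemporaryDirectory() as tmp:
    png = Path(tmp) / "table.png"  # stand-in for a file written by GT.save()
    Image.new("RGBA", (40, 20), "white").save(png)

    jpg = convert_png(png, Path(tmp) / "table.jpg", quality=90)
    with Image.open(jpg) as result:
        result_format, result_size = result.format, result.size
    print(result_format, result_size)
```

Keeping this step outside GT.save() is what lets the library treat Pillow as an internal detail.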
Thanks for the explanation. I agree that we should remove **params, making Pillow an internal dependency rather than a public one.
Regarding hints, I believe that diligent readers likely already recognize we use Pillow under the hood and would consult its documentation for further customization if needed.
This aligns with our goal of providing a user-friendly way to save generated tables rather than focusing on creating highly customized table figures in various formats 🤔.
As a side note, this PR does not address the cutoff bug encountered when using Chrome, as mentioned in #480.
LGTM!
Shoot, I just noticed the change to passing a base64 encoded URL directly to the headless browser. I think there could be some challenges with this approach (such as URL length limitations). In general, modern browsers seem to have large limits, but it's tricky territory. Opened an issue to track: [...]
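To give a feel for the sizes involved, here is a small sketch of how a base64 data URL grows with the HTML it carries. The text/html MIME type and the to_data_url and encoded_length helpers are assumptions for illustration. The PR's actual encoding goes through FmtImage._get_image_uri(), whose details are not shown in this thread.

```python
import base64

PREFIX = "data:text/html;base64,"


def to_data_url(html: str) -> str:
    """Encode an HTML string as a base64 data URL."""
    payload = base64.b64encode(html.encode("utf-8")).decode("ascii")
    return PREFIX + payload


def encoded_length(n_bytes: int) -> int:
    """Data URL length for n_bytes of input: prefix + 4 * ceil(n_bytes / 3)."""
    return len(PREFIX) + 4 * ((n_bytes + 2) // 3)


# A synthetic table with 10,000 rows, standing in for rendered table HTML.
html = "<table>" + "<tr><td>cell</td></tr>" * 10_000 + "</table>"
url = to_data_url(html)

print(f"HTML bytes:      {len(html.encode('utf-8'))}")
print(f"Data URL length: {len(url)}")
```

Base64 emits 4 characters for every 3 input bytes, so the URL is always about a third longer than the HTML itself. Whether that fits depends on the browser's URL limits.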
Related PR: #496.
This PR introduces several improvements to enhance the usability of GT.save():
- Added **params to allow advanced users to customize save parameters directly via Pillow.
- Reduced the .png extension check to a single line.
- [...] FmtImage._get_image_uri(), without using tempfile.TemporaryDirectory().
- Replaced time.sleep(0.05) with WebDriverWait(driver, 1).until() (we may need to determine the optimal timeout for most use cases).
- Removed the .png saving branch that relied on selenium; all files are now saved using Pillow.
- Updated GT.save() to return itself, allowing users to save intermediate tables. For example: [...]
Additionally, I noticed a potential speed boost (approximately 40-50% faster on my machine) when calling headless_browser=wdriver(options=wd_options) directly, bypassing the context manager. However, I'm unsure about the safety implications of this approach.
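One item in the description above swaps a fixed time.sleep(0.05) for WebDriverWait(driver, 1).until(). The idea behind that change is to poll a condition and stop as soon as it holds, with a timeout as the upper bound. The sketch below is a standard-library analogue of that polling pattern, not Selenium's implementation. wait_until and ready are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(condition: Callable[[], T], timeout: float = 1.0, poll: float = 0.05) -> T:
    """Call condition repeatedly until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# A stand-in condition that becomes true on the third check, much as a page
# element might only become available after a few polls.
calls = {"n": 0}


def ready() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


wait_until(ready, timeout=1.0, poll=0.01)
print(f"condition met after {calls['n']} checks")
```

Compared with a fixed sleep, this returns early when the page is ready quickly and fails loudly when it never becomes ready. The timeout still has to be chosen to cover the slowest realistic case.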