-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sqllab] Rendering query prior to sending to Celery #4669
[sqllab] Rendering query prior to sending to Celery #4669
Conversation
I thought there was custom logic already to not require |
b3329ca
to
b19c249
Compare
@mistercrunch here's the stack trace:
|
b19c249
to
95bd382
Compare
Codecov Report
@@ Coverage Diff @@
## master #4669 +/- ##
==========================================
+ Coverage 71.87% 71.88% +<.01%
==========================================
Files 204 204
Lines 15323 15320 -3
Branches 1177 1177
==========================================
- Hits 11014 11013 -1
+ Misses 4306 4304 -2
Partials 3 3
Continue to review full report at Codecov.
|
LGTM? |
@mistercrunch even though in How do you feel about the change? Note I'm somewhat perplex how other portions of |
It's a bit hard to read the PR since Github doesn't highlight the right lines at all here. Crazy how an indent completely confuses their diffing function. But even with the Flask context manager you still won't get the user info I'm guessing as the user is not authenticated on the celery side of things. I guess it just fixes that error, but probably does not impersonate the right user deep down the call stack. There should not be an app context at that point, so my thought is we have to clean up the reliance on |
Seems like the template processor needs to be aware of the Passing it through gets pretty deep and nasty. Maybe mimicking the Flask app context isn't a bad idea, but I'm unclear on how to do that. At first I thought "would it be crazy would the celery worker to call REST endpoints?" That's probably a very bad idea. Now I'm thinking "how about doing the template rendering ahead of time in the web request and putting the rendered template in the db for the worker to run?". |
@mistercrunch I also raised the idea of pre-computing information on the Flask server which is inline with your last comment. I sense the Celery worker should be app agnostic and simply run the query. |
Agreed, it took me a moment to catch up with your thinking process. Had to go through similar reasoning. For context at the time where the decision was made to process the template on the worker, there was no impersonation going on, and it somewhat made sense to do more work on the worker. At this point I think it makes sense to render the template on the web server. It shouldn't affect query time, only pending time may appear a little longer. |
@mistercrunch how does this existing logic work , i.e., |
dea1f82
to
bd0876e
Compare
The whole It could be nice to further dissociate the flask app from the celery worker, but currently they share their configuration which depends on Flask. Is it a problem that the app context is defined though? |
bd0876e
to
6fe2ea1
Compare
@mistercrunch I've updated the logic to pass through the rendered query to the Celery task. |
It would be nice to have a unit test that would cover both |
@mistercrunch would we need to mock the Presto connection? I haven't seen any Presto specific tests. |
a30329a
to
6d031f8
Compare
Yes you're right there would be mocking involved. It might be tricky and out-of-scope for this PR here. Happy to merge as is. |
6d031f8
to
d41d747
Compare
@mistercrunch are you ok with merging this PR? A number of our users are running into this issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
(cherry picked from commit d49a0e7)
(cherry picked from commit d49a0e7)
(cherry picked from commit d49a0e7)
We're running the Celery workers via
rather than
which throws a
RuntimeError: Working outside of application context.
when trying to use templates of the form'{{ presto.latest_partition("schema.table") }}'
which doesn't have access to the Flaskg
variable which is used to determine whether the user has access to said table.Note I'm not certain what logic should reside in Celery and what should be pre-computed on the Flask server, or whether the Celery worker should even be Flask aware, but this PR simply enforces that the template processing occurs within an application context.
to: @michellethomas @mistercrunch @timifasubaa