-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement cache invalidation api #10761
Conversation
superset/cache/api.py
Outdated
return self.response_400(message="Request is incorrect") | ||
except ValidationError as error: | ||
return self.response_400( | ||
message=_("Request is incorrect: %(error)s", error=error.messages) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Just use: return self.response_400(message=error.messages)
210cffd
to
2028348
Compare
Codecov Report
@@ Coverage Diff @@
## master #10761 +/- ##
==========================================
- Coverage 61.32% 60.99% -0.33%
==========================================
Files 803 806 +3
Lines 37927 38002 +75
Branches 3561 3561
==========================================
- Hits 23258 23180 -78
- Misses 14483 14636 +153
Partials 186 186
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
942ede8
to
f119c85
Compare
@@ -76,6 +76,14 @@ def GET_FEATURE_FLAGS_FUNC(ff): | |||
REDIS_PORT = os.environ.get("REDIS_PORT", "6379") | |||
REDIS_CELERY_DB = os.environ.get("REDIS_CELERY_DB", 2) | |||
REDIS_RESULTS_DB = os.environ.get("REDIS_RESULTS_DB", 3) | |||
REDIS_CACHE_DB = os.environ.get("REDIS_CACHE_DB", 4) | |||
|
|||
CACHE_CONFIG = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needed for cache to work on CI
@willbarrett , @dpgaspar - ready for review |
@@ -0,0 +1,98 @@ | |||
# Licensed to the Apache Software Foundation (ASF) under one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the route /api/v1/cachekey/invalidate
is not ideal. I'm struggling to find something good though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree, I had the same struggles, definitely open to suggestions.
/api/v1/cache/invalidate could be slightly better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about DELETE /api/v1/cachekey/<cachekey>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be too much struggle to fetch cache keys though the API and then issue deletes for those.
I think we could expose GET / DELETE later once those use cases will arise
@statsd_metrics | ||
def invalidate(self) -> Response: | ||
""" | ||
Takes a list of datasources, finds the associated cache records and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note (outside the scope of this PR), we'll be renaming datasource to dataset. I think it's mostly done in the UI layer, but has yet to be done in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, I can do it in the separate PR if that works for you, would rename it in API & CacheKey model for consistency.
4bfc2d3
to
49e0eea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a couple of comments
superset/cachekeys/api.py
Outdated
|
||
class CacheRestApi(BaseSupersetModelRestApi): | ||
datamodel = SQLAInterface(CacheKey) | ||
resource_name = "cache" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: since this was renamed to cache, evaluate if placing everything on cachekeys
still makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed to cachekey, I am happy with either way naming it cachekey
or cache
I was thinking about moving it into separate class from the cache key, but most of the instrumentation lives in BaseSupersetModelRestApi that requires to model.
superset/cachekeys/api.py
Outdated
.all() | ||
) | ||
if cache_keys: | ||
cache_manager.cache.delete_many(*[c.cache_key for c in cache_keys]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of questions:
- if a cache
delete_many
fails, we may end up with delete keys on the cache that still exist onCacheKey
will a next run ofdelete_many
fail if it can't find a key? - the same applies if a delete
CacheKey
fails, would also be safer to try/except and rollback
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete_many doesn't fail, only returns true / false if the keys were deleted. e.g. if the key was not found it will return false. Updated CacheKey logic to delete is a single statement & rollback if fails -> that would be more robust.
deleting case & keeping records is not too bad as there still a need to garbage collect those and cache is not a persistent store.
@@ -69,6 +71,39 @@ def get_resp( | |||
return resp.data.decode("utf-8") | |||
|
|||
|
|||
def post_assert_metric( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the same but on a different place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes same code, moved it into this function for reuse in the pytests
115de31
to
fc4f9d2
Compare
682007a
to
ecb9672
Compare
ecb9672
to
b87299f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want all the ModelRestApi
route methods to be available? If not include include_route_methods = {"invalidate",}
also note this will create a new specific permission can invalidate on Cache key
.
Check https://localhost:8080/swagger/v1
addressed, thanks |
* Add cache endpoints * Implement cache endpoint * Tests and address feedback * Set cache config * Address feedback * Expose only invalidate endpoint Co-authored-by: bogdan kyryliuk <[email protected]>
Roadmap: apache-superset/superset-roadmap#74 |
|
||
@expose("/invalidate", methods=["POST"]) | ||
@event_logger.log_this | ||
@protect() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this API is not actually protected---it's probably missing a @permission_name("edit")
. Anyone can trigger cache key invalidation with a cURL:
curl -X POST --data '{"datasource_uids": ["3__table"]}' --header "Content-Type: application/json" "http://localhost:8088/api/v1/cachekey/invalidate"
Is this by design?
cc @bkyryliuk
SUMMARY
Implements a way to invalidate cached data for the provide datasources.
TEST PLAN
ADDITIONAL INFORMATION