Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Use RLS clause instead of ID for cache key #25229

Merged
19 changes: 9 additions & 10 deletions superset/security/manager.py
Original file line number Diff line number Diff line change
Expand Up @@ -2083,28 +2083,27 @@ def get_rls_filters(self, table: "BaseDatasource") -> list[SqlaQuery]:
)
return query.all()

def get_rls_ids(self, table: "BaseDatasource") -> list[int]:
def get_rls_filter_clauses(self, table: "BaseDatasource") -> list[str]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we write a small unit test for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

"""
Retrieves the appropriate row level security filters IDs for the current user
Retrieves the appropriate row level security filters clauses for the current user
and the passed table.

:param table: The table to check against
:returns: A list of IDs
:returns: A list of clauses
"""
ids = [f.id for f in self.get_rls_filters(table)]
ids.sort() # Combinations rather than permutations
return ids
clauses = [f.clause for f in self.get_rls_filters(table)]
clauses.sort() # Combinations rather than permutations
return clauses
Copy link

@zephyring zephyring Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can still sort by ID but return a list of clause. Is there any case where clause can be None?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clause is not nullable at the DB level, but I think that's still a good idea since there could in theory be an edge case where there are different RLS filters with the same clause but different id's/type (base vs regular).

Updated to sort by id and concatenate the id with the clause


def get_guest_rls_filters_str(self, table: "BaseDatasource") -> list[str]:
return [f.get("clause", "") for f in self.get_guest_rls_filters(table)]

def get_rls_cache_key(self, datasource: "BaseDatasource") -> list[str]:
rls_ids = []
rls_clauses = []
if datasource.is_rls_supported:
rls_ids = self.get_rls_ids(datasource)
rls_str = [str(rls_id) for rls_id in rls_ids]
rls_clauses = self.get_rls_filter_clauses(datasource)
guest_rls = self.get_guest_rls_filters_str(datasource)
return guest_rls + rls_str
return guest_rls + rls_clauses

@staticmethod
def _get_current_epoch_time() -> float:
Expand Down