Is your feature request related to a problem? Please describe.
Currently, Union produces a union of two queries. Taking a union of more than two queries can be achieved by recursively applying Union (i.e. a.union(b).union(c).union(d)), but this results in a deeply nested structure (both within the SQL query and in terms of the Python Query objects), which may cause performance issues and could also result in excessive duplication of results in cache (e.g. a.union(b).union(c).union(d).store(store_dependencies=True) would produce cache tables for a, b, aUb, c, aUbUc, d and aUbUcUd).
Describe the solution you'd like
It would be useful if Union could take an arbitrary number of queries as arguments (e.g. Union(a, b, c, d, all=True) would union all four queries in one shot). This could be achieved either through arbitrary *args (which would remain compatible with the current implementation provided top and bottom are always provided as positional args and all is always explicitly provided as a kwarg) or a single list argument.
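To make the *args option concrete, here is a minimal sketch of what a variadic Union might look like. It is not the existing implementation: the Query stand-in, the get_query method name, the keyword-only all parameter and its default of True are all assumptions made for illustration.

```python
class Query:
    """Hypothetical stand-in for a query object; it only holds raw SQL."""

    def __init__(self, sql):
        self.sql = sql

    def get_query(self):
        return self.sql


class Union(Query):
    """Sketch of a variadic union, e.g. Union(a, b, c, d, all=True)."""

    def __init__(self, *queries, all=True):
        # `all` is keyword-only because it follows *queries, so existing
        # two-positional-argument calls such as Union(top, bottom) still work.
        if len(queries) < 2:
            raise ValueError("Union requires at least two queries.")
        self.queries = queries
        self.all = all

    def get_query(self):
        # Join every sub-query at a single level instead of nesting unions.
        joiner = " UNION ALL " if self.all else " UNION "
        return joiner.join(f"({q.get_query()})" for q in self.queries)


a, b, c, d = (Query(f"SELECT {i} AS value") for i in range(1, 5))
flat_sql = Union(a, b, c, d, all=True).get_query()
```

With this shape, the four sub-queries appear side by side in one statement, so only a, b, c, d and the single combined union would be candidates for caching.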
Describe alternatives you've considered
We could make Union smarter, so that Union(Union(a, b), c) would flatten the inputs into a single non-nested sequence of unions. However, this would mean pre-cached unions couldn't be used as components of larger unions, which may be something we need sometimes (e.g. if a union is already computed but we later want to extend it with rows from another query, or if we have a very large number of sub-queries and the only feasible way to union them all is to do so in stages).
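For comparison, the flattening alternative could be sketched roughly as below. The Query and Union classes and the flatten helper are hypothetical and exist only to show the idea; the sketch also ignores the all flag, which a real version could not do, since merging a UNION with a UNION ALL can change the result.

```python
class Query:
    """Hypothetical leaf query, identified only by a name."""

    def __init__(self, name):
        self.name = name


class Union(Query):
    """Hypothetical union node holding its input queries."""

    def __init__(self, *queries):
        self.queries = queries


def flatten(query):
    """Return the leaf queries of a possibly nested union, left to right."""
    if isinstance(query, Union):
        leaves = []
        for sub_query in query.queries:
            leaves.extend(flatten(sub_query))
        return leaves
    return [query]


a, b, c = Query("a"), Query("b"), Query("c")
nested = Union(Union(a, b), c)
leaf_names = [q.name for q in flatten(nested)]
```

Here the inner Union(a, b) disappears from the flattened result, which is exactly the drawback described above: a union that is already cached would no longer be reused as a component.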