You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
There have been discussions recently regarding the desire for users who prefer to write their own SQL queries to be able to make more use of FlowMachine's caching.
It seems to me that we could go some way towards supporting this use case by making it possible to pass query objects as sub-queries of a CustomQuery.
We could do this by adding a new, optional argument to CustomQuery (e.g. dependencies) which accepts a dict of Query objects, which could be matched to placeholders in the SQL string.
Something like:
CustomQuery(
sql="""
SELECT ...
FROM {another_custom_query}
INNER JOIN {daily_location}
ON ...
""",
dependencies={
"another_custom_query": CustomQuery(...),
"daily_location": daily_location(...)
},
columns=...,
)
The dependencies could then be included in query.dependencies(), and query._make_query would return something along the lines of
self.sql.format({name: query.get_query() for name, query in dependencies.items()})
which would then make use of the cached dependencies where they exist.
It woul be good to hear from FlowMachine users / potential users as to whether this is something that would be useful in practice.
The text was updated successfully, but these errors were encountered:
Expanding on the above - custom query hashing is dependent on being able to normalise the input SQL, which we do by parsing it against the PostgreSQL parser. Obviously that's going to fail if we give it a template string instead of valid SQL, but waiting on the string to be generated is a blocking op, which we don't want to do in the __init__ - at that point, you might as well just generate the sql and pass it in to the existing class.
Maybe one could fill the gaps for hashing purposes with valid-but-meaningless SQL? e.g. sub in (select 1) as another_custom_query?
We could perhaps fill the gaps with select * from {another_custom_query.fully_qualified_table_name} for hashing (i.e. for hashing purposes, assume all sub-queries are already cached)?
There have been discussions recently regarding the desire for users who prefer to write their own SQL queries to be able to make more use of FlowMachine's caching.
It seems to me that we could go some way towards supporting this use case by making it possible to pass query objects as sub-queries of a
CustomQuery
.We could do this by adding a new, optional argument to
CustomQuery
(e.g.dependencies
) which accepts a dict of Query objects, which could be matched to placeholders in the SQL string.Something like:
The dependencies could then be included in
query.dependencies()
, andquery._make_query
would return something along the lines ofwhich would then make use of the cached dependencies where they exist.
It woul be good to hear from FlowMachine users / potential users as to whether this is something that would be useful in practice.
The text was updated successfully, but these errors were encountered: