-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sqitch revert takes very long (minutes) on individual changes with a high number of dependencies on Snowflake #571
Comments
Yikes! That's no fun. Looks like that query gets called for each change to be reverted, to make sure no other changes depend on it: sqitch/lib/App/Sqitch/Engine.pm Lines 626 to 645 in a2c9205
The query is defined by the If someone could rewrite the query to return the proper results for a list of changes, so it could be called just once per revert, I think that'd make Snowflake reverts a lot less time-consuming. Better yet rewrite the default query, which uses a correlated subquery, which wasn't supported by Snowflake at the time I added Snowflake support to Sqitch, but maybe does now? That query is: sqitch/lib/App/Sqitch/Role/DBIEngine.pm Lines 617 to 633 in a2c9205
|
I've just executed the correlated subquery on a Snowflake account and it still does not work, but how about the following? SELECT C.CHANGE_ID
, C.PROJECT
, C.CHANGE
, T.TAG AS ASOF_TAG
FROM SQITCH.DEPENDENCIES D
INNER JOIN SQITCH.CHANGES C
ON C.CHANGE_ID = D.CHANGE_ID
LEFT JOIN (
SELECT CHANGES.PROJECT
, CHANGES.COMMITTED_AT
, TAG
, ROW_NUMBER() OVER (PARTITION BY CHANGES.PROJECT ORDER BY CHANGES.COMMITTED_AT) AS RNK
FROM SQITCH.CHANGES
JOIN SQITCH.TAGS
ON CHANGES.CHANGE_ID = TAGS.CHANGE_ID
) T
ON C.PROJECT = T.PROJECT
AND T.COMMITTED_AT >= C.COMMITTED_AT
WHERE (T.RNK IS NULL OR T.RNK = 1) |
That still would be executed multiple times, though, right? Ideally, we could design a single query to take an array of change IDs, and return their dependencies — but only if the dependent IDs would not be reverted first. For example, say we have this plan:
Say they are all deployed, and we want to revert to
Before we do, we need to make sure that none have dependencies that would be broken by reverting. If we just query naively, we would see that So the trick is to have a way to get back the list of dependencies only if they would not be removed by an earlier reversion, all before we do any reverting. One query to do that is surely possible, but I punted on figuring it out when I wrote Sqitch, as it wasn't really an issue until Snowflake, where queries are so expensive. Perhaps now is the time to figure it out. Do you have suggestions, @dlawrences? |
Hi team
I am in the process of finalising a CI/CD implementation of Sqitch over Snowflake and I have observed that the process of reverting changes with a high number of dependencies (direct and indirect) takes very long (2-3 minutes per change if not more).
I've experienced this with, for example, the following dependency chain:
It seems Sqitch will issue the following individual SQL statement for any specific dependency in the dependency chain:
Now, given SnowSQL requests actually take that long, this is turning quickly into a "snowball" effect. I've seen the statement above being executed 50-100 times per one change being reverted. Is there anything that can be done to batch up that query for multiple dependencies at once?
Thanks
The text was updated successfully, but these errors were encountered: