-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add plan command to the CLI #31
Conversation
…ure/add-plan-command
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@b-per This pull request definitely does what it says on the tin. Having that been said, I'm not 100% in love with the architecture leveraged to implement the sync vs plan logic.
Currently, each API operation is co-located, with a basic condition against no_update
being checked to determine if the operation will be executed or planned. This works, but has the drawback that it's relatively brittle -- we can relatively easily introduce a defect that will result in an operation being executed instead of logged. For example, adding a new operation and forgetting to add the no_update
check!
I wonder what other architectures we could use here? One option that immediately comes to mind is create a "ChangeSet" data structure, which tracks which objects will be operated on, and the operations we will be performing. For example, creating a job, updating an environment_id, deleting an environment variable, etc. If we have a ChangeSet
encapsulate our set of operations, then plan would generate and report on that ChangeSet
, whereas sync
would apply that ChangeSet
against dbt Cloud.
I'm sure there are many other valid and safe approaches available, too!
@b-per, are there any other necessary and sufficient approaches that you'd like to experiment with? Alternatively, am I overthinking this existing approach?
I completely agree. This is super brittle and ugly 🤣 but it got the functionality out the door I like the idea of a I'll give it a go today! |
I have some code working at 90% but realized that this changeset brings one difficulty to the mix 😄 The logic is:
This flow breaks when we add a new job which has env var overwrite, because in that case, we collect an action to create the job, but the job is not created at that time. So we don't collect an action to check the env vars on this job. Right now, if we create a new job, after one I am going to think of an idea of how we can manage that but let me know if something comes to your mind directly on how to fix it. (it was not a problem with the old approach because we were executing each step one by one, so when checking the env var, the job was already created) |
Easy fix would be to trigger twice in a row the |
This class stores the changes identified in 2 ways: - a descriptive way, with an action/type/id - a programmatic way, on how it should be handled (func and params)
The logic becomes a bit more complex to handle the case where we setup env var overwrites in a new job that doesn't exist in Cloud yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@b-per This is a significantly more maintainable approach! Thank you for making these changes. Light testing suggests that this is working as expected ✅
🚢
def __str__(self): | ||
return f"{self.action.upper()} {string.capwords(self.type)} {self.identifier}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's room in the future for this to be a little more detailed (e.g. what was the value before, and what is it now?). This is a great start 👍🏻
identifier=identifier, | ||
type="job", | ||
action="update", | ||
sync_function=dbt_cloud.update_job, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really clever approach! I appreciate how configurable this callback function approach will be 🚀
Fixes #29
First idea on how we could tackle
plan
.Not a big fan of the UX around it and the function could be cleaner but I wanted to avoid copy/pasting most of the
sync
command.Here is the output of
python src/main.py plan <myfile>
I don't feel like having the
plan
output with the rest of the logs make it stand out enough.We should maybe collect the changes and show them all at the end of the command?