Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Delete data dump from S3 after data load has been successful #600

Merged
merged 57 commits into from
Oct 30, 2024

Conversation

antroy-madetech
Copy link
Contributor

@antroy-madetech antroy-madetech commented Oct 17, 2024

Addresses DBTP-1109

  • Delete data dump from S3 after data load has been successful
  • Delete the old database-copy Docker files
  • Move the new Docker files from database-copy2 into database-copy
  • Added a database copy command that rolls the dump and load commands together

Checklist:

Title:

  • Scope included as per conventional commits
  • Ticket reference included (unless it's a quick out of ticket thing)

Description:

  • Link to ticket included (unless it's a quick out of ticket thing)
  • Includes tests (or an explanation for why it doesn't)
  • If the work includes user interface changes, before and after screenshots included in description
  • Includes any applicable changes to the documentation in this code base
  • Includes link(s) to any applicable changes to the documentation in the DBT Platform Documentation (can be to a pull request)

Tasks:

class MaintenancePageProvider:
def __init__(self):
self.offline = offline_command
self.online = online_command


class DatabaseCopy:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about moving the classes etc. into a new src or similar directory and starting to tidily namespace it all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've had similar thoughts. "commands" module will be where the thin click wrappers will live. "domain" module for the business logic and "providers" module for the common code that connects with things outside the domain logic (most of the "utils" module probably belongs in here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'll write this up in the confluence doc for people to comment on)

] = MaintenancePageProvider(),
input_fn: Callable[[str], str] = click.prompt,
echo_fn: Callable[[str], str] = click.secho,
abort_fn: Callable[[str], None] = abort_with_error,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't really looked at this thoroughly, but this feels like a lot of things going into the constructor.

It makes me wonder whether the DatabaseCopy class is doing a bit much. Maybe we can separate out the copying work from the input/output/side effects?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think some of this stuff can be tidied up further. For example the input/echo/abort are all UI elements that can probably be moved into a single UserInterfaceProvider or something. Perhaps in a tidy up step after we get this command out for FFT to use.


- The application name. Required unless you are running the command from your deploy repo
- `--from <text>`
- This is required unless you are running the command from your deploy repo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? Looks from line 994 that --from is mandatory? Description should probably be:
The environment you are dumping data from

)
raise click.Abort
maintenance_page = MaintenancePageProvider()
maintenance_page.offline(app, env, svc, template, vpc)
Copy link
Contributor

@ksugden ksugden Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this code sounds a bit like the maintenance page is what is going offline. It's probably going to confuse people. Maybe we could improve naming:

    maintenance_page_provider.set_application_offline(app, env, svc, template, vpc)

or maybe:

    maintenance_page.activate(app, env, svc, template, vpc)

vpc_name = config.get("environments", {}).get(env, {}).get("vpc")
return vpc_name

def run_database_copy_task(
Copy link
Contributor

@ksugden ksugden Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts for later - we should have an EcsTaskService for doing this sort of thing

clean_up

echo "Stopping data load"
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bash script is quite unwieldy. Could we separate it into scripts for the different 'steps'? Maybe should consider something like https://aws.amazon.com/step-functions/features/?pg=ln&sec=hs in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll look at this in the tidy up step.

@@ -1,11 +0,0 @@
FROM postgres:16
Copy link
Contributor

@ksugden ksugden Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we able to clean up this directory?

@antroy-madetech antroy-madetech merged commit 410cd56 into main Oct 30, 2024
9 checks passed
@antroy-madetech antroy-madetech deleted the DBTP-1109-DataCopy_2 branch October 30, 2024 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants