Create views for de-duping ETL record tables #5641

Closed
jc-harrison opened this issue Nov 24, 2022 · 0 comments · Fixed by #5642
Labels
enhancement New feature or request FlowDB Issues related to FlowDB FlowETL FlowMachine Issues related to FlowMachine

Comments

@jc-harrison
Member

FlowETL appends a record to etl.etl_records each time a new day of data is ingested. Flowmachine uses this table to determine which dates are available.

If a day of data is dropped from flowdb, the corresponding ETL record must be deleted or overwritten; otherwise flowmachine will still report the data as available. Since etl.etl_records has a 'timestamp' column, it would be easy enough to append a new row each time the ingest state changes and fetch the most recent record per (cdr_type, cdr_date) to determine available dates. This could be done with a view.
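
A minimal sketch of the "most recent record per (cdr_type, cdr_date)" idea, using Python's built-in sqlite3 as a stand-in for flowdb so it runs anywhere. The `state` column, its values ('ingested', 'dropped'), the view name `deduped_etl_records` and the helper `available_dates` are assumptions made for illustration, not the actual FlowDB schema. In PostgreSQL the same view could probably be written more compactly with `SELECT DISTINCT ON (cdr_type, cdr_date) ... ORDER BY cdr_type, cdr_date, timestamp DESC`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified stand-in for etl.etl_records; 'state' is a hypothetical column.
    CREATE TABLE etl_records (
        cdr_type TEXT, cdr_date TEXT, state TEXT, timestamp TEXT
    );

    -- Keep only the most recent row per (cdr_type, cdr_date).
    CREATE VIEW deduped_etl_records AS
    SELECT r.cdr_type, r.cdr_date, r.state, r.timestamp
    FROM etl_records AS r
    JOIN (
        SELECT cdr_type, cdr_date, MAX(timestamp) AS latest
        FROM etl_records
        GROUP BY cdr_type, cdr_date
    ) AS m
      ON r.cdr_type = m.cdr_type
     AND r.cdr_date = m.cdr_date
     AND r.timestamp = m.latest;
    """
)

# Rows are only ever appended; a drop is recorded as a new row.
conn.executemany(
    "INSERT INTO etl_records VALUES (?, ?, ?, ?)",
    [
        ("calls", "2022-11-01", "ingested", "2022-11-02T00:00:00"),
        ("calls", "2022-11-01", "dropped", "2022-11-10T00:00:00"),
        ("calls", "2022-11-02", "ingested", "2022-11-03T00:00:00"),
        ("sms", "2022-11-01", "ingested", "2022-11-02T00:00:00"),
    ],
)


def available_dates(connection):
    """Return (cdr_type, cdr_date) pairs whose latest record is 'ingested'."""
    return connection.execute(
        """
        SELECT cdr_type, cdr_date
        FROM deduped_etl_records
        WHERE state = 'ingested'
        ORDER BY cdr_type, cdr_date
        """
    ).fetchall()


print(available_dates(conn))
```

With this approach nothing ever has to be deleted from the records table: dropping a day just appends another row, and the view hides the superseded ones.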

Similarly, QA results will be written to etl_post_etl_queries each time a day is ingested. If a day of data is dropped and re-ingested multiple times, there will be multiple post-ETL query results for that date, and we're generally only interested in the most recent one (and only for dates that are currently available). This could also be simplified by a view.
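
The QA case could be sketched along the same lines, again with sqlite3 standing in for flowdb. Everything schema-related here is an assumption for illustration: the `qa_results` table and its `check_name` and `outcome` columns, the `state` column on the records table, and the view names `available_dates` and `latest_qa_results`. The view keeps the newest result per (cdr_type, cdr_date, check_name) and joins against the currently available dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, simplified schemas (not the real FlowDB tables).
    CREATE TABLE etl_records (
        cdr_type TEXT, cdr_date TEXT, state TEXT, timestamp TEXT
    );
    CREATE TABLE qa_results (
        cdr_type TEXT, cdr_date TEXT, check_name TEXT,
        outcome TEXT, timestamp TEXT
    );

    -- Dates whose most recent ETL record says the data is ingested.
    CREATE VIEW available_dates AS
    SELECT r.cdr_type, r.cdr_date
    FROM etl_records AS r
    WHERE r.state = 'ingested'
      AND r.timestamp = (
          SELECT MAX(x.timestamp) FROM etl_records AS x
          WHERE x.cdr_type = r.cdr_type AND x.cdr_date = r.cdr_date
      );

    -- Newest QA result per check, restricted to available dates.
    CREATE VIEW latest_qa_results AS
    SELECT q.cdr_type, q.cdr_date, q.check_name, q.outcome
    FROM qa_results AS q
    JOIN available_dates AS a
      ON a.cdr_type = q.cdr_type AND a.cdr_date = q.cdr_date
    WHERE q.timestamp = (
        SELECT MAX(x.timestamp) FROM qa_results AS x
        WHERE x.cdr_type = q.cdr_type
          AND x.cdr_date = q.cdr_date
          AND x.check_name = q.check_name
    );
    """
)

conn.executemany(
    "INSERT INTO etl_records VALUES (?, ?, ?, ?)",
    [
        # 2022-11-01 was ingested, dropped, then ingested again.
        ("calls", "2022-11-01", "ingested", "2022-11-02T00:00:00"),
        ("calls", "2022-11-01", "dropped", "2022-11-05T00:00:00"),
        ("calls", "2022-11-01", "ingested", "2022-11-06T00:00:00"),
        # 2022-11-02 was ingested and later dropped.
        ("calls", "2022-11-02", "ingested", "2022-11-03T00:00:00"),
        ("calls", "2022-11-02", "dropped", "2022-11-05T00:00:00"),
    ],
)
conn.executemany(
    "INSERT INTO qa_results VALUES (?, ?, ?, ?, ?)",
    [
        ("calls", "2022-11-01", "count_msisdn", "100", "2022-11-02T00:00:00"),
        ("calls", "2022-11-01", "count_msisdn", "120", "2022-11-06T00:00:00"),
        ("calls", "2022-11-02", "count_msisdn", "90", "2022-11-03T00:00:00"),
    ],
)

latest = conn.execute(
    "SELECT * FROM latest_qa_results ORDER BY cdr_type, cdr_date, check_name"
).fetchall()
print(latest)
```

Only the result from the second ingest of 2022-11-01 survives; the stale result for that date and the result for the dropped 2022-11-02 are both filtered out.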

@jc-harrison jc-harrison added enhancement New feature or request FlowMachine Issues related to FlowMachine FlowDB Issues related to FlowDB FlowETL labels Nov 24, 2022
@jc-harrison jc-harrison mentioned this issue Nov 24, 2022
8 tasks
@mergify mergify bot closed this as completed in #5642 Nov 29, 2022