Create views for de-duping ETL record tables #5641
Labels: enhancement, FlowDB, FlowETL, FlowMachine
FlowETL appends a record to `etl.etl_records` each time a new day of data is ingested. FlowMachine uses this table to determine which dates are available.

If a day of data is dropped from FlowDB, the corresponding ETL record must be deleted or overwritten; otherwise FlowMachine will still report the data as available. Since `etl.etl_records` has a `timestamp` column, it would be easy enough to append a new row each time the ingest state changes and fetch the most recent record per (cdr_type, cdr_date) to determine the available dates. This could be done with a view.

Similarly, QA results will be written to `etl_post_etl_queries` each time a day is ingested. If a day of data is dropped and re-ingested multiple times, there will be multiple post-ETL query results for that date, and we're generally only interested in the most recent (and only for dates that are currently available). This could also be simplified by a view.
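A rough sketch of what the two views might look like, using an in-memory SQLite database as a stand-in for FlowDB so it can be run anywhere. Only `cdr_type`, `cdr_date` and `timestamp` come from the description above; the `state`, `query_name` and `outcome` columns, the state values `'ingested'`/`'dropped'`, the view names and the sample rows are all assumptions made for illustration. Table names are left unqualified here, whereas in FlowDB they would live in the `etl` schema.

```python
import sqlite3

# In-memory SQLite stand-in for FlowDB (assumption: the real tables are in PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE etl_records (
    cdr_type TEXT, cdr_date TEXT,
    state TEXT,        -- assumed column, e.g. 'ingested' or 'dropped'
    timestamp TEXT     -- ISO 8601 text, so string order matches time order
);
CREATE TABLE etl_post_etl_queries (
    cdr_type TEXT, cdr_date TEXT,
    query_name TEXT,   -- assumed column identifying the QA check
    outcome TEXT,      -- assumed column holding the QA result
    timestamp TEXT
);

-- Most recent ETL record per (cdr_type, cdr_date).
CREATE VIEW latest_etl_records AS
SELECT r.* FROM etl_records AS r
WHERE r.timestamp = (
    SELECT MAX(r2.timestamp) FROM etl_records AS r2
    WHERE r2.cdr_type = r.cdr_type AND r2.cdr_date = r.cdr_date
);

-- Dates whose most recent record says the data is currently ingested.
CREATE VIEW available_dates AS
SELECT cdr_type, cdr_date FROM latest_etl_records WHERE state = 'ingested';

-- Most recent QA result per check, restricted to currently available dates.
CREATE VIEW latest_post_etl_queries AS
SELECT q.* FROM etl_post_etl_queries AS q
JOIN available_dates AS a
  ON a.cdr_type = q.cdr_type AND a.cdr_date = q.cdr_date
WHERE q.timestamp = (
    SELECT MAX(q2.timestamp) FROM etl_post_etl_queries AS q2
    WHERE q2.cdr_type = q.cdr_type AND q2.cdr_date = q.cdr_date
      AND q2.query_name = q.query_name
);
""")

# Hypothetical history: one day ingested, dropped, re-ingested; another ingested then dropped.
conn.executemany("INSERT INTO etl_records VALUES (?, ?, ?, ?)", [
    ("calls", "2016-01-01", "ingested", "2024-01-01T00:00:00"),
    ("calls", "2016-01-01", "dropped",  "2024-01-02T00:00:00"),
    ("calls", "2016-01-01", "ingested", "2024-01-03T00:00:00"),
    ("calls", "2016-01-02", "ingested", "2024-01-01T00:00:00"),
    ("calls", "2016-01-02", "dropped",  "2024-01-04T00:00:00"),
])
conn.executemany("INSERT INTO etl_post_etl_queries VALUES (?, ?, ?, ?, ?)", [
    ("calls", "2016-01-01", "count_duplicates", "10", "2024-01-01T00:05:00"),
    ("calls", "2016-01-01", "count_duplicates", "0",  "2024-01-03T00:05:00"),
    ("calls", "2016-01-02", "count_duplicates", "3",  "2024-01-01T00:05:00"),
])

available = conn.execute(
    "SELECT cdr_type, cdr_date FROM available_dates ORDER BY cdr_date"
).fetchall()
latest_qa = conn.execute(
    "SELECT cdr_date, query_name, outcome FROM latest_post_etl_queries"
).fetchall()
```

With this sample data, `available` contains only the re-ingested day and `latest_qa` contains only its newest QA result. The correlated `MAX(timestamp)` subquery was chosen because it runs on both SQLite and PostgreSQL; in PostgreSQL, `SELECT DISTINCT ON (cdr_type, cdr_date) ... ORDER BY cdr_type, cdr_date, timestamp DESC` would be a more compact alternative. Note that rows sharing an identical latest timestamp would all be returned by this sketch.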