Add auxiliary table to record results of regular queries (e.g. run as part of daily ingestion) #988

Closed
maxalbert opened this issue Jun 27, 2019 · 0 comments · Fixed by #993
Labels
enhancement (New feature or request), FlowDB (Issues related to FlowDB)

maxalbert commented Jun 27, 2019

As part of the daily ingestion process, one typically wants to run some initial quality checks or calculate informative metrics. Simple examples: How many call events happened on that date? Are there any cell_id values in this date's CDR data that are not present in the infrastructure.cells table (and if so, how many)?
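
To make the two example checks concrete, here is a minimal sketch. It is an illustration only: FlowDB is PostgreSQL, but an in-memory SQLite database stands in here so the snippet runs on its own. The `calls` table, its `event_date` and `cell_id` columns, the `id` column of `infrastructure.cells`, and both helper functions are hypothetical simplifications, not the actual FlowDB schema.

```python
import sqlite3

# Illustration only: an in-memory SQLite database stands in for FlowDB (PostgreSQL).
conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so the name infrastructure.cells resolves.
conn.execute("ATTACH DATABASE ':memory:' AS infrastructure")

# Hypothetical, heavily simplified stand-ins for the CDR and cell tables.
conn.execute("CREATE TABLE calls (event_date TEXT, cell_id TEXT)")
conn.execute("CREATE TABLE infrastructure.cells (id TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO infrastructure.cells VALUES (?)", [("A",), ("B",)])
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [
        ("2019-01-01", "A"),
        ("2019-01-01", "B"),
        ("2019-01-01", "X"),  # "X" is not a known cell
        ("2019-01-01", "X"),
        ("2019-01-02", "Y"),  # "Y" is not a known cell
    ],
)


def num_total_calls(conn, cdr_date):
    """Count all call events recorded for the given date."""
    query = "SELECT count(*) FROM calls WHERE event_date = ?"
    return conn.execute(query, (cdr_date,)).fetchone()[0]


def num_missing_cell_ids(conn, cdr_date):
    """Count distinct cell_id values for the date that have no match in infrastructure.cells."""
    query = """
        SELECT count(DISTINCT c.cell_id)
        FROM calls AS c
        LEFT JOIN infrastructure.cells AS ic ON ic.id = c.cell_id
        WHERE c.event_date = ? AND ic.id IS NULL
    """
    return conn.execute(query, (cdr_date,)).fetchone()[0]


total = num_total_calls(conn, "2019-01-01")
missing = num_missing_cell_ids(conn, "2019-01-01")
```

The anti-join (`LEFT JOIN ... WHERE ic.id IS NULL`) counts each unknown cell once, no matter how many events reference it. Whether the check should count distinct cells or affected events is a design choice the issue leaves open.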

In the longer term, this will be supported by a set of supplementary tables (to be added as part of #640) that record the results of standard queries (or standard pre-aggregations) run as part of the ingestion process.

In the short term, it would be useful to have a single auxiliary table where the results of semi-ad-hoc queries can be recorded in a less structured, more informal way that still allows results to be queried quickly. Suggested structure for this table:

|   cdr_date | cdr_type | type_of_query_or_check |  outcome | optional_comment_or_description |
|------------+----------+------------------------+----------+---------------------------------|
| 2019-01-01 | calls    | num_total_calls        | 42000000 | NULL                            |
| 2019-01-01 | calls    | num_missing_cell_ids   |     4242 | needs investigation             |

Such a table should be added to flowdb, e.g. as etl.daily_queries (any suggestions for a better name?).
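
A rough sketch of how such a table could be created, filled, and queried, using the column names from the suggestion above. As an illustration, an in-memory SQLite database stands in for FlowDB (which is PostgreSQL). The column types, the `NOT NULL` constraints, and the helper functions `record_outcome` and `outcomes_for_date` are assumptions, not part of the proposal.

```python
import sqlite3

# Illustration only: an in-memory SQLite database stands in for FlowDB (PostgreSQL).
conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so the name etl.daily_queries resolves.
conn.execute("ATTACH DATABASE ':memory:' AS etl")

# Column names follow the suggested structure; types and constraints are assumptions.
conn.execute(
    """
    CREATE TABLE etl.daily_queries (
        cdr_date                        TEXT NOT NULL,
        cdr_type                        TEXT NOT NULL,
        type_of_query_or_check          TEXT NOT NULL,
        outcome                         INTEGER,
        optional_comment_or_description TEXT
    )
    """
)


def record_outcome(conn, cdr_date, cdr_type, check, outcome, comment=None):
    """Hypothetical helper: store the result of one query or check."""
    conn.execute(
        "INSERT INTO etl.daily_queries VALUES (?, ?, ?, ?, ?)",
        (cdr_date, cdr_type, check, outcome, comment),
    )


def outcomes_for_date(conn, cdr_date):
    """Hypothetical helper: map each recorded check to its outcome for one date."""
    rows = conn.execute(
        "SELECT type_of_query_or_check, outcome FROM etl.daily_queries WHERE cdr_date = ?",
        (cdr_date,),
    )
    return dict(rows.fetchall())


# The two example rows from the suggested table.
record_outcome(conn, "2019-01-01", "calls", "num_total_calls", 42000000)
record_outcome(conn, "2019-01-01", "calls", "num_missing_cell_ids", 4242, "needs investigation")

results = outcomes_for_date(conn, "2019-01-01")
```

An integer `outcome` matches the two example rows. If checks with non-numeric results are expected, a text or JSON column may be a better fit.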

maxalbert added the enhancement and FlowDB labels Jun 27, 2019
maxalbert pushed a commit that referenced this issue Jun 27, 2019
…that are run as part of the regular ETL process (#988).