Mock qa data #6465
Merged
37 commits
0ca11ad Initial automatic running of all ingestion tests (Thingus)
537296c Tidying up qa_check script (Thingus)
fd2fef9 Moving qa check script to testdata/bin (Thingus)
2e248a6 Argparse in run_qa_checks (Thingus)
210fc5f Adding to ingestion scripts (Thingus)
b1b3c46 Adding QA check to circleci to check build (Thingus)
ffbccc5 Update flowdb/testdata/bin/run_qa_checks.py (Thingus)
207895c Merge branch 'master' into mock_qa_data (Thingus)
774b414 Comments from review (Thingus)
2b9e843 Relocking pipfile (Thingus)
26a53e2 CHANGELOG.md (Thingus)
8a81141 Shifting jinja to test pipenv + relocking (Thingus)
7d549fd Comments from PR (wip) (Thingus)
cef5b4d Get default events tables and dates from flowdb (Thingus)
afc5d7d Merge branch 'master' into mock_qa_data (Thingus)
a23b9f0 Moving run_qa_checks to bin + adding to dockerignoreignore (Thingus)
289ce8b Merge branch 'mock_qa_data' of https://github.com/Flowminder/FlowKit … (Thingus)
bf1d1e2 Update flowdb/testdata/bin/run_qa_checks.py (Thingus)
91b78ae Jinja2 and relocking for synthdata (Thingus)
2cf85a5 Merge branch 'mock_qa_data' of https://github.com/Flowminder/FlowKit … (Thingus)
46eb93e Adding flowetl qa templates to flowdb test data containers (Thingus)
470335d Maybe its those commas causing trouble (Thingus)
ae01aaf Now using pop and pushdir in test_data scripts (Thingus)
6f074b5 Some container messing (Thingus)
8349130 pushd not pushdir (Thingus)
9f4b8fa Pipenv messing (Thingus)
09cac3c Ah, the pipfile is in the root (Thingus)
c10739a Lets try an explicity path (Thingus)
57ca678 run_qa_checks now runs on local container (Thingus)
0b265e1 More prints + popenv run (Thingus)
4e837a9 cd instead of push/popd (Thingus)
8bd4223 Minor fixes (Thingus)
2b7c409 Explcit copy of run_qa_checks.py for synth data (greenape)
9cf2611 Adding run_qa_checks.py to synth data image (Thingus)
15d5928 Merge branch 'mock_qa_data' of https://github.com/Flowminder/FlowKit … (Thingus)
a346ddd Removing extra COPY command (Thingus)
720b4a9 Merge branch 'master' into mock_qa_data (Thingus)
@@ -0,0 +1,138 @@
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.

from dataclasses import asdict, dataclass
from datetime import date, datetime
from itertools import product
from pathlib import Path
from typing import List
from jinja2 import Environment, FileSystemLoader, Template
from sqlalchemy import create_engine, text
from sqlalchemy.engine import Engine
import os
import argparse


update_template_string = """
INSERT INTO etl.post_etl_queries
(cdr_date, cdr_type, type_of_query_or_check, outcome, optional_comment_or_description, timestamp)
VALUES(
'{{cdr_date}}',
'{{cdr_type}}',
'{{type_of_query_or_check}}',
({{outcome_query}}),
'{{optional_comment_or_description}}',
'{{timestamp}}'
)

"""


@dataclass
class QaTemplate:
    display_name: str
    template: Template
    event_type: str


@dataclass
class QaRow:
    cdr_date: date
    cdr_type: str
    type_of_query_or_check: str
    outcome_query: str
    optional_comment_or_description: str
    timestamp: datetime


@dataclass
class MockQaScenario:
    dates: List[date]
    tables: List[str]


def render_qa_check(template: Template, date: date, cdr_type: str) -> str:
    return template.render(
        final_table=f"events.{cdr_type}_{date.strftime('%Y%m%d')}",
        cdr_type=cdr_type,
        ds=date.strftime("%Y-%m-%d"),
    )


def get_available_tables(engine: Engine):
    with engine.begin() as conn:
        tables = conn.execute(
            text("SELECT table_name FROM available_tables WHERE has_subscribers")
        )
        return [t[0] for t in tables.all()]  # read rows while the connection is open


def get_available_dates(engine: Engine):
    with engine.begin() as conn:
        dates = conn.execute(text("SELECT cdr_date FROM etl.available_dates"))
        return [d[0] for d in dates.all()]  # read rows while the connection is open


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Runs all flowetl checks for ingested data"
    )
    parser.add_argument("template_path", help="Path to the QA templates")
    parser.add_argument(
        "--dates",
        type=lambda s: datetime.strptime(s, "%Y-%m-%d").date(),  # datetime is the imported class, not the module
        help="Date to run ingestion check on. Can be specified multiple times.",
        nargs="*",
    )
    parser.add_argument(
        "--event_types", help="Event tables to run qa checks on.", nargs="*"
    )
    args = parser.parse_args()
    env = Environment(loader=FileSystemLoader(args.template_path))
    print(f"Loaded {len(env.list_templates())} templates")
    update_template = env.from_string(update_template_string)
    db_user = os.environ["POSTGRES_USER"]
    conn_str = f"postgresql://{db_user}@/flowdb"
    engine = create_engine(conn_str)
    print(f"Connecting to flowdb on {conn_str}.")

    dates = get_available_dates(engine) if not args.dates else args.dates

    event_types = (
        get_available_tables(engine) if not args.event_types else args.event_types
    )

    qa_scn = MockQaScenario(dates=dates, tables=event_types)

    templates = (
        QaTemplate(
            Path(t).name,
            env.get_template(t),
            str(Path(t).parent) if Path(t).parent != Path(".") else "any",  # str, so it can equal cdr_type below
        )
        for t in env.list_templates(".sql")
    )

    qa_rows = (
        QaRow(
            date,
            cdr_type,
            template.display_name,
            render_qa_check(template.template, date, cdr_type),
            "Made from mock data",
            datetime.now(),
        )
        for date, cdr_type, template in product(qa_scn.dates, qa_scn.tables, templates)
        if template.event_type in [cdr_type, "any"]
    )

    with engine.begin() as conn:
        for row in qa_rows:
            print(
                f"Running {row.type_of_query_or_check} for cdr type {row.cdr_type} date {row.cdr_date}"
            )
            conn.execute(text(update_template.render(**asdict(row))))

        out = conn.execute(text("SELECT * FROM etl.post_etl_queries LIMIT 10"))
        print(out.fetchall())
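For reviewers who want to check the selection logic without a running flowdb, here is a minimal standard-library sketch of how the script pairs dates, event tables and templates. It is an illustration under assumptions, not the script itself: `CheckTemplate`, `event_type_for`, `parse_day`, `expand` and the two template names are hypothetical, and no Jinja rendering or database access takes place. It relies on two details: a path object never compares equal to a plain string, so the parent directory is converted with `str()` before being matched against the table name, and command-line dates are parsed with `datetime.strptime` on the imported class.

```python
from dataclasses import dataclass
from datetime import date, datetime
from itertools import product
from pathlib import PurePosixPath
from typing import List, Tuple


@dataclass
class CheckTemplate:
    display_name: str
    event_type: str  # an event table name such as "calls", or "any"


def event_type_for(template_name: str) -> str:
    """Derive the event type from the template's parent directory."""
    parent = PurePosixPath(template_name).parent
    # Convert to str: a path object never compares equal to a plain string.
    return "any" if parent == PurePosixPath(".") else str(parent)


def parse_day(value: str) -> date:
    """Parse a YYYY-MM-DD command-line value into a date."""
    return datetime.strptime(value, "%Y-%m-%d").date()


def expand(
    dates: List[date], tables: List[str], names: List[str]
) -> List[Tuple[date, str, str]]:
    """Pair every date and table with each template that applies to it."""
    templates = [CheckTemplate(PurePosixPath(n).name, event_type_for(n)) for n in names]
    return [
        (day, table, template.display_name)
        for day, table, template in product(dates, tables, templates)
        if template.event_type in (table, "any")
    ]


# Hypothetical template names, for illustration only.
names = ["count_added_rows.sql", "calls/count_null_durations.sql"]
rows = expand([parse_day("2016-01-01")], ["calls", "sms"], names)
for row in rows:
    print(row)
```

Templates at the top level of the template directory apply to every event table, while templates in a subdirectory apply only to the table named after that subdirectory.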
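The insert template places each rendered check inside parentheses, so the check runs as a scalar subquery and its single value becomes the `outcome` column. The sketch below shows that mechanism in isolation. It is only an approximation under stated assumptions: it uses an in-memory SQLite database instead of flowdb's PostgreSQL, the tables `calls_20160101` and `post_etl_queries` (no `etl` schema) and the check name are hypothetical, and the scalar values are bound as parameters here, whereas the script renders them into the statement with Jinja.

```python
import sqlite3
from datetime import date, datetime

conn = sqlite3.connect(":memory:")

# Stand-in for one day's events table (hypothetical name and columns).
conn.execute("CREATE TABLE calls_20160101 (msisdn TEXT, duration REAL)")
conn.executemany(
    "INSERT INTO calls_20160101 VALUES (?, ?)",
    [("a", 10.0), ("b", None), ("c", 32.5)],
)

# Stand-in for the QA results table, with the columns used by the insert template.
conn.execute(
    """
    CREATE TABLE post_etl_queries (
        cdr_date TEXT,
        cdr_type TEXT,
        type_of_query_or_check TEXT,
        outcome TEXT,
        optional_comment_or_description TEXT,
        timestamp TEXT
    )
    """
)

# A rendered check: a query that returns exactly one row with one column.
outcome_query = "SELECT count(*) FROM calls_20160101 WHERE duration IS NULL"

# The check is spliced in as a parenthesised scalar subquery;
# the remaining values are bound as parameters in this sketch.
insert_sql = f"""
    INSERT INTO post_etl_queries
    (cdr_date, cdr_type, type_of_query_or_check, outcome,
     optional_comment_or_description, timestamp)
    VALUES (?, ?, ?, ({outcome_query}), ?, ?)
"""
conn.execute(
    insert_sql,
    (
        date(2016, 1, 1).isoformat(),
        "calls",
        "count_null_durations.sql",
        "Made from mock data",
        datetime.now().isoformat(),
    ),
)
conn.commit()

stored = conn.execute(
    "SELECT cdr_type, type_of_query_or_check, outcome FROM post_etl_queries"
).fetchall()
print(stored)
```

A check that returns more than one column would fail at insert time, so every template used this way needs to produce a single value.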
Hm. I kind of feel like this should default to on really.