FlowETL 'count_duplicates' QA check doesn't count all duplicates #2651

Closed
jc-harrison opened this issue Jun 3, 2020 · 0 comments · Fixed by #2652
Labels
bug (Something isn't working), FlowETL

Comments

@jc-harrison
Member

The 'count_duplicates' QA check in FlowETL only counts duplicates for rows where `count(*) - 1 > 1`, i.e. rows that appear at least 3 times. Rows that are duplicated exactly once (2 identical rows) are therefore left out, so the value of count_duplicates can end up lower than the value of count_duplicated.
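
To make the off-by-one concrete, here is a minimal Python sketch of the counting logic. It is not the FlowETL SQL; the function names, the sample rows, and the reading of count_duplicated as "number of distinct rows that occur more than once" are assumptions made for illustration, and the "expected" variant only shows the behaviour the report implies, not necessarily the change made in #2652.

```python
from collections import Counter


def count_duplicated(rows):
    """Distinct rows that occur more than once (assumed meaning of the check)."""
    return sum(1 for n in Counter(rows).values() if n > 1)


def count_duplicates_reported(rows):
    """Extra copies, using the threshold described in the report: count - 1 > 1."""
    return sum(n - 1 for n in Counter(rows).values() if n - 1 > 1)


def count_duplicates_expected(rows):
    """Extra copies for every row that occurs more than once: count - 1 > 0."""
    return sum(n - 1 for n in Counter(rows).values() if n - 1 > 0)


# Hypothetical sample: two rows duplicated once, one row present three times,
# and one unique row.
rows = [("a", 1)] * 2 + [("b", 2)] * 2 + [("c", 3)] * 3 + [("d", 4)]

duplicated = count_duplicated(rows)
reported = count_duplicates_reported(rows)
expected = count_duplicates_expected(rows)

print(duplicated, reported, expected)
```

With this sample, the reported threshold only picks up the row that appears three times, so the duplicates total falls below the number of duplicated rows, which is the inconsistency described above.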

@jc-harrison jc-harrison added bug Something isn't working FlowETL labels Jun 3, 2020
@mergify mergify bot closed this as completed in #2652 Jun 4, 2020