Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DENG-1705 - Add missing client attribution columns to clients daily/first-seen #4505
DENG-1705 - Add missing client attribution columns to clients daily/first-seen #4505
Changes from all commits
16a2ba8
e424604
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the use case for this field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't say, this is used downstream in
clients_first_seen_v2
cc @lucia-vargas-aThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice a different naming convention e.g.
environment.partner.partner_id
is namedpartner_id
, whileenvironment.build.xpcom_abi
is namedenv_build_xpcom_abi
which might cause confusion to users and also differs from the naming in v2. It makes sense to me to keep the naming without the suffix for consistency e.g.xpcom_abi
instead ofenv_build_xpcom_abi
, wdyt?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I matched the convention for the existing
environment.build
columns inclients_daily
which all prefix withenv_build_
see https://github.com/mozilla/bigquery-etl/pull/4505/files#diff-0802d82f91d4f1ab2d91e8d0d1ca4062467a1b723cee0b293eab62248966b949R242-R245. Since we're not likely to change those upstream columnsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, some prefix and some don't e.g. partner_id, is_wow64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are from
environment.partner
andenvironment.system
notenvironment.build
so doesn't seem like there was a previous conventionThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately this leads to an unnecessary complication of needing to update downstream queries when cascading the changes due to different naming in v1 and v2. It'd be so much better if we align the naming between them in this PR rather than updating the schema or expanding the queries later, see e.g. platfform_version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow. The schemas for
clients_first_seen_v2
andclients_first_seen_v1
are already very different. Keeping the convention makes sense for the same reason we wouldn't change the upstreamenv_build_arch
,default_search_engine_data_load_path
, orgeo_subdivision1
to match the downstream column names.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same case of env_build_platform_version about naming convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about naming, which also differs from v2. It might be confusing for users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about naming as above.