DOC: Add columns to decennial census and ACS #229
docs/source/datasets/index.rst
Outdated
* - Relationship to reference person
  - :code:`relationship_to_reference_person`
  - :code:`relationship_to_reference_person`
  - Possible values for this indicator include:
I noticed that the "relationship to reference person" column for the decennial census is missing some possible values, namely: "Opposite-sex spouse"; "Opposite-sex unmarried partner"; "Same-sex spouse"; "Same-sex unmarried partner"
I did not update these values in this pull request in case we want to make this correction as a hotfix instead. So if this PR gets merged to develop, the documentation for census and ACS will not match until we implement the hotfix and merge it to develop. Alternatively, I could just make the change here, and we could skip making the correction in a hotfix. @aflaxman any preference?
Keep it simple, just make the changes here.
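For illustration only, here is a minimal sketch of how this kind of drift between two documented value lists could be caught with a set difference. The full list is copied from the updated docs text quoted later in this conversation; the "before" state of the decennial census docs is an assumption (the full list minus the four reported values), and `missing_values` is a hypothetical helper, not part of pseudopeople.

```python
# Values the thread reports as missing from the decennial census docs.
REPORTED_MISSING = {
    "Opposite-sex spouse",
    "Opposite-sex unmarried partner",
    "Same-sex spouse",
    "Same-sex unmarried partner",
}

# Full category list as quoted in the updated docs text in this conversation.
FULL_CATEGORIES = {
    "Reference person",
    "Opposite-sex spouse",
    "Opposite-sex unmarried partner",
    "Same-sex spouse",
    "Same-sex unmarried partner",
    "Biological child",
    "Adopted child",
    "Stepchild",
    "Sibling",
    "Parent",
    "Grandchild",
    "Parent-in-law",
    "Child-in-law",
    "Other relative",
    "Roommate",
    "Foster child",
    "Other nonrelative",
    "Institutionalized group quarters population",
    "Noninstitutionalized group quarters population",
}


def missing_values(documented, expected=FULL_CATEGORIES):
    """Return the expected values that the documented list lacks, sorted."""
    return sorted(set(expected) - set(documented))


# ASSUMPTION: the old decennial docs are modeled as the full list minus the
# four reported values; the real "before" text is not shown in this thread.
decennial_before = FULL_CATEGORIES - REPORTED_MISSING

for value in missing_values(decennial_before):
    print(value)
```

A check like this could run against both dataset tables whenever the docs change, so the census and ACS lists cannot silently diverge.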
institutional". The types of noninstitutional group quarters are
"College", "Military", and "Other noninstitutional".
* - Relationship to reference person
  - :code:`relationship_to_reference_person`
As discussed on Slack, our data column in the decennial census is called relation_to_reference_person instead. Should I change this to match, or should we change the data to match what's in the docs?
I'm leaning towards relationship_to_reference_person at this moment.
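If the data is changed to match the docs, the rename itself is small. The sketch below assumes rows are plain dictionaries and uses made-up sample values; `rename_column` is a hypothetical helper, and the real pipeline may store the data differently.

```python
OLD_NAME = "relation_to_reference_person"      # column name in the data, per the thread
NEW_NAME = "relationship_to_reference_person"  # column name used in the docs


def rename_column(rows, old=OLD_NAME, new=NEW_NAME):
    """Return copies of the row dicts with the key `old` renamed to `new`."""
    return [
        {(new if key == old else key): value for key, value in row.items()}
        for row in rows
    ]


# Made-up sample rows for illustration only.
rows = [
    {"age": 34, OLD_NAME: "Reference person"},
    {"age": 7, OLD_NAME: "Biological child"},
]
renamed = rename_column(rows)
```

The original rows are left untouched, so the old name stays available until every consumer has switched over.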
docs/source/datasets/index.rst
Outdated
"Reference person"; "Opposite-sex spouse"; "Opposite-sex unmarried
partner"; "Same-sex spouse"; "Same-sex unmarried partner"; "Biological
child"; "Adopted child"; "Stepchild"; "Sibling"; "Parent"; "Grandchild";
"Parent-in-law"; "Child-in-law"; "Other relative"; "Roommate"; "Foster
child"; "Other nonrelative"; "Institutionalized group quarters
population"; and "Noninstitutionalized group quarters population".
I used the full category names such as "Opposite-sex unmarried partner" and "Noninstitutionalized group quarters population" instead of our abbreviations such as "Opp-sex partner" and "Noninstitutionalized GQ pop". I think that's better -- do other people have a preference?
Agreed. Since the strings are long anyway, we might as well have them unabbreviated.
If we're changing the "GQ pop" ones anyway, did we decide whether we like the word "population"? I feel like "group quarters person" is more natural for this column but don't feel strongly.
I got the names from pp. 39-40 in this reference, with the only change being to remove the gendered language. If we wanted to make other changes I guess we could, but I figured the simplest thing was to use what's there.
Yes, that makes sense. More realistic to how the real data is, too.
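As a rough sketch of the abbreviation change discussed above, a lookup table can expand the short labels to the full category names. Only the two abbreviations named in this thread are included, and their pairing with the full names is inferred from the order in which they were mentioned; `expand_category` is a hypothetical helper.

```python
# Only these two abbreviations are named in the thread; any others would
# have to be added from the actual data.
ABBREVIATION_TO_FULL = {
    "Opp-sex partner": "Opposite-sex unmarried partner",
    "Noninstitutionalized GQ pop": "Noninstitutionalized group quarters population",
}


def expand_category(value):
    """Return the full category name, or the value unchanged if it is not abbreviated."""
    return ABBREVIATION_TO_FULL.get(value, value)


values = ["Opp-sex partner", "Sibling", "Noninstitutionalized GQ pop"]
expanded = [expand_category(v) for v in values]
```

Passing unknown values through unchanged keeps the helper safe to apply to a column that already mixes abbreviated and full names.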
@@ -6,10 +6,10 @@ Datasets

Here we cover the realistic simulated datasets, which are analogous to "real world" administrative records such as tax documents
and routinely generated files of social security numbers, that users can generate using Pseudopeople for developing and testing Entity
-and routinely generated files of social security numbers, that users can generate using Pseudopeople for developing and testing Entity
+and routinely generated files of social security numbers, that users can generate using pseudopeople for developing and testing Entity
I just realized the capitalization of 'pseudopeople' on this page is inconsistent with the rest of the documentation. It should all be lower case, right?
Yes, thanks, pseudopeople should be lower case. I think there are several minor edits like this that we should do to improve the documentation, but I haven't been prioritizing them since we're still working on getting the main content in line with what the engineers are doing. For the sake of limiting the scope of each PR, I think I'm not going to make this change in this PR, but I'll create a JIRA ticket to take a pass at editing the Datasets page. Note that there are a few more edits to make on this page, e.g., at least one more instance of capitalized "Pseudopeople", and "Entity Resolution" should also be lower case.
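For the follow-up editing pass, a small scan could list the remaining capitalized occurrences. This is a sketch, not an existing tool: the term list comes from this thread, the sample sentence is adapted from the diff above (with "Resolution" appended for illustration), and `find_capitalized` is a hypothetical helper.

```python
import re

# Terms the thread says should be lower case, mapped to the preferred form.
PREFERRED = {
    "Pseudopeople": "pseudopeople",
    "Entity Resolution": "entity resolution",
}

# Case-sensitive pattern that matches any of the capitalized terms.
PATTERN = re.compile("|".join(re.escape(term) for term in PREFERRED))


def find_capitalized(text):
    """Return (line_number, term) pairs for each capitalized occurrence."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Sample adapted from the diff; "Resolution" is added for illustration.
sample = (
    "and routinely generated files of social security numbers, that users can "
    "generate using Pseudopeople for developing and testing Entity Resolution"
)

for lineno, term in find_capitalized(sample):
    print(f"line {lineno}: {term!r} -> {PREFERRED[term]!r}")
```

The scan only reports matches and leaves the text unchanged, so any occurrence that should stay capitalized can be reviewed by hand.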
Co-authored-by: Zeb Burke-Conte <[email protected]>
Mic 4255/relationship category name updates

Updates categories of relationship to reference person among household members.

- *Category*: Data
- *JIRA issue*: [MIC-4255](https://jira.ihme.washington.edu/browse/MIC-4255)
- *Research reference*: ihmeuw/pseudopeople#229

Changes and notes
- Updates relationship to reference person category names, so that abbreviated names are no longer abbreviated.

Verification and Testing
Successfully rebuilt data key in artifact. Ran simulation and make_results.
DOC: Add columns to decennial census and ACS
Description
This is intended to be the official documentation for the engineers to implement these changes, as well as being the user-facing documentation.
The diff includes a bunch of whitespace deletions from my editor, so you may want to hide those.
Testing
Built docs locally