From 17c60d97c4a5a84f70ada221a51e6e300608a20b Mon Sep 17 00:00:00 2001 From: Nathaniel Blair-Stahn Date: Wed, 19 Jul 2023 14:21:18 -0700 Subject: [PATCH 01/11] let Atom remove whitespace --- docs/source/datasets/index.rst | 292 ++++++++++++++++----------------- 1 file changed, 146 insertions(+), 146 deletions(-) diff --git a/docs/source/datasets/index.rst b/docs/source/datasets/index.rst index 0659cf2d..a7b5464c 100644 --- a/docs/source/datasets/index.rst +++ b/docs/source/datasets/index.rst @@ -6,10 +6,10 @@ Datasets Here we cover the realistic simulated datasets, which are analogous to "real world" administrative records such as tax documents and routinely generated files of social security numbers, that users can generate using Pseudopeople for developing and testing Entity -Resolution algorithms and software. +Resolution algorithms and software. -Each of the datasets that can be generated using Pseudopeople have "noise" added to them, thereby realistically -simulating how administrative records can be corrupted or distorted, which creates challenges in linking those +Each of the datasets that can be generated using Pseudopeople have "noise" added to them, thereby realistically +simulating how administrative records can be corrupted or distorted, which creates challenges in linking those records. To read more about the different kinds of noise that can be applied to the different datasets, please see the :ref:`Noise page `. @@ -30,7 +30,7 @@ US Decennial Census ------------------- The Decennial Census dataset is a simulated enumeration of the US Census Bureau's Decennial Census of Population and Housing. To find out more about the Decennial Census, please visit the Decennial Census -`homepage `_. +`homepage `_. It is only possible to generate Decennial Census data for decennial years -- 2020, 2030, and 2040. @@ -43,60 +43,60 @@ The following columns are included in this dataset: * - Attribute Name - Column Name - - Notes + - Notes * - Unique simulant ID - :code:`simulant_id` - Not affected by noise functions; intended use is "ground truth" for testing and validation. * - Unique household ID - :code:`household_id` - Not affected by noise functions; intended use is "ground truth" for testing and validation; consistent across all - datasets. + datasets. * - First name - :code:`first_name` - - + - * - Middle initial - :code:`middle_initial` - - + - * - Last name - :code:`last_name` - - + - * - Age - - :code:`age` - - Rounded down to an integer. + - :code:`age` + - Rounded down to an integer. * - Date of birth - :code:`date_of_birth` - Formatted as YYYY-MM-DD. * - Physical address street number - :code:`street_number` - - + - * - Physical address street name - :code:`street_name` - - + - * - Physical address unit number - :code:`unit_number` - - + - * - Physical address city - - :code:`city` - - + - :code:`city` + - * - Physical address state - - :code:`state` - - + - :code:`state` + - * - Physical address ZIP code - :code:`zipcode` - - + - * - Relationship to reference person - - :code:`relationship_to_reference_person` + - :code:`relationship_to_reference_person` - Possible values for this indicator include: Reference person; Biological child; Adopted child; Stepchild; Sibling; Parent; Grandchild; Parent-in-law; Child-in-law; Other relative; Roommate; Foster child; Other nonrelative; Noninstitutionalized GQ pop; and Institutionalized GQ pop. - * - Sex - - :code:`sex` + * - Sex + - :code:`sex` - Binary; "male" or "female". * - Race/ethnicity - - :code:`race_ethnicity` + - :code:`race_ethnicity` - The exhaustive and mutually exclusive categories for the single composite "race/ethnicity" indicator are as follows: White; Black; Latino; American Indian and Alaskan Native (AIAN); Asian; Native Hawaiian and Other Pacific Islander (NHOPI); and - Multiracial or Some Other Race. + Multiracial or Some Other Race. American Community Survey (ACS) ------------------------------- @@ -121,59 +121,59 @@ The following columns are included in this dataset: - Notes * - Unique simulant ID - :code:`simulant_id` - - Not affected by noise functions; intended use is "ground truth" for testing and validation. + - Not affected by noise functions; intended use is "ground truth" for testing and validation. * - Unique household ID - :code:`household_id` - Not affected by noise functions; intended use is "ground truth" for testing and validation; consistent across all datasets. * - First name - :code:`first_name` - - + - * - Middle initial - :code:`middle_initial` - - + - * - Last name - :code:`last_name` - - + - * - Age - - :code:`age` + - :code:`age` - Rounded down to an integer. * - Date of birth - :code:`date_of_birth` - Formatted as YYYY-MM-DD. * - Physical address street number - :code:`street_number` - - + - * - Physical address street name - :code:`street_name` - - + - * - Physical address unit number - :code:`unit_number` - - + - * - Physical address city - - :code:`city` - - + - :code:`city` + - * - Physical address state - - :code:`state` - - + - :code:`state` + - * - Physical address ZIP code - :code:`zipcode` - - - * - Sex - - :code:`sex` + - + * - Sex + - :code:`sex` - Binary; "male" or "female" * - Race/ethnicity - - :code:`race_ethnicity` + - :code:`race_ethnicity` - The exhaustive and mutually exclusive categories for the single composite "race/ethnicity" indicator are as follows: White; Black; Latino; American Indian and Alaskan Native (AIAN); Asian; Native Hawaiian and Other Pacific Islander (NHOPI); and - Multiracial or Some Other Race. + Multiracial or Some Other Race. Current Population Survey (CPS) ------------------------------- -CPS is another household survey that can be simulated using Pseudopeople. CPS is conducted jointly by the US Census Bureau and the US -Bureau of Labor Statistics. CPS collects labor force data, such as annual work activity and income, veteran status, school enrollment, -contingent employment, worker displacement, job tenure, and more. To find out more about CPS, please visit the -`CPS homepage `_. +CPS is another household survey that can be simulated using Pseudopeople. CPS is conducted jointly by the US Census Bureau and the US +Bureau of Labor Statistics. CPS collects labor force data, such as annual work activity and income, veteran status, school enrollment, +contingent employment, worker displacement, job tenure, and more. To find out more about CPS, please visit the +`CPS homepage `_. pseudopeople can generate CPS data for a user-specified year, which will include records from simulated surveys conducted @@ -191,62 +191,62 @@ The following columns are included in this dataset: - Notes * - Unique simulant ID - :code:`simulant_id` - - Not affected by noise functions; intended use is "ground truth" for testing and validation. + - Not affected by noise functions; intended use is "ground truth" for testing and validation. * - Unique household ID - :code:`household_id` - Not affected by noise functions; intended use is "ground truth" for testing and validation; consistent across all datasets. * - First name - :code:`first_name` - - + - * - Middle initial - :code:`middle_initial` - - + - * - Last name - :code:`last_name` - - + - * - Age - - :code:`age` + - :code:`age` - Rounded down to an integer. * - Date of birth - :code:`date_of_birth` - Formatted as YYYY-MM-DD. * - Physical address street number - :code:`street_number` - - + - * - Physical address street name - :code:`street_name` - - + - * - Physical address unit number - :code:`unit_number` - - + - * - Physical address city - - :code:`city` - - + - :code:`city` + - * - Physical address state - - :code:`state` - - + - :code:`state` + - * - Physical address ZIP code - :code:`zipcode` - - - * - Sex - - :code:`sex` + - + * - Sex + - :code:`sex` - Binary; "male" or "female" * - Race/ethnicity - - :code:`race_ethnicity` + - :code:`race_ethnicity` - The exhaustive and mutually exclusive categories for the single composite "race/ethnicity" indicator are as follows: White; Black; Latino; American Indian and Alaskan Native (AIAN); Asian; Native Hawaiian and Other Pacific Islander (NHOPI); and - Multiracial or Some Other Race. + Multiracial or Some Other Race. Women, Infants, and Children (WIC) ---------------------------------- The Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) is a government benefits program designed to support mothers and young -children. The main qualifications are income and the presence of young children in the home. To find out more about this service, please visit the `WIC +children. The main qualifications are income and the presence of young children in the home. To find out more about this service, please visit the `WIC homepage `_. -pseudopeople can generate a simulated version of the administrative data that would be recorded by WIC. This is a yearly file of information about all +pseudopeople can generate a simulated version of the administrative data that would be recorded by WIC. This is a yearly file of information about all simulants enrolled in the program as of the end of that year. For the final year available, 2041, the file includes those enrolled as of May 1st, because this is the end of our simulated timespan. @@ -262,52 +262,52 @@ The following columns are included in this dataset: - Notes * - Unique simulant ID - :code:`simulant_id` - - Not affected by noise functions; intended use is "ground truth" for testing and validation. + - Not affected by noise functions; intended use is "ground truth" for testing and validation. * - Unique household ID - :code:`household_id` - Not affected by noise functions; intended use is "ground truth" for testing and validation; consistent across all datasets. * - First name - :code:`first_name` - - + - * - Middle initial - :code:`middle_initial` - - + - * - Last name - :code:`last_name` - - + - * - Age - - :code:`age` + - :code:`age` - Rounded down to an integer. * - Date of birth - :code:`date_of_birth` - Formatted as MMDDYYYY. * - Physical address street number - :code:`street_number` - - + - * - Physical address street name - :code:`street_name` - - + - * - Physical address unit number - :code:`unit_number` - - + - * - Physical address city - - :code:`city` - - + - :code:`city` + - * - Physical address state - - :code:`state` - - + - :code:`state` + - * - Physical address ZIP code - :code:`zipcode` - - - * - Sex - - :code:`sex` + - + * - Sex + - :code:`sex` - Binary; "male" or "female" * - Race/ethnicity - - :code:`race_ethnicity` + - :code:`race_ethnicity` - The exhaustive and mutually exclusive categories for the single composite "race/ethnicity" indicator are as follows: White; Black; Latino; American Indian and Alaskan Native (AIAN); Asian; Native Hawaiian and Other Pacific Islander (NHOPI); and - Multiracial or Some Other Race. + Multiracial or Some Other Race. Social Security Administration @@ -336,18 +336,18 @@ The following columns are included in this dataset: - Notes * - Unique simulant ID - :code:`simulant_id` - - Not affected by noise functions; intended use is "ground truth" for PRL tracking. + - Not affected by noise functions; intended use is "ground truth" for PRL tracking. * - First name - :code:`first_name` - - + - * - Middle initial - :code:`middle_initial` - - + - * - Last name - :code:`last_name` - - + - * - Age - - :code:`age` + - :code:`age` - Rounded down to an integer. * - Date of birth - :code:`date_of_birth` @@ -358,15 +358,15 @@ The following columns are included in this dataset: However, it can be :ref:`configured ` to have noise if desired. * - Date of event - :code:`event_date` - - Formatted as YYYYMMDD. + - Formatted as YYYYMMDD. * - Type of event - :code:`event_type` - - Possible values are "Creation" and "Death". + - Possible values are "Creation" and "Death". Tax forms: W-2 & 1099 --------------------- -Administrative data reported in annual tax forms, such as W-2s and 1099s, can also be simulated by Pseudopeople. 1099 forms are used for independent +Administrative data reported in annual tax forms, such as W-2s and 1099s, can also be simulated by Pseudopeople. 1099 forms are used for independent contractors or self-employed individuals, while a W-2 form is used for employees (whose employer withholds payroll taxes from their earnings). pseudopeople can generate a simulated version of the data collected by W-2 and 1099 forms. @@ -390,65 +390,65 @@ The following columns are included in these datasets: * - Unique household ID - :code:`household_id` - Not affected by noise functions; intended use is "ground truth" for testing and validation; consistent across all - datasets. + datasets. * - First name - :code:`first_name` - - + - * - Middle initial - :code:`middle_initial` - - + - * - Last name - :code:`last_name` - - + - * - Mailing address street number - :code:`mailing_address_street_number` - - + - * - Mailing address street name - :code:`mailing_address_street_name` - - + - * - Mailing address unit number - :code:`mailing_address_unit_number` - - + - * - Mailing address city - - :code:`mailing_address_city` - - + - :code:`mailing_address_city` + - * - Mailing address state - - :code:`mailing_address_state` - - + - :code:`mailing_address_state` + - * - Mailing address ZIP code - :code:`mailing_address_zipcode` - - - * - Social security number + - + * - Social security number - :code:`ssn` - - - * - Income + - + * - Income - :code:`income` - - - * - Employer ID + - + * - Employer ID - :code:`employer_id` - - - * - Employer Name + - + * - Employer Name - :code:`employer_name` - - + - * - Employer street number - :code:`employer_street_number` - - + - * - Employer street name - :code:`employer_street_name` - - + - * - Employer unit number - :code:`employer_unit_number` - - + - * - Employer city - - :code:`employer_city` - - + - :code:`employer_city` + - * - Employer state - - :code:`employer_state` - - + - :code:`employer_state` + - * - Employer ZIP code - :code:`employer_zipcode` - - - * - Type of tax form + - + * - Type of tax form - :code:`tax_form` - Possible values are "W2" or "1099". @@ -476,85 +476,85 @@ The following columns are included in this dataset: * - Unique household ID - :code:`household_id` - Not affected by noise functions; intended use is "ground truth" for testing and validation; consistent across all - datasets. + datasets. * - First name - :code:`first_name` - - + - * - Middle initial - :code:`middle_initial` - - + - * - Last name - :code:`last_name` - - + - * - Mailing address street number - :code:`mailing_address_street_number` - - + - * - Mailing address street name - :code:`mailing_address_street_name` - - + - * - Mailing address unit number - :code:`mailing_address_unit_number` - - + - * - Mailing address PO box - :code:`mailing_address_po_box` - - + - * - Mailing address city - - :code:`mailing_address_city` - - + - :code:`mailing_address_city` + - * - Mailing address state - - :code:`mailing_address_state` - - + - :code:`mailing_address_state` + - * - Mailing address ZIP code - :code:`mailing_address_zipcode` - - + - * - Social Security Number (SSN) - :code:`ssn` - Individual Taxpayer Identification Number (ITIN) if no SSN * - Joint filer first name - :code:`spouse_first_name` - - + - * - Joint filer middle initial - :code:`spouse_middle_initial` - - + - * - Joint filer last name - :code:`spouse_last_name` - - + - * - Joint filer social security number - :code:`spouse_ssn` - Individual Taxpayer Identification Number (ITIN) if no SSN * - Dependent 1 first name - :code:`dependent_1_first_name` - - + - * - Dependent 1 last name - :code:`dependent_1_last_name` - - + - * - Dependent 1 Social Security Number (SSN) - :code:`dependent_1_ssn` - - Individual Taxpayer Identification Number (ITIN) if no SSN + - Individual Taxpayer Identification Number (ITIN) if no SSN * - Dependent 2 first name - :code:`dependent_2_first_name` - - + - * - Dependent 2 last name - :code:`dependent_2_last_name` - - + - * - Dependent 2 social security number - :code:`dependent_2_ssn` - - Individual Taxpayer Identification Number (ITIN) if no SSN + - Individual Taxpayer Identification Number (ITIN) if no SSN * - Dependent 3 first name - :code:`dependent_3_first_name` - - + - * - Dependent 3 last name - :code:`dependent_3_last_name` - - + - * - Dependent 3 social security number - :code:`dependent_3_ssn` - - Individual Taxpayer Identification Number (ITIN) if no SSN + - Individual Taxpayer Identification Number (ITIN) if no SSN * - Dependent 4 first name - :code:`dependent_4_first_name` - - + - * - Dependent 4 last name - :code:`dependent_4_last_name` - - + - * - Dependent 4 social security number - :code:`dependent_4_ssn` - - Individual Taxpayer Identification Number (ITIN) if no SSN \ No newline at end of file + - Individual Taxpayer Identification Number (ITIN) if no SSN From fb1e2372f5bc05116f7fa54bfa3a362b5a274914 Mon Sep 17 00:00:00 2001 From: Nathaniel Blair-Stahn Date: Wed, 19 Jul 2023 14:33:22 -0700 Subject: [PATCH 02/11] add housing type to decennial census --- docs/source/datasets/index.rst | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/docs/source/datasets/index.rst b/docs/source/datasets/index.rst index a7b5464c..af0781c7 100644 --- a/docs/source/datasets/index.rst +++ b/docs/source/datasets/index.rst @@ -84,6 +84,13 @@ The following columns are included in this dataset: * - Physical address ZIP code - :code:`zipcode` - + * - Housing type + - :code:`housing_type` + - Possible values for housing type are "Household" for an individual + household, or one of six different types of group quarters. The types of + instiutional group quarters are "Carceral", "Nursing home", and "Other + institutional". The types of non-institutional group quarters are + "College", "Military", and "Other non-institutional". * - Relationship to reference person - :code:`relationship_to_reference_person` - Possible values for this indicator include: From a5ec535b18ccc8e6b5003ccd09da7a04504306fa Mon Sep 17 00:00:00 2001 From: Nathaniel Blair-Stahn Date: Wed, 19 Jul 2023 14:57:53 -0700 Subject: [PATCH 03/11] add housing type and relation to reference person to ACS --- docs/source/datasets/index.rst | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/docs/source/datasets/index.rst b/docs/source/datasets/index.rst index af0781c7..5b5a3902 100644 --- a/docs/source/datasets/index.rst +++ b/docs/source/datasets/index.rst @@ -166,6 +166,22 @@ The following columns are included in this dataset: * - Physical address ZIP code - :code:`zipcode` - + * - Housing type + - :code:`housing_type` + - Possible values for housing type are "Household" for an individual + household, or one of six different types of group quarters. The types of + instiutional group quarters are "Carceral", "Nursing home", and "Other + institutional". The types of non-institutional group quarters are + "College", "Military", and "Other non-institutional". + * - Relationship to reference person + - :code:`relationship_to_reference_person` + - Possible values for this indicator include: + "Reference person"; "Opposite-sex spouse"; "Opposite-sex unmarried + partner"; "Same-sex spouse"; "Same-sex unmarried partner"; "Biological + child"; "Adopted child"; "Stepchild"; "Sibling"; "Parent"; "Grandchild"; + "Parent-in-law"; "Child-in-law"; "Other relative"; "Roommate"; "Foster + child"; "Other nonrelative"; "Institutionalized group quarters + population"; and "Noninstitutionalized group quarters population". * - Sex - :code:`sex` - Binary; "male" or "female" From 0be77368504e910757033778584e29527bf918ac Mon Sep 17 00:00:00 2001 From: Nathaniel Blair-Stahn Date: Wed, 19 Jul 2023 14:59:58 -0700 Subject: [PATCH 04/11] remove hyphens from noninstitutional --- docs/source/datasets/index.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/source/datasets/index.rst b/docs/source/datasets/index.rst index 5b5a3902..a183b22d 100644 --- a/docs/source/datasets/index.rst +++ b/docs/source/datasets/index.rst @@ -89,8 +89,8 @@ The following columns are included in this dataset: - Possible values for housing type are "Household" for an individual household, or one of six different types of group quarters. The types of instiutional group quarters are "Carceral", "Nursing home", and "Other - institutional". The types of non-institutional group quarters are - "College", "Military", and "Other non-institutional". + institutional". The types of noninstitutional group quarters are + "College", "Military", and "Other noninstitutional". * - Relationship to reference person - :code:`relationship_to_reference_person` - Possible values for this indicator include: @@ -171,8 +171,8 @@ The following columns are included in this dataset: - Possible values for housing type are "Household" for an individual household, or one of six different types of group quarters. The types of instiutional group quarters are "Carceral", "Nursing home", and "Other - institutional". The types of non-institutional group quarters are - "College", "Military", and "Other non-institutional". + institutional". The types of noninstitutional group quarters are + "College", "Military", and "Other noninstitutional". * - Relationship to reference person - :code:`relationship_to_reference_person` - Possible values for this indicator include: From c7cb9e31020e70fed911c9b4690edc5d894b3303 Mon Sep 17 00:00:00 2001 From: Nathaniel Blair-Stahn Date: Wed, 19 Jul 2023 15:09:02 -0700 Subject: [PATCH 05/11] let Atom remove whitespace --- docs/source/noise/index.rst | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/source/noise/index.rst b/docs/source/noise/index.rst index d7910bb8..83a2de9d 100644 --- a/docs/source/noise/index.rst +++ b/docs/source/noise/index.rst @@ -9,7 +9,7 @@ to add noise to the simulated data. "Noise" refers to various types of errors introduced into the data and may also be called "corruption" or "distortion." By default, pseudopeople applies noise to the simulated datasets using some reasonable settings. If desired, the user can change the noise settings through -the configuration system---see the :ref:`Configuration section ` +the configuration system---see the :ref:`Configuration section ` for details. .. contents:: @@ -46,18 +46,18 @@ Noise types are applied in the order they are listed here. The "Config key" column shows the name of the noise type in the :ref:`configuration system `. .. list-table:: Types of row-based noise (``row_noise``) - :widths: 1 2 5 + :widths: 1 2 5 :header-rows: 1 * - Noise type - - Config key + - Config key - Example cause * - Omit a row - ``omit_row`` - Neglecting to file a tax form on time .. list-table:: Types of column-based noise (``column_noise``) - :widths: 1 2 5 + :widths: 1 2 5 :header-rows: 1 * - Noise type @@ -84,7 +84,7 @@ The "Config key" column shows the name of the noise type in the :ref:`configurat * - Write the wrong digits - ``write_wrong_digits`` - Writing "732 Main St" as your street address instead of "932 Main St" - * - Write the wrong ZIP code digits + * - Write the wrong ZIP code digits - ``write_wrong_zipcode_digits`` - Writing ZIP code 98118 when you actually live in 98112 * - Swap month and day @@ -92,7 +92,7 @@ The "Config key" column shows the name of the noise type in the :ref:`configurat - Reporting 17/05/1976 when a survey asks for the date in MM/DD/YYYY format * - Make typos - ``make_typos`` - - Accidentally typing an "l" instead of a "k" because they are + - Accidentally typing an "l" instead of a "k" because they are right next to each other on a QWERTY keyboard * - Make Optical Character Recognition (OCR) errors - ``make_ocr_errors`` @@ -155,7 +155,7 @@ Noise types for each column - Noise for all types of addresses works in the same way * - State for any address (physical, mailing, or employer) - Decennial Census, ACS, CPS, WIC, W-2 and 1099 - - Leave a field blank, choose the wrong option + - Leave a field blank, choose the wrong option - Noise for all types of addresses works in the same way * - ZIP code for any address (physical, mailing, or employer) - Decennial Census, ACS, CPS, WIC, W-2 and 1099 @@ -184,7 +184,7 @@ Noise types for each column * - Employer ID - W-2 and 1099 - Leave a field blank, write the wrong digits, make typos, make OCR errors - - + - * - Employer name - W-2 and 1099 - Leave a field blank, make typos, make OCR errors From a1bc1bc61661ac14320a9a35475c5bcb2caeaa49 Mon Sep 17 00:00:00 2001 From: Nathaniel Blair-Stahn Date: Wed, 19 Jul 2023 15:11:34 -0700 Subject: [PATCH 06/11] add new columns to noise table --- docs/source/noise/index.rst | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/docs/source/noise/index.rst b/docs/source/noise/index.rst index 83a2de9d..d01e1a34 100644 --- a/docs/source/noise/index.rst +++ b/docs/source/noise/index.rst @@ -161,8 +161,12 @@ Noise types for each column - Decennial Census, ACS, CPS, WIC, W-2 and 1099 - Leave a field blank, write the wrong zipcode digits, make typos, make OCR errors - + * - Housing type + - Decennial Census, ACS + - Leave a field blank, choose the wrong option + - * - Relationship to reference person - - Decennial Census + - Decennial Census, ACS - Leave a field blank, choose the wrong option - * - Sex From ccc7e9652d7ddbe91461715d853f41bef7c60050 Mon Sep 17 00:00:00 2001 From: Nathaniel Blair-Stahn Date: Fri, 21 Jul 2023 15:24:19 -0700 Subject: [PATCH 07/11] Fix typo Co-authored-by: Zeb Burke-Conte --- docs/source/datasets/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/datasets/index.rst b/docs/source/datasets/index.rst index a183b22d..24e1c58d 100644 --- a/docs/source/datasets/index.rst +++ b/docs/source/datasets/index.rst @@ -88,7 +88,7 @@ The following columns are included in this dataset: - :code:`housing_type` - Possible values for housing type are "Household" for an individual household, or one of six different types of group quarters. The types of - instiutional group quarters are "Carceral", "Nursing home", and "Other + institutional group quarters are "Carceral", "Nursing home", and "Other institutional". The types of noninstitutional group quarters are "College", "Military", and "Other noninstitutional". * - Relationship to reference person From 45738f3d71270533382917ec8c1ff802e859fd7a Mon Sep 17 00:00:00 2001 From: Nathaniel Blair-Stahn Date: Fri, 21 Jul 2023 15:24:31 -0700 Subject: [PATCH 08/11] Fix typo Co-authored-by: Zeb Burke-Conte --- docs/source/datasets/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/datasets/index.rst b/docs/source/datasets/index.rst index 24e1c58d..cf6eeb11 100644 --- a/docs/source/datasets/index.rst +++ b/docs/source/datasets/index.rst @@ -170,7 +170,7 @@ The following columns are included in this dataset: - :code:`housing_type` - Possible values for housing type are "Household" for an individual household, or one of six different types of group quarters. The types of - instiutional group quarters are "Carceral", "Nursing home", and "Other + institutional group quarters are "Carceral", "Nursing home", and "Other institutional". The types of noninstitutional group quarters are "College", "Military", and "Other noninstitutional". * - Relationship to reference person From 99c65f0c0bf7075d03eeda683ca9bcd510800d58 Mon Sep 17 00:00:00 2001 From: Nathaniel Blair-Stahn Date: Fri, 21 Jul 2023 15:32:36 -0700 Subject: [PATCH 09/11] change relationship categories in census to match ACS --- docs/source/datasets/index.rst | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/docs/source/datasets/index.rst b/docs/source/datasets/index.rst index cf6eeb11..29db4fe3 100644 --- a/docs/source/datasets/index.rst +++ b/docs/source/datasets/index.rst @@ -94,8 +94,12 @@ The following columns are included in this dataset: * - Relationship to reference person - :code:`relationship_to_reference_person` - Possible values for this indicator include: - Reference person; Biological child; Adopted child; Stepchild; Sibling; Parent; Grandchild; Parent-in-law; Child-in-law; Other relative; - Roommate; Foster child; Other nonrelative; Noninstitutionalized GQ pop; and Institutionalized GQ pop. + "Reference person"; "Opposite-sex spouse"; "Opposite-sex unmarried + partner"; "Same-sex spouse"; "Same-sex unmarried partner"; "Biological + child"; "Adopted child"; "Stepchild"; "Sibling"; "Parent"; "Grandchild"; + "Parent-in-law"; "Child-in-law"; "Other relative"; "Roommate"; "Foster + child"; "Other nonrelative"; "Institutionalized group quarters + population"; and "Noninstitutionalized group quarters population". * - Sex - :code:`sex` - Binary; "male" or "female". From 73d6105d0dc55d411456a23fc68b7d05f91af570 Mon Sep 17 00:00:00 2001 From: Nathaniel Blair-Stahn Date: Fri, 21 Jul 2023 15:38:31 -0700 Subject: [PATCH 10/11] indicator -> field --- docs/source/datasets/index.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/source/datasets/index.rst b/docs/source/datasets/index.rst index 29db4fe3..6f41a61e 100644 --- a/docs/source/datasets/index.rst +++ b/docs/source/datasets/index.rst @@ -93,7 +93,7 @@ The following columns are included in this dataset: "College", "Military", and "Other noninstitutional". * - Relationship to reference person - :code:`relationship_to_reference_person` - - Possible values for this indicator include: + - Possible values for this field include: "Reference person"; "Opposite-sex spouse"; "Opposite-sex unmarried partner"; "Same-sex spouse"; "Same-sex unmarried partner"; "Biological child"; "Adopted child"; "Stepchild"; "Sibling"; "Parent"; "Grandchild"; @@ -179,7 +179,7 @@ The following columns are included in this dataset: "College", "Military", and "Other noninstitutional". * - Relationship to reference person - :code:`relationship_to_reference_person` - - Possible values for this indicator include: + - Possible values for this field include: "Reference person"; "Opposite-sex spouse"; "Opposite-sex unmarried partner"; "Same-sex spouse"; "Same-sex unmarried partner"; "Biological child"; "Adopted child"; "Stepchild"; "Sibling"; "Parent"; "Grandchild"; From ca717075eb0633c10fdd255ab4b990606ebba42c Mon Sep 17 00:00:00 2001 From: Nathaniel Blair-Stahn Date: Fri, 21 Jul 2023 16:04:58 -0700 Subject: [PATCH 11/11] Roommate -> Roommate or housemate --- docs/source/datasets/index.rst | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/docs/source/datasets/index.rst b/docs/source/datasets/index.rst index d993ebbc..e95bfe94 100644 --- a/docs/source/datasets/index.rst +++ b/docs/source/datasets/index.rst @@ -97,9 +97,10 @@ The following columns are included in this dataset: "Reference person"; "Opposite-sex spouse"; "Opposite-sex unmarried partner"; "Same-sex spouse"; "Same-sex unmarried partner"; "Biological child"; "Adopted child"; "Stepchild"; "Sibling"; "Parent"; "Grandchild"; - "Parent-in-law"; "Child-in-law"; "Other relative"; "Roommate"; "Foster - child"; "Other nonrelative"; "Institutionalized group quarters - population"; and "Noninstitutionalized group quarters population". + "Parent-in-law"; "Child-in-law"; "Other relative"; "Roommate or + housemate"; "Foster child"; "Other nonrelative"; "Institutionalized group + quarters population"; and "Noninstitutionalized group quarters + population". * - Sex - :code:`sex` - Binary; "male" or "female". @@ -183,9 +184,10 @@ The following columns are included in this dataset: "Reference person"; "Opposite-sex spouse"; "Opposite-sex unmarried partner"; "Same-sex spouse"; "Same-sex unmarried partner"; "Biological child"; "Adopted child"; "Stepchild"; "Sibling"; "Parent"; "Grandchild"; - "Parent-in-law"; "Child-in-law"; "Other relative"; "Roommate"; "Foster - child"; "Other nonrelative"; "Institutionalized group quarters - population"; and "Noninstitutionalized group quarters population". + "Parent-in-law"; "Child-in-law"; "Other relative"; "Roommate or + housemate"; "Foster child"; "Other nonrelative"; "Institutionalized group + quarters population"; and "Noninstitutionalized group quarters + population". * - Sex - :code:`sex` - Binary; "male" or "female"