Household structure variables from the PSID #6

Open
jdebacker opened this issue Oct 28, 2020 · 4 comments

Comments

@jdebacker
Member

@MaxGhenis and @prrathi are working on an estimate of the number of children per household. The current version of the psid_download.R script in this repo pulls a few related variables:

                        # Demographics
                        head_age="ER17013",
                        spouse_age="ER47319",
                        head_gender="ER47318",
                        head_num_children="V10977",
                        num_children="ER37724",
                        num_children_away_from_home="V561",
                        num_children_under18="ER47320",

Are there additional variables you two would like pulled? FYI, PSID variable search is here.

@jdebacker changed the title from "Variables from the PSID" to "Household structure variables from the PSID" on Oct 28, 2020
@MaxGhenis
Contributor

Thanks @jdebacker, what's currently in there should be a good start. Do you know if PSID family units also include elderly dependents? I couldn't find fields on this, and it seems like the biggest potential gap.

The cleanest approach would probably be to use individual ages and then merge back to the family unit level, but that might not be worth it.
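As a rough illustration of the individual-ages idea, here is a minimal pandas sketch that flags each person's age band, sums the flags per family, and merges the counts back to a family-level frame. The column names `family_id` and `age` and all of the data are hypothetical placeholders, not actual PSID variable names or values.

```python
import pandas as pd

# Hypothetical individual-level records: one row per person.
# `family_id` and `age` are illustrative names, not PSID variables.
people = pd.DataFrame({
    "family_id": [1, 1, 1, 2, 2, 3],
    "age":       [40, 38, 10, 70, 67, 17],
})

# One 0/1 flag per age band for every person.
age = people["age"]
flags = pd.DataFrame({
    "family_id": people["family_id"],
    "nu18": (age < 18).astype(int),
    "n1864": age.between(18, 64).astype(int),  # inclusive on both ends
    "n65": (age >= 65).astype(int),
})

# Sum the flags within each family to get head counts per band.
counts = flags.groupby("family_id", as_index=False).sum()

# Hypothetical family-level frame to merge the counts back onto.
families = pd.DataFrame({"family_id": [1, 2, 3]})
families = families.merge(counts, on="family_id", how="left")
print(families)
```

Counting from individual ages this way would also pick up elderly dependents, since every family member is bucketed regardless of their relationship to the head.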

@MaxGhenis
Contributor

To be specific, here's how I think we can calculate each metric used for the UBI:

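# Cast each boolean to int before adding: bool + bool in pandas is a logical OR,
# so two adults in the same age band would otherwise count as 1.
nu18 = (head_age < 18).astype(int) + (spouse_age < 18).astype(int) + num_children_under18
# Temp: assume all adult children are under 65.
num_adult_children = num_children - num_children_under18
n1864 = head_age.between(18, 64).astype(int) + spouse_age.between(18, 64).astype(int) + num_adult_children
n65 = (head_age > 64).astype(int) + (spouse_age > 64).astype(int)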

Another consideration is that taxcalc currently also has an n1820 for UBI policies that give a different amount to this age group. Parity would require using individual ages. I don't think this is too important, though; it's probably just for one semi-prominent UBI proposal (Charles Murray's).

@MaxGhenis
Contributor

num_children is null for 98% of records, so we might have to just use num_children_under18 and skip adult dependents, at least to start.

# Run in a Jupyter/Colab notebook; the leading "!" runs a shell command.
!wget https://github.com/jdebacker/OG-USA-Calibration/raw/master/EarningsProcesses/psid_data_files/psid1968to2015.RData
import pyreadr
psid = pyreadr.read_r('psid1968to2015.RData')['psid_df']
# Share of missing values in each column.
psid[['head_age', 'spouse_age', 'num_children_under18', 'num_children']].isna().mean()
head_age                0.000000
spouse_age              0.000000
num_children_under18    0.000000
num_children            0.982549
dtype: float64
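
A minimal sketch of that fallback, dropping num_children and counting only the head, the spouse, and children under 18. The data below is made up. The sketch also assumes that a missing spouse is recorded as spouse_age == 0, which would be consistent with spouse_age never being null but is not confirmed anywhere in this thread; if absent spouses are coded differently, the has_spouse mask needs to change.

```python
import pandas as pd

# Toy family-level records; values are invented for illustration.
df = pd.DataFrame({
    "head_age": [30, 70, 17, 45],
    "spouse_age": [28, 66, 0, 0],  # ASSUMPTION: 0 means no spouse present
    "num_children_under18": [2, 0, 1, 3],
})

# ASSUMPTION: only count a spouse when an age above 0 is recorded.
has_spouse = df["spouse_age"] > 0


def as_count(mask):
    """Turn a boolean Series into 0/1 so that sums count people."""
    return mask.astype(int)


df["nu18"] = (
    as_count(df["head_age"] < 18)
    + as_count(has_spouse & (df["spouse_age"] < 18))
    + df["num_children_under18"]
)
# Adult children are skipped here because num_children is mostly null.
df["n1864"] = (
    as_count(df["head_age"].between(18, 64))
    + as_count(has_spouse & df["spouse_age"].between(18, 64))
)
df["n65"] = (
    as_count(df["head_age"] > 64)
    + as_count(has_spouse & (df["spouse_age"] > 64))
)
print(df[["nu18", "n1864", "n65"]])
```

Without the has_spouse mask, a family with no spouse would get one extra person in nu18 under this coding assumption, because 0 < 18 evaluates to true.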

@MaxGhenis
Contributor

I think we can close this and discuss in #9.
