This repository contains the code and data for the paper:
[Online Data Collection for Efficient Semiparametric Inference](https://arxiv.org/abs/2411.03195)
Shantanu Gupta, Zachary Lipton, David Childers
We use miniconda and poetry for managing dependencies. The following commands set up the Python dependencies:
```bash
# Create and activate the conda environment.
conda env create --name oms --file environment.yml
conda activate oms

# Install poetry.
pip install poetry

# Install the Python dependencies.
poetry install
```
To execute the experiments in parallel, we use ipyparallel. The following command starts a cluster with the desired number of engines:

```bash
ipcluster start -n <num_engines>
```
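For reference, here is a minimal sketch (not taken from the notebooks) of how code can connect to the cluster started above and distribute work across the engines; the `run_trial` function and the number of tasks are purely illustrative placeholders.

```python
# Sketch of connecting to a running ipcluster; run_trial is a hypothetical
# placeholder, not a function from this repository.
import ipyparallel as ipp

rc = ipp.Client()               # connects to the cluster started by `ipcluster start`
view = rc.load_balanced_view()  # schedules tasks across all available engines

def run_trial(seed):
    # Placeholder for a single experiment run; replace with actual experiment code.
    return seed

# Blocks until all 100 illustrative tasks have finished on the engines.
results = view.map_sync(run_trial, range(100))
print(len(results))
```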
The following Jupyter notebooks contain the code for running the experiments in our paper:
- `nonlinear_iv_LATE_main.ipynb`: Code for the experiment for the nonlinear instrumental variable (IV) graph (Figure 3).
- `jtpa_iv_LATE_main_MLP.ipynb`: Code for the experiment with the JTPA dataset (Figure 4a).
- `copd_data_main.ipynb`: Code for the experiment with the COPD dataset (Figure 4b).
- `linear_iv_LATE_main.ipynb`: Code for the experiment for the linear IV graph (Figure 7a).
- `linear_frontdoor_backdoor_main.ipynb`: Code for the experiment for the linear confounder-mediator graph (Figure 7b).
- `observational_two_covariates_main.ipynb`: Code for the experiment for combining two observational datasets (Figure 7c).
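If you prefer to run a notebook non-interactively (for example, on a remote machine), something along these lines should work; this is only a sketch that assumes `nbformat` and `nbclient` (installed alongside Jupyter) and uses `nonlinear_iv_LATE_main.ipynb` purely as an example.

```python
# Sketch of executing one of the notebooks headlessly and saving the output.
import nbformat
from nbclient import NotebookClient

nb = nbformat.read("nonlinear_iv_LATE_main.ipynb", as_version=4)
NotebookClient(nb, timeout=None).execute()  # run all cells in order
nbformat.write(nb, "nonlinear_iv_LATE_main_executed.ipynb")
```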
We also have the following additional notebooks:
- `jtpa_data_processing.ipynb`: Generates the `datasets/jtpa_processed.pkl` file used for our experiments.
- `jtpa_IV_true_LATE.ipynb`: Computes the ground-truth LATE for the JTPA dataset.
- `copd_data_true_ATE.ipynb`: Computes the ground-truth ATE for the COPD dataset.
If you find this code useful, please consider citing our work:
```bibtex
@misc{gupta2024onlinedatacollectionefficient,
  title={Online Data Collection for Efficient Semiparametric Inference},
  author={Shantanu Gupta and Zachary C. Lipton and David Childers},
  year={2024},
  eprint={2411.03195},
  archivePrefix={arXiv},
  primaryClass={stat.ML},
  url={https://arxiv.org/abs/2411.03195},
}
```