karafecho edited this page Aug 5, 2024 · 9 revisions

Integrated Clinical and Environmental Exposures Service (ICEES)

Brief Overview

Description: ICEES Knowledge Graph (KG) is an open service that exposes clinical data (e.g., electronic health records, clinical study data) that have been integrated at the patient level with public exposures data (e.g., airborne pollutants, major roadways/highways, concentrated animal feeding operations, landfills). Edges report pairwise positive and negative correlations between feature variables.

Example edge (interpretation): ICEES KG shows that cystic fibrosis is correlated with average daily exposure to PM2.5, with a Chi Square statistic of 19.55 and a P value of 0.0006 (N = 1296) in a cohort of patients from UNC Health in year 2010.

Data source(s): ICEES KG exposes data from varied sources, including electronic health record data, clinical study data, and environmental exposures data.

Key methodologic metrics: ICEES KG provides pairwise correlations between feature variables, reported with the Chi Square statistic, Chi Square P value, odds ratio, log odds ratio, 95% confidence interval for the log odds ratio, Fisher's exact P value, and sample size.

Regulatory requirement(s) and/or licensing restriction(s): The service is compliant with all federal and institutional regulations.
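The metrics listed above can be illustrated on a single 2x2 contingency table. The sketch below is not the ICEES implementation, which the page does not describe; it assumes two binary features, a Pearson Chi Square test without continuity correction, and a Wald-style confidence interval. The function name and the example counts are hypothetical. ICEES features may have more than two bins, in which case the Chi Square test has more than one degree of freedom.

```python
import math


def two_by_two_metrics(a, b, c, d, z=1.96):
    """Association metrics for the 2x2 table [[a, b], [c, d]].

    Rows: feature 1 present/absent. Columns: feature 2 present/absent.
    All four cells must be non-zero for the log odds ratio to be defined.
    """
    n = a + b + c + d
    # Pearson Chi Square statistic, 1 degree of freedom, no continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a Chi Square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    log_odds_ratio = math.log(odds_ratio)
    # Wald standard error and interval on the log scale (an assumption here).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (log_odds_ratio - z * se, log_odds_ratio + z * se)
    return {
        "n": n,
        "chi2": chi2,
        "p_value": p_value,
        "odds_ratio": odds_ratio,
        "log_odds_ratio": log_odds_ratio,
        "log_odds_ratio_ci": ci,
    }


# Hypothetical counts, chosen only to exercise the function.
metrics = two_by_two_metrics(30, 70, 10, 90)
```

Fisher's exact P value is omitted to keep the sketch short; it is computed from the hypergeometric distribution over the same four counts.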

Detailed Description

ICEES provides a regulatory-compliant, open framework and approach for exposing and exploring sensitive patient data (e.g., electronic health records, clinical research data, survey data) that have been integrated with a variety of public environmental exposures data (e.g., airborne pollutants, major roadways/highways, landfills, concentrated animal feeding operations, socio-environmental indicators). The design of ICEES is use-case driven, which means that different ICEES endpoints provision data on different cohorts and data elements, albeit with overlap in certain cases.

ICEES is accessible through two general services: (1) ICEES+ is fully featured and supports functionalities such as dynamic cohort creation and exploratory bivariate and multivariate analysis; and (2) ICEES KG is static and supports knowledge graph queries over a precomputed correlation matrix that provides pairwise comparisons between feature variables. Both ICEES+ and ICEES KG support multiple use cases. Users of both services are provided with a total sample size and statistical metrics on precomputed correlations: Chi Square statistic, degrees of freedom, and P value; Fisher's Exact odds ratio and P value; and log odds ratio with its 95% confidence interval.

Note that ICEES KG cohorts are described in edge attributes by file name. For example, cohort "PCD_UNC_patient_2014_v6_binned_deidentified|pcd|v6|2023_02_06_16_21_25" describes a cohort focused on primary ciliary dyskinesia from patients at UNC Health in year 2014; the remainder of the file name indicates that the data are binned and deidentified, in full regulatory and institutional compliance, and were derived from a v6 dataset generated in 2023.
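A cohort identifier of this shape can be split into its parts programmatically. The sketch below is inferred from the single example above and is not a documented ICEES format: the field names (`file_stem`, `use_case`, `version`, `generated`), the `CohortId` class, and the assumption that the last field is a `YYYY_MM_DD_HH_MM_SS` timestamp are all hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CohortId:
    file_stem: str        # e.g. "PCD_UNC_patient_2014_v6_binned_deidentified"
    use_case: str         # e.g. "pcd" (assumed to be a use-case tag)
    version: str          # e.g. "v6"
    generated: datetime   # assumed dataset generation timestamp
    year: Optional[int]   # cohort year found in the file stem, if any
    binned: bool
    deidentified: bool


def parse_cohort_id(raw: str) -> CohortId:
    """Split a pipe-delimited cohort identifier into labeled fields."""
    file_stem, use_case, version, stamp = raw.split("|")
    tokens = file_stem.split("_")
    # First four-digit token surrounded by underscores is taken as the cohort year.
    year_match = re.search(r"_(\d{4})_", file_stem)
    return CohortId(
        file_stem=file_stem,
        use_case=use_case,
        version=version,
        generated=datetime.strptime(stamp, "%Y_%m_%d_%H_%M_%S"),
        year=int(year_match.group(1)) if year_match else None,
        binned="binned" in tokens,
        deidentified="deidentified" in tokens,
    )


cohort = parse_cohort_id(
    "PCD_UNC_patient_2014_v6_binned_deidentified|pcd|v6|2023_02_06_16_21_25"
)
```

Identifiers that do not have exactly four pipe-delimited fields raise a `ValueError` in this sketch, which is a reasonable failure mode when the format is only assumed.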

Example Observations

  • ICEES KG shows that asthma is positively correlated with fexofenadine, with a log odds ratio of 2.15 (95% confidence interval: [1.99, 2.32]; N = 157,412) in a cohort of patients from UNC Health in year 2014.
  • ICEES KG shows that cystic fibrosis is correlated with average daily exposure to PM2.5, with a Chi Square statistic of 19.55 and a P value of 0.0006 (N = 1296) in a cohort of patients from UNC Health in year 2010.
  • ICEES KG shows that cystic fibrosis is positively correlated with cetirizine, with a log odds ratio of 1.96 (95% confidence interval: [1.54, 2.37]; N = 5688) in a cohort of patients from UNC Health in year 2016.
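A log odds ratio is often easier to read once exponentiated back to an odds ratio. The sketch below does this for the asthma and fexofenadine observation above. The helper name is hypothetical, and the implied standard error assumes the reported interval is a symmetric Wald interval on the log scale, which the page does not state.

```python
import math


def odds_ratio_summary(log_or, ci_low, ci_high, z=1.96):
    """Convert a log odds ratio and its interval to the odds-ratio scale."""
    return {
        "odds_ratio": math.exp(log_or),
        "odds_ratio_ci": (math.exp(ci_low), math.exp(ci_high)),
        # Only meaningful if the interval is log_or +/- z * se (an assumption).
        "implied_se": (ci_high - ci_low) / (2 * z),
    }


# Values taken from the asthma / fexofenadine observation (UNC Health, 2014).
asthma_fexofenadine = odds_ratio_summary(2.15, 1.99, 2.32)
```

Under these assumptions, a log odds ratio of 2.15 corresponds to an odds ratio of roughly 8.6, so the correlation is reported on a scale where 0, not 1, means no association.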

Select References

Fecho K, Pfaff E, Xu H, Champion J, Cox S, Stillwell L, Bizon C, Peden D, Krishnamurthy A, Tropsha A, Ahalt SC. A novel approach for exposing and sharing clinical data: the Translator Integrated Clinical and Environmental Exposures Service. J Am Med Inform Assoc 2019;26(10):1064–1073. doi: 10.1093/jamia/ocz042.

Fecho K,* Haaland P, Krishnamurthy A, Lan B, Ramsey S, Schmitt PL, Sharma P, Sinha M, Xu H. An approach for open multivariate analysis of integrated clinical and environmental exposures data. Inform Med Unlocked 2021;26:100733. doi: 10.1016/j.imu.2021.100733. *Apart from first/lead author, all other authors are listed in alphabetical order.

Lan B,* Haaland P, Krishnamurthy A, Peden DB, Schmitt PL, Sharma P, Sinha M, Xu H, Fecho K. Open application of statistical and machine learning models to explore the impact of environmental exposures on health and disease: an asthma use case. Int J Environ Res Public Health 2021;18(21):11398 [published as part of a special issue titled “Application of Biostatistical Modelling in Public Health and Epidemiology”]. doi: 10.3390/ijerph182111398. *Apart from first/lead and last/senior author, all other authors are listed in alphabetical order.

Fecho K,* Ahalt SC, Appold S, Arunachalam S, Pfaff E, Stillwell L, Valencia A, Xu H, Peden D. Development and application of an open tool for sharing and analyzing integrated clinical and environmental exposures data: asthma use case. JMIR Form Res 2022;6(4):e32357. doi: 10.2196/32357. *Apart from first/lead and last/senior author, all other authors are listed in alphabetical order.

Fecho K,* Ahalt SC, Knowles M, Krishnamurthy A, Leigh M, Morton K, Pfaff E, Wang M, Yi H. Leveraging open electronic health record data and environmental exposures data to derive insights into rare pulmonary disease. Front Artif Intell 2022;5:918888 (special issue on Biomedical Informatics Applications in Rare Diseases). doi: 10.3389/frai.2022.918888. *Apart from the first author, all authors are listed in alphabetical order.

Sharma P,* Haaland P, Krishnamurthy A, Lan B, Schmitt PL, Sinha M, Xu H, Fecho K. Evaluating robustness of a generalized linear model when applied to electronic health record data accessed using an openAPI. Health Informatics J 2023;29(2) (April-June 2023). doi: 10.1177/14604582231170892. *Apart from first/lead and last/senior author, all other authors are listed in alphabetical order.

Sinha M, Haaland P, Krishnamurthy A, Lan B, Ramsey SA, Schmitt PL, Sharma P, Xu H, Fecho K. Causal analysis for multivariate integrated clinical and environmental exposures data. BMC Medical Informatics and Decision Making, under review. medRxiv preprint is available here: https://www.medrxiv.org/content/10.1101/2022.12.20.22283734v1.

External Documentation

ICEES+

*The link above takes users to a web page that includes a user manual; however, please note that the page is hosted on a site that may contain outdated information on other pages.

Use Cases

  • Asthma and related common respiratory disorders
  • Primary ciliary dyskinesia and related rare respiratory disorders
  • Drug-induced liver injury
  • Coronavirus infection

Supporting Data Sources

Terms and Conditions of Use

  • ICEES+ and ICEES KG terms and conditions of use can be accessed here.

GitHub Repositories and Technical User Guides

Modes of Access

TRAPI endpoint:

ICEES+ endpoints:

ICEES+ SMC OpenAPI endpoints:

Environmental Exposures OpenAPIs

Issues should be posted in the ICEES+ GitHub repository or the ICEES KG GitHub repository.
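For the TRAPI mode of access listed above, a request body is a JSON message containing a query graph of nodes and edges. The sketch below builds a one-hop query in Python. It follows the general TRAPI message layout; the helper name, the CURIE `MONDO:0004979` for asthma, the Biolink categories, and the predicate `biolink:correlated_with` are assumptions that should be checked against the ICEES KG documentation before use. No endpoint URL is included because none is given on this page.

```python
import json


def build_trapi_query(subject_curie, subject_category, object_category,
                      predicate="biolink:correlated_with"):
    """Build a one-hop TRAPI query: a pinned subject node, an open object node."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    # Pinned node: a specific concept identified by CURIE.
                    "n0": {"ids": [subject_curie], "categories": [subject_category]},
                    # Unpinned node: any concept in the requested category.
                    "n1": {"categories": [object_category]},
                },
                "edges": {
                    "e0": {"subject": "n0", "object": "n1", "predicates": [predicate]},
                },
            }
        }
    }


# Hypothetical question: which chemical entities are correlated with asthma?
query = build_trapi_query("MONDO:0004979", "biolink:Disease", "biolink:ChemicalEntity")
request_body = json.dumps(query)
```

The serialized `request_body` would typically be sent as an HTTP POST to a TRAPI `/query` path; statistics such as the Chi Square P value or log odds ratio would then be read from the attributes of the returned edges.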

CAM/AOP KP

Exposures Provider also supports the CAM (Causal Activity Model)/AOP (Adverse Outcome Pathway) Knowledge Provider (KP), described at CAM Provider KG.
