Preparing clinical texts for N3C Enclave
This page documents the technical details of contributing extracted NLP concepts to the N3C Enclave. For the standard operating procedure (SOP) for N3C data contribution sites, please refer to the guide from the N3C Phenotype & Data Acquisition Workstream and the N3C Data Ingestion and Harmonization.
Please follow the documentation of OHNLP Backbone to prepare the clinical texts for COVID-19 related concept extraction.
After the clinical concepts are extracted and stored in the OMOP CDM NOTE_NLP table via OHNLP Backbone, the OHDSI CDM NOTE and NOTE_NLP tables should be shared as CSV files.
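As a minimal sketch of the CSV export step, the Python snippet below writes a list of row dictionaries to CSV with a header row. The helper name `write_table_csv` and the in-memory rows are hypothetical; in practice the rows would be read from your OMOP CDM database, and any N3C requirements on delimiters or quoting are not covered here.

```python
import csv
import io


def write_table_csv(rows, columns, out):
    """Write row dicts to a file-like object as CSV, header first.

    Columns missing from a row are written as empty strings; a row with
    a column not listed in `columns` raises ValueError.
    """
    writer = csv.DictWriter(out, fieldnames=columns, extrasaction="raise")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Example with a hypothetical three-column subset of the NOTE table.
buffer = io.StringIO()
write_table_csv(
    [{"note_id": "n1", "person_id": 1}],
    ["note_id", "person_id", "note_text"],
    buffer,
)
```

To write to disk instead, open the target with `open(path, "w", newline="", encoding="utf-8")` and pass the file object as `out`.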
Note that person_id and visit_occurrence_id in the CDM NOTE table should match those used in the other data files in your submission. Per N3C requests, visit_occurrence_id cannot have a non-matching key.
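One way to check the key-matching requirement before submission is sketched below. The function name `find_unmatched_keys` and the idea of passing the known identifiers as Python sets are assumptions for illustration; the identifier sets would be built from the other data files in your submission.

```python
def find_unmatched_keys(note_rows, known_person_ids, known_visit_ids):
    """Return the note_id of every NOTE row whose person_id or
    visit_occurrence_id does not appear in the rest of the submission."""
    unmatched = []
    for row in note_rows:
        person_ok = row["person_id"] in known_person_ids
        visit_ok = row["visit_occurrence_id"] in known_visit_ids
        if not (person_ok and visit_ok):
            unmatched.append(row["note_id"])
    return unmatched


# Hypothetical example rows; "n2" refers to a visit that is not in the submission.
notes = [
    {"note_id": "n1", "person_id": 1, "visit_occurrence_id": 10},
    {"note_id": "n2", "person_id": 2, "visit_occurrence_id": 99},
]
problems = find_unmatched_keys(notes, known_person_ids={1, 2}, known_visit_ids={10})
```

An empty result means every NOTE row has matching keys; any returned note_id should be corrected or excluded before the files are shared.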
The data schema is as follows. Columns marked "truncated" must be truncated because of PHI concerns:
NOTE table

- note_id: string
- person_id: bigint - corresponding to the de-identified person_id in the rest of your submission
- note_date: timestamp (ISO8601 Compliant yyyy-MM-dd'T'HH:mm:ssZ)
- note_datetime: timestamp (ISO8601 Compliant)
- note_type_concept_id: int
- note_class_concept_id: int
- note_title: string (truncated due to PHI concerns)
- note_text: string (truncated due to PHI concerns)
- encoding_concept_id: int
- language_concept_id: int
- provider_id: bigint (truncated due to PHI)
- visit_occurrence_id: bigint - must match the rest of your submission
- visit_detail_id: null
- note_source_value: string (truncated due to PHI)
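The sketch below shows one possible way to prepare a NOTE row for export. It assumes that "truncated" means the value is emptied entirely, which should be confirmed against N3C guidance, and it reads the timestamp pattern above as a Java-style pattern whose `Z` is a numeric zone offset such as `+0000`. The names `NOTE_PHI_COLUMNS`, `scrub_note_row`, and `to_iso8601` are hypothetical.

```python
from datetime import datetime, timezone

# Columns the schema above marks as truncated for PHI reasons.
NOTE_PHI_COLUMNS = ("note_title", "note_text", "provider_id", "note_source_value")


def scrub_note_row(row):
    """Return a copy of a NOTE row with PHI columns emptied.

    Assumption: "truncated" is interpreted as "replaced by an empty value".
    """
    cleaned = dict(row)
    for column in NOTE_PHI_COLUMNS:
        cleaned[column] = ""
    cleaned["visit_detail_id"] = ""  # listed as null in the schema
    return cleaned


def to_iso8601(moment):
    """Format a timezone-aware datetime as yyyy-MM-dd'T'HH:mm:ssZ."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    return moment.strftime("%Y-%m-%dT%H:%M:%S%z")


# Hypothetical example row.
raw = {
    "note_id": "n1",
    "person_id": 1,
    "note_title": "Progress note",
    "note_text": "Free text that may contain PHI",
    "provider_id": 55,
    "visit_occurrence_id": 10,
    "note_source_value": "source-123",
}
clean = scrub_note_row(raw)
stamp = to_iso8601(datetime(2020, 3, 15, 8, 30, 0, tzinfo=timezone.utc))
```

The function returns a copy rather than modifying the input, so the original rows stay available for local use.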
NOTE_NLP table

- note_nlp_id: int
- note_id: string
- section_concept_id: int
- snippet: string (truncated due to PHI)
- offset: int
- lexical_variant: string (truncated due to PHI)
- note_nlp_concept_id: int
- note_nlp_source_concept_id: int (blank)
- nlp_system: string
- nlp_date/nlp_datetime: timestamp (ISO8601 Compliant)
- term_exists: varchar(1) - Y/N
- term_temporal: string
- term_modifiers: string
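A comparable sketch for NOTE_NLP rows follows. It empties the two PHI-marked columns (again assuming "truncated" means emptied, and using the OMOP CDM column name lexical_variant), leaves note_nlp_source_concept_id blank, and rejects any term_exists value other than Y or N. The function name `prepare_note_nlp_row` is hypothetical.

```python
# Columns the NOTE_NLP schema marks as truncated for PHI reasons.
NOTE_NLP_PHI_COLUMNS = ("snippet", "lexical_variant")


def prepare_note_nlp_row(row):
    """Return a copy of a NOTE_NLP row that is ready for export.

    Raises ValueError if term_exists is not "Y" or "N".
    """
    if row.get("term_exists") not in ("Y", "N"):
        raise ValueError(
            f"term_exists must be 'Y' or 'N', got {row.get('term_exists')!r}"
        )
    cleaned = dict(row)
    for column in NOTE_NLP_PHI_COLUMNS:
        cleaned[column] = ""
    cleaned["note_nlp_source_concept_id"] = ""  # blank per the schema
    return cleaned


# Hypothetical example row with placeholder values.
nlp_row = {
    "note_nlp_id": 1,
    "note_id": "n1",
    "snippet": "text surrounding the mention",
    "lexical_variant": "fever",
    "note_nlp_concept_id": 0,
    "term_exists": "Y",
}
export_row = prepare_note_nlp_row(nlp_row)
```

Validating term_exists before export catches values such as `true` or `1` that an upstream step might emit in place of the expected single character.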
For the definition of these columns, please refer to the OHDSI CDM documentation of the NOTE and NOTE_NLP tables.
This site, including its concept glossary, risk factors, and architecture, is a demonstration of work in progress by the N3C and OHNLP groups. The contents of this page are licensed under the Apache License, Version 2.0.