Data and Logic Liaisons support clinicians, analysts, and data scientists who want to conduct research on the harmonized clinical data available through the N3C Data Enclave. To perform research, users need to identify key variables for analysis. These variables are generated through Code Workbooks and Templates that use specific Concept Sets (lists of key variables from constituent vocabularies) to identify and extract the data needed to answer research questions. Through interaction with Domain Teams, the Data and Logic Liaisons continually develop and refine a core set of N3C Recommended Concept Sets and code templates that generate commonly used variables and support efficient customization by research teams.
Data Liaison Team
Leadership: Christopher Chute (Johns Hopkins University)
Team: Harold Lehmann MD PhD, Richard Zhu MD, Tanner Zhang MD, Stephanie Hong, Sigfried Gold, and Lisa Eskenazi (Johns Hopkins University) and Timothy Bergquist (Sage Bionetworks).
Logic Liaison Team
Leadership: Johanna Loomba (University of Virginia) and Richard Moffitt (Emory)
Lead Template Developers: Andrea Zhou (University of Virginia) and Evan French (Virginia Commonwealth University).
Other core members and code reviewers: Steve Johnson (University of Minnesota), Alfred (Jerrod) Anzalone (University of Nebraska), Amy Olex (Virginia Commonwealth University)
Access to Data Liaison and Logic Liaison Tools and Services
The community tools developed by the liaison teams are listed below and are linked from the enclave home page. They help researchers leverage established concept sets and code templates and learn how to create custom derived fact tables for their research questions. Data Liaisons and Logic Liaisons have provided training videos on the use of these tools, as well as Community Notes on best practices. Personalized liaison assistance with these tools is provided during N3C Office Hours. Support can also be received by submitting a help desk technical support ticket in the N3C enclave. The liaisons will send a representative to your Domain Team meetings on an as-needed basis for general consultation.
N3C Data and Logic Liaison Community Tools
In collaboration with Palantir, the Data Liaisons have specified the features needed for these tools and the best-practice workflow for using them. The Concept Set Browser is the first step for deciding whether a new concept is needed, as well as for choosing a Concept Set to include in an analysis as a building block for an analytic variable. The Concept Set Editor (accessed via the Browser) is where new Concept Sets are created, either as a modification of existing versions or from scratch. Tutorials are available from each site. (For further documentation, see below.)
The Concept Set Browser can be used to find curated concept sets. The most broadly applicable are Concept Sets with the status “N3C Recommended”, which have been thoroughly reviewed and deemed appropriate for broad reuse by Domain Team Leads and Data and Logic Liaisons. To find these Concept Sets, select the N3C Recommended Concept Set filter in the browser. All other community concept sets with documented vocabulary and clinical reviews can be found by filtering for “Provisionally Approved.” Also accessed via the Concept Set Browser are “Bundles”, which are sets of Concept Sets that a Domain Team has asked the Data Liaisons to group together because they are topically related to each other but have distinct scope or intended use.
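The status filter described above can be pictured with a small Python sketch. This is an illustration only, not the Concept Set Browser's interface: the `ConceptSet` class, the `filter_by_status` function, the set names, the concept IDs, and the "Draft" status are all hypothetical placeholders. Only the "N3C Recommended" and "Provisionally Approved" status labels come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptSet:
    """Hypothetical stand-in for a concept set record (not an N3C schema)."""
    name: str
    status: str
    concept_ids: tuple  # placeholder integers, not real OMOP concept IDs


def filter_by_status(concept_sets, status):
    """Keep only the concept sets whose review status matches exactly."""
    return [cs for cs in concept_sets if cs.status == status]


# Made-up catalog used purely for illustration.
catalog = [
    ConceptSet("example diabetes set", "N3C Recommended", (1, 2, 3)),
    ConceptSet("example asthma set", "Provisionally Approved", (4, 5)),
    ConceptSet("example work-in-progress set", "Draft", (6,)),  # hypothetical status
]

recommended = filter_by_status(catalog, "N3C Recommended")
provisional = filter_by_status(catalog, "Provisionally Approved")
print([cs.name for cs in recommended])
print([cs.name for cs in provisional])
```

In the enclave itself, the same selection is made interactively through the browser's filters rather than in code.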
Logic Liaisons have built and maintain a phenotype explorer that allows researchers to become familiar with the N3C phenotype by browsing common comorbidities among All N3C Patients as well as the Confirmed and Possible COVID-19 sub-cohorts.
Logic Liaison code templates accelerate N3C analysis by providing commonly used variables and methods to quickly add custom elements. To find these templates, enter “Logic Liaison Templates” into the N3C Knowledge Store search field.
Logic Liaisons develop, disseminate, and maintain two master fact templates that each produce visit-level and person-level data frames of commonly used derived variables for All N3C Patients as well as for the subset who have an index date for their acute COVID-19 infection, the Confirmed COVID-19 Positive Patients (PCR/AG positive or U07.1 COVID-19 diagnosed). The current fact tables contain over 75 variables including demographics, common comorbidities, vaccination status, observation periods, and data extraction details.
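As a rough illustration of the cohort definition above, the following Python sketch derives an index date per patient from positive PCR/AG results or a U07.1 diagnosis. It is not the Logic Liaison template code: the function name, the tuple layouts, and the sample records are hypothetical, and taking the earliest qualifying event as the index date is an assumption made here for the example.

```python
from datetime import date


def covid_index_dates(lab_results, diagnoses):
    """Return {person_id: earliest qualifying date}.

    lab_results: iterable of (person_id, date, test_type, result)
    diagnoses:   iterable of (person_id, date, code)
    Both layouts are hypothetical and chosen for this sketch.
    """
    # A positive PCR or antigen (AG) test qualifies a patient.
    qualifying = [
        (pid, d)
        for pid, d, test_type, result in lab_results
        if test_type in {"PCR", "AG"} and result == "positive"
    ]
    # A U07.1 COVID-19 diagnosis also qualifies a patient.
    qualifying += [(pid, d) for pid, d, code in diagnoses if code == "U07.1"]

    # Assumption: the index date is the earliest qualifying event.
    index = {}
    for pid, d in qualifying:
        if pid not in index or d < index[pid]:
            index[pid] = d
    return index


# Made-up records for three patients.
labs = [
    (1, date(2021, 3, 5), "PCR", "positive"),
    (1, date(2021, 2, 20), "AG", "negative"),
    (2, date(2021, 4, 1), "AG", "positive"),
]
dx = [
    (1, date(2021, 3, 1), "U07.1"),
    (3, date(2021, 5, 10), "J45.909"),
]

index_dates = covid_index_dates(labs, dx)
print(index_dates)
```

Patient 3 has neither a positive test nor a U07.1 diagnosis, so in this sketch they remain part of the overall patient set but receive no index date.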
Logic Liaisons also develop, disseminate, and maintain several other types of code templates including:
- Sister templates that feed from the visit-level and person-level datasets of the master fact template to efficiently generate additional derived variables based on broadly requested and applicable logic such as study-specific fact indexing, co-occurrence, and CCI score calculations.
- A template that integrates key Social Determinants of Health features into your fact tables.
- Data quality templates that produce visualizations reflecting the data density of sites by OMOP domain and by study variables. These templates both assess relative density and look for systematic missingness of facts from particular sites.
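The idea behind the data quality templates in the list above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the templates' actual logic: density is taken here to mean facts per patient for each site and domain, a site is flagged when it contributes no facts at all for a domain, and the function names, site labels, and counts are hypothetical.

```python
from collections import defaultdict


def facts_per_patient(fact_rows, patients_per_site):
    """Compute facts per patient for every (site, domain) pair.

    fact_rows: iterable of (site, domain), one entry per recorded fact
    patients_per_site: {site: number of patients}
    """
    fact_rows = list(fact_rows)
    counts = defaultdict(int)
    for site, domain in fact_rows:
        counts[(site, domain)] += 1
    domains = sorted({domain for _, domain in fact_rows})
    # Every site gets a value for every domain, so gaps show up as 0.0.
    return {
        site: {dom: counts[(site, dom)] / n for dom in domains}
        for site, n in patients_per_site.items()
    }


def sites_missing_domain(density):
    """List (site, domain) pairs for which a site has no facts at all."""
    return sorted(
        (site, dom)
        for site, by_domain in density.items()
        for dom, value in by_domain.items()
        if value == 0
    )


# Made-up facts from two hypothetical sites.
rows = [
    ("site_a", "condition"),
    ("site_a", "condition"),
    ("site_a", "measurement"),
    ("site_b", "condition"),
]
patients = {"site_a": 2, "site_b": 1}

density = facts_per_patient(rows, patients)
missing = sites_missing_domain(density)
print(density)
print(missing)
```

Normalizing by patient count keeps large and small sites comparable, which is one simple way to read "relative density"; a production check would likely use more robust thresholds than an exact zero.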