This data contains COVID+ patients within the N3C Data Enclave with diagnosis dates on or after December 22, 2021 (the date that Paxlovid was authorized for use by the FDA).
A COVID-positive patient is defined as any patient having one of the following within their EHR records:
- Laboratory confirmed positive COVID-19 PCR or Antigen test
- Laboratory confirmed positive COVID-19 Antibody test
- Medical visit in which the ICD-10 code for COVID-19 (U07.1) was recorded
- Note: patients identified only by a condition diagnosis have no record of a positive PCR/Antigen or Antibody test within their EHR; they were diagnosed with COVID-19 based on the symptoms they displayed.
A Charlson Comorbidity Index (CCI) Score is calculated by identifying every CCI comorbidity attached to a medical visit within a patient’s EHR and weighting each comorbidity according to its associated category. The sum of all the weights is a single comorbidity score. More details on the CCI and its 17 associated comorbidities can be found here.
- The calculated score is as of a patient’s COVID-19 diagnosis date.
- A patient with undiagnosed CCI conditions not recorded in their EHR would not be represented here. Additionally, a patient may have a CCI condition for which they have not required a medical visit, which would exclude them from representation.
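The weighted sum described above can be sketched as follows. This is a minimal illustration, not the Enclave's implementation: the category names are hypothetical, the weights are the commonly cited Charlson weights (the notes above do not state which weights are used), and "as of the diagnosis date" is assumed to mean visits on or before that date.

```python
from datetime import date

# Assumed weights: the commonly cited Charlson weighting for 17 categories.
# The category keys are hypothetical labels chosen for this sketch.
CCI_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "peripheral_vascular_disease": 1,
    "cerebrovascular_disease": 1,
    "dementia": 1,
    "chronic_pulmonary_disease": 1,
    "rheumatic_disease": 1,
    "peptic_ulcer_disease": 1,
    "mild_liver_disease": 1,
    "diabetes_without_complications": 1,
    "diabetes_with_complications": 2,
    "hemiplegia_or_paraplegia": 2,
    "renal_disease": 2,
    "any_malignancy": 2,
    "moderate_or_severe_liver_disease": 3,
    "metastatic_solid_tumor": 6,
    "aids_hiv": 6,
}


def cci_score(visit_conditions, covid_diagnosis_date):
    """Sum the weights of the distinct CCI categories recorded on visits.

    visit_conditions: iterable of (visit_date, category) pairs.
    Only visits on or before covid_diagnosis_date are counted (assumption),
    and each category contributes its weight once, however often it recurs.
    """
    seen = {
        category
        for visit_date, category in visit_conditions
        if visit_date <= covid_diagnosis_date and category in CCI_WEIGHTS
    }
    return sum(CCI_WEIGHTS[category] for category in seen)


if __name__ == "__main__":
    records = [
        (date(2021, 5, 1), "renal_disease"),
        (date(2021, 6, 1), "renal_disease"),           # repeat, counted once
        (date(2021, 7, 1), "dementia"),
        (date(2022, 3, 1), "metastatic_solid_tumor"),  # after diagnosis, ignored
    ]
    print(cci_score(records, date(2022, 1, 10)))
```

Because only recorded visits feed the sum, a comorbidity that never appears on a visit contributes nothing, which is the limitation noted in the bullets above.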
Severity of COVID-19 is a calculation based on multiple events recorded in a patient's EHR during their medical visit. The severity score for each patient may be inaccurate due to missing information within the EHR. Patients will only be graded on Severity if they have a laboratory-confirmed positive PCR or Antigen test. Below are the definitions of each Severity Category.
- Mild - The patient has no record of Emergency Room visits or hospitalization for COVID-19
- ED Visit (not admitted) - The patient had an Emergency Room (ER) visit for COVID-19, but we have no record of hospitalization (Inpatient) for COVID-19
- Moderate Hospitalized - The patient was hospitalized (Inpatient visit) for COVID-19 AND did not receive ECMO OR Invasive Ventilation
- Mortality - The patient’s records show a date of death
- Severe Ventilation/ECMO/AKI - The patient was hospitalized for COVID-19 AND received Extracorporeal Membrane Oxygenation (ECMO) OR received Invasive Ventilation
- Unavailable - Patients who did not have a lab-confirmed positive PCR or Antigen COVID test. This includes patients who were diagnosed only based on the symptoms they displayed or patients who do not have any recorded COVID-19 diagnosis within the Enclave.
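The category definitions above can be read as a decision procedure, sketched below. The field names are hypothetical, and the order in which overlapping categories are resolved (for example, a hospitalized patient who also has a recorded date of death) is an assumption of this sketch; the definitions above do not state a precedence.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PatientEvents:
    """Hypothetical per-patient summary of EHR events used for grading."""
    lab_confirmed_pcr_or_antigen: bool
    ed_visit_for_covid: bool = False
    hospitalized_for_covid: bool = False
    ecmo: bool = False
    invasive_ventilation: bool = False
    death_date: Optional[date] = None


def severity_category(p: PatientEvents) -> str:
    # Patients without a lab-confirmed PCR or Antigen test are not graded.
    if not p.lab_confirmed_pcr_or_antigen:
        return "Unavailable"
    # Assumed precedence: a recorded date of death outranks other categories.
    if p.death_date is not None:
        return "Mortality"
    if p.hospitalized_for_covid:
        if p.ecmo or p.invasive_ventilation:
            return "Severe Ventilation/ECMO/AKI"
        return "Moderate Hospitalized"
    if p.ed_visit_for_covid:
        return "ED Visit (not admitted)"
    return "Mild"


if __name__ == "__main__":
    patient = PatientEvents(
        lab_confirmed_pcr_or_antigen=True,
        hospitalized_for_covid=True,
        invasive_ventilation=True,
    )
    print(severity_category(patient))
```

Every flag defaults to "no record", so an event missing from the EHR pushes a patient toward a milder category, which is how incomplete records can make the score inaccurate.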
The age of each patient is calculated as of the date of the last data update.
- If an age exceeds 89, it will be obscured using a date shift of +/- 10 years.
- As of 7/15/22, July 1st is used as a placeholder date of birth when there are 0s or nulls in the OMOP person table to avoid biasing towards older age.
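The placeholder rule and the age calculation can be sketched as follows. This assumes the missing values are the month or day of birth (the OMOP person table stores year, month, and day of birth in separate fields) and that July 1st replaces both whenever either is 0 or null. The function names are hypothetical, and the obscuring of ages above 89 is not shown.

```python
from datetime import date
from typing import Optional


def placeholder_birth_date(year: int, month: Optional[int] = None,
                           day: Optional[int] = None) -> date:
    """Build a date of birth, using July 1st when month or day is 0 or null.

    Assumption: a mid-year placeholder is applied if either field is missing.
    """
    if not month or not day:
        return date(year, 7, 1)
    return date(year, month, day)


def age_on(birth_date: date, as_of: date) -> int:
    """Age in completed years on the as_of date (e.g. the last data update)."""
    years = as_of.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in as_of's year.
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


if __name__ == "__main__":
    dob = placeholder_birth_date(1980, 0, None)
    print(dob, age_on(dob, date(2022, 7, 15)))
```

A mid-year placeholder keeps the expected error in age near zero, whereas defaulting to January 1st would make every affected patient appear up to a year older.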
The race and ethnicity of patients are adjusted to standard categories based on self-reported fields within the EHR.
- Note that the EHRs do not always contain all of the information on race and ethnicity, and patients may not self-report a response that can be fully mapped into one of the standard categories. The patient would fall into the "Unknown" category in these cases.
The sex of patients is determined based on self-reported fields within the EHR.
- Note that the EHRs do not always contain all of the information on sex; if a patient's EHR does not contain data on their sex, they will fall into the "Unknown" category.
- If a patient records any response other than "Female" or "Male", they are mapped into the "Other" category.
General Enclave Limitations
- “Sicker” patients will likely be overrepresented within the N3C Data Enclave, as sicker patients will more often seek out and receive care at clinical centers.
- The N3C may have multiple contributors to data “missingness”. Clinical facts and events that occur in the real world may not be captured for reasons including:
  - The event was recorded at a clinical site that does not contribute data
  - Data is not yet linked across sites
  - Medical records are inherently incomplete
- Some of the external datasets that have been used for analysis cannot be fully mapped due to issues such as missing measurement units.
- All dates within the Enclave have been shifted by between -3 and 45 days to ensure that reidentification is not possible.
- N3C data may not be representative of the entire US population
- N3C does NOT have a representative sample of any state, as data is contributed from only a few providers in each region (a region includes multiple states).
- Cell sizes smaller than 20 people have been suppressed
- For COVID+ patients: A patient is only counted once in this data, even if they have multiple positive tests over time. Except in instances where dashboards focus on reinfection, only dates of first infection are utilized.
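Two of the rules above, counting each patient once by first infection and suppressing small cells, can be illustrated with a short sketch. The function names and data shapes are hypothetical, and representing a suppressed cell as None is a choice made for this sketch rather than the Enclave's documented behavior.

```python
from datetime import date

SUPPRESSION_THRESHOLD = 20  # cells smaller than this are suppressed


def first_infection_dates(positive_tests):
    """Keep only the earliest positive date per patient.

    positive_tests: iterable of (patient_id, test_date) pairs.
    """
    first = {}
    for patient_id, test_date in positive_tests:
        if patient_id not in first or test_date < first[patient_id]:
            first[patient_id] = test_date
    return first


def suppress_small_cells(counts, threshold=SUPPRESSION_THRESHOLD):
    """Replace any count below the threshold with None (suppressed)."""
    return {
        cell: (n if n >= threshold else None)
        for cell, n in counts.items()
    }


if __name__ == "__main__":
    tests = [
        ("p1", date(2022, 3, 4)),
        ("p1", date(2022, 1, 9)),   # earlier positive wins for p1
        ("p2", date(2022, 2, 2)),
    ]
    print(first_infection_dates(tests))
    print(suppress_small_cells({"Mild": 1250, "Mortality": 12}))
```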