Demographics

Explore demographics and COVID factors associated with patients within the Enclave. Gain a general overview of all patients or target a specific cohort by associated comorbidities, vaccination/COVID status, etc.

Total Patients in Enclave

22.85M

Data as of Aug 08, 2024 (v184)

Total Patients in View

22.83M

Charts in this view: Severity, Age, Race, Sex, Ethnicity, COVID Status, Long COVID Status, Mortality Status, and Vaccination Status of N3C Patients.

Sample: All patients in the N3C Data Enclave. For additional information, see limitations below.

This dashboard contains data both on all patients and only COVID+ patients within the N3C Data Enclave.

 

A COVID-positive patient is defined as any patient having one of the following within their EHR records:

  1. Laboratory confirmed positive COVID-19 PCR or Antigen test
  2. Laboratory confirmed positive COVID-19 Antibody test
  3. Medical visit in which the ICD-10 code for COVID-19 (U07.1) was recorded
    • Patients identified by diagnosis code alone have no record of a positive PCR/Antigen or Antibody test within their EHR; they were diagnosed with COVID-19 based on the symptoms they displayed.
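
The three criteria above can be sketched as a simple check. This is a minimal illustration, not the Enclave's actual logic: the `PatientRecord` class and its field names are hypothetical, and the order in which criteria are tested is an assumption made only so that diagnosis-only patients can be told apart.

```python
from dataclasses import dataclass

U07_1 = "U07.1"  # ICD-10 code for COVID-19, as stated in the definition


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's EHR flags."""
    positive_pcr_or_antigen: bool = False
    positive_antibody: bool = False
    icd10_codes: frozenset = frozenset()


def covid_positive_basis(p: PatientRecord):
    """Return the criterion that qualifies the patient, or None if none does."""
    if p.positive_pcr_or_antigen:
        return "pcr_or_antigen"
    if p.positive_antibody:
        return "antibody"
    if U07_1 in p.icd10_codes:
        # No positive lab test on record: diagnosed from symptoms only.
        return "diagnosis_only"
    return None


def is_covid_positive(p: PatientRecord) -> bool:
    """A patient is COVID-positive if any one of the three criteria holds."""
    return covid_positive_basis(p) is not None
```

A patient whose only evidence is the U07.1 code is still counted as COVID-positive, but falls into the diagnosis-only group.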

A Long COVID patient is defined as any patient having the ICD-10 code for PASC (U09.9) within their EHR.

  • Note: The ICD-10 code for PASC (U09.9) was not created until October 1, 2021, so any data using this code is limited to dates after that. Therefore, this data does not fully represent patients diagnosed with Long COVID.

Severity of COVID-19 is a calculation based on multiple events recorded in a patient's EHR during their medical visit. The severity score for each patient may be inaccurate due to missing information within the EHR. Patients will only be graded on Severity if they have a laboratory-confirmed positive PCR or Antigen test. Below are the definitions of each Severity Category.

  • Mild - The patient has no record of Emergency Room visits or hospitalization for COVID-19
  • ED Visit (not admitted) - The patient had an Emergency Room (ER) visit for COVID-19, but we have no record of hospitalization (Inpatient) for COVID-19
  • Moderate Hospitalized - The patient was hospitalized (Inpatient visit) for COVID-19 AND did not receive ECMO OR Invasive Ventilation
  • Mortality - The patient’s records show a date of death
  • Severe Ventilation/ECMO/AKI - The patient was hospitalized for COVID-19 AND received extracorporeal membrane oxygenation (ECMO) OR received Invasive Ventilation
  • Unavailable - Patients who did not have a lab-confirmed positive PCR or Antigen COVID test. This includes patients who were diagnosed only based on the symptoms they displayed or patients who do not have any recorded COVID-19 diagnosis within the Enclave.
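
The category definitions above can be read as a decision rule. The sketch below is one possible reading, under stated assumptions: the `CovidEvents` fields are hypothetical, and the precedence (Mortality first, then Severe, Moderate, ED Visit, Mild) is assumed, since the definitions do not say which category wins when several apply. The AKI component of the Severe label is not defined above and is not modeled.

```python
from dataclasses import dataclass


@dataclass
class CovidEvents:
    """Hypothetical per-patient flags; field names are illustrative only."""
    lab_confirmed_pcr_or_antigen: bool = False
    has_death_date: bool = False
    hospitalized_for_covid: bool = False
    ecmo_or_invasive_ventilation: bool = False
    ed_visit_for_covid: bool = False


def severity_category(e: CovidEvents) -> str:
    # Only patients with a lab-confirmed positive PCR or Antigen test are graded.
    if not e.lab_confirmed_pcr_or_antigen:
        return "Unavailable"
    # Assumed precedence: the most severe recorded outcome wins.
    if e.has_death_date:
        return "Mortality"
    if e.hospitalized_for_covid and e.ecmo_or_invasive_ventilation:
        return "Severe Ventilation/ECMO/AKI"
    if e.hospitalized_for_covid:
        return "Moderate Hospitalized"
    if e.ed_visit_for_covid:
        return "ED Visit (not admitted)"
    return "Mild"
```

Because missing EHR events default to `False` here, an incomplete record is graded as less severe than it may have been, which mirrors the caveat that severity may be inaccurate when information is missing.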

Comorbidities for each patient are linked to EHR medical visits coded for any of the 17 different conditions defined by the Charlson Comorbidity Index. A patient may have undiagnosed conditions that would not be recorded in their EHR and, therefore, would not be represented here. Additionally, a patient may have a CCI condition for which they have not required a medical visit, which would exclude them from representation.

The age of each patient is calculated as of the date of the last data update.

  • If an age exceeds 89, it will be obscured using a date shift of +/- 10 years.
  • As of 7/15/22, July 1st is used as a placeholder date of birth when there are 0s or nulls in the OMOP person table to avoid biasing towards older age.
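
As a rough illustration of the placeholder rule, the sketch below computes an age from possibly incomplete birth fields. It assumes the 0s or nulls occur in the birth month and day (the text does not name the fields), and that July 1st replaces both whenever either is missing. The function names are hypothetical, and the obscuring of ages above 89 is not modeled.

```python
from datetime import date


def placeholder_birth_date(year, month=None, day=None):
    """Build a date of birth, using July 1st when month or day is 0 or null."""
    if not month or not day:
        # Mid-year placeholder; January 1st would bias ages upward.
        return date(year, 7, 1)
    return date(year, month, day)


def age_on(birth, as_of):
    """Whole years elapsed between birth and the as_of date."""
    had_birthday = (as_of.month, as_of.day) >= (birth.month, birth.day)
    return as_of.year - birth.year - (0 if had_birthday else 1)
```

For example, with an as-of date of August 8, 2024, `age_on(placeholder_birth_date(1980), date(2024, 8, 8))` gives 44, while a fully recorded birth date of December 15, 1980 gives 43.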

The race and ethnicity of patients are adjusted to standard categories based on self-reported fields within the EHR.

  • Note that the EHRs do not always contain all of the information on race and ethnicity, and patients may not self-report a response that can be fully mapped into one of the standard categories. The patient would fall into the "Unknown" category in these cases.

The sex of patients is determined based on self-reported fields within the EHR.

  • Note that the EHRs do not always contain all of the information on sex; if a patient's EHR does not contain data on their sex, they will fall into the "Unknown" category.
  • If a patient records any response other than "Female" or "Male", they are mapped into the "Other" category.

Vaccination data in the N3C is sparse and represents only EHR-recorded vaccination events at our data partners. The absence of a vaccination record does not mean that a patient is unvaccinated. If a patient were vaccinated at their local pharmacy, doctor's office, or state/federal vaccination site, they would not be represented because these systems do not automatically link to a patient's EHR.

Given the known national vaccination rates, it is likely that many, if not most, vaccination events are occurring outside of the academic health systems submitting data. Therefore, patients shown here as "Unknown" may be vaccinated; however, we do not have the records to verify this.

A patient counted as vaccinated here is not necessarily fully vaccinated. We consider a patient vaccinated if they have at least one recorded dose of the Pfizer, Moderna, or Johnson & Johnson COVID-19 vaccine. Given that Pfizer and Moderna require two vaccine doses to be considered fully vaccinated, patients shown here may be partially vaccinated. The same limitation applies to booster shots, as we do not consider shots beyond the first one recorded within the patient's EHR.
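
The counting rule described above can be sketched as follows. The input format (a list of manufacturer names taken from EHR dose records) and the function name are assumptions for illustration only.

```python
COUNTED_VACCINES = {"Pfizer", "Moderna", "Johnson & Johnson"}


def vaccination_status(recorded_doses):
    """Classify a patient from the manufacturer names of EHR-recorded doses.

    One recorded dose is enough to count as vaccinated; later doses and
    boosters do not change the result. No record means "Unknown", not
    unvaccinated.
    """
    if any(m in COUNTED_VACCINES for m in recorded_doses):
        return "Vaccinated"
    return "Unknown"
```

A patient with a single Moderna dose and a patient with three Pfizer doses both return "Vaccinated", which is why the counts cannot distinguish partial from full vaccination.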

Mortality is defined as:

  • Any patient with a date of death in the Enclave
  • (or) Any patient from a mortality-linked PPRL site who exists in one of the external sources:
    • Government Mortality: Government data sourced from death certificates and person-reporting.
    • ObituaryData.com: Obituary data sourced from funeral homes, newspapers, and other online obituary sources, specifically from obituarydata.com (a private obituary aggregator).
    • Private Obituary: Obituary data from funeral homes, newspapers, and other online obituary sources, obtained through other private sources.

For external mortality sources:

  • Several mortality sources do not know the exact date of death for all reported deaths. If they know only the month of death, they report the date of death as the first day of that month; if they know only the year of death, they report it as the first day of that year. As a result, an inflated number of deaths appears on those dates.
  • Each source has a distinct lag associated with its reported deaths. However, on average, 90% of the deaths that will eventually be reported appear in the data within 28 days of occurrence.
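
The effect of partial dates can be illustrated with a small sketch. The function below is hypothetical and only mimics the reporting convention described above; it is not code from any mortality source.

```python
from collections import Counter
from datetime import date


def reported_death_date(year, month=None, day=None):
    """Mimic how a source reports a death when only part of the date is known."""
    if month is None:
        return date(year, 1, 1)      # only the year is known
    if day is None:
        return date(year, month, 1)  # only the month is known
    return date(year, month, day)


# Three deaths in March 2022 with different levels of date precision.
deaths = [
    reported_death_date(2022, 3, 17),
    reported_death_date(2022, 3),
    reported_death_date(2022),
]
by_date = Counter(deaths)
```

Here `by_date` holds one death each on 2022-03-17, 2022-03-01, and 2022-01-01, even though all three may have occurred mid-March. Over many records, this piles deaths onto the first day of each month and year.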

Mortality data should not be considered representative of all deaths in the United States.

Note: This metric is distinct from the Mortality category associated with Severity, as it does not limit deaths to only those suspected to be caused by COVID-19.

General Enclave Limitations

  • “Sicker” patients will likely be overrepresented within the N3C Data Enclave, as sicker patients will more often seek out and receive care at clinical centers.
  • The N3C may have multiple contributors to data “missingness”. Clinical facts and events that occur in the real world may not be captured for reasons including:
    • The event was recorded at a clinical site that does not contribute data
    • Data is not yet linked across sites
    • Medical records are inherently incomplete
  • Some of the external datasets that have been used for analysis cannot be fully mapped due to issues such as missing measurement units.
  • All dates within the Enclave have been shifted by between -3 and 45 days to ensure that reidentification is not possible.
  • N3C data may not be representative of the entire US population
    • N3C does NOT have a representative sample of any state, as data is contributed from only a few providers in each region (a region includes multiple states).
  • Cell sizes smaller than 20 people have been suppressed
  • For COVID+ patients: A patient is only counted once in this data, even if they have multiple positive tests over time. Except in instances where dashboards focus on reinfection, only dates of first infection are utilized.
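
The small-cell rule in the list above can be sketched as a filter over aggregated counts. This is a minimal illustration: the function name, the use of `None` as the suppression marker, and the treatment of zero counts (suppressed like any other count below 20) are all assumptions.

```python
MIN_CELL_SIZE = 20  # cells smaller than this are suppressed


def suppress_small_cells(counts, threshold=MIN_CELL_SIZE):
    """Return a copy of counts with values below the threshold replaced by None."""
    return {
        category: (n if n >= threshold else None)
        for category, n in counts.items()
    }
```

For instance, `suppress_small_cells({"A": 150, "B": 20, "C": 19})` keeps the counts for "A" and "B" and replaces the count for "C" with `None`.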