Substance Use

Explore the demographics associated with several categories of substance use (smoking, alcohol, opioids, and cannabis) within the Enclave and identify clusters of related substances.

Data as of Oct 10, 2024 (v185)

Sample: All patients in the N3C Data Enclave who have one or more of the following indicated in their EHR: (1) alcohol-related condition, (2) opioid use, (3) smoking status, (4) cannabis use. Patient EHR records are not always complete. For additional information, see limitations below.


A COVID-positive patient is defined as any patient having one of the following within their EHR records:

  1. Laboratory confirmed positive COVID-19 PCR or Antigen test
  2. Laboratory confirmed positive COVID-19 Antibody test
  3. Medical visit in which the ICD-10 code for COVID-19 (U07.1) was recorded
    • Patients identified only by a condition diagnosis have no record of a positive PCR/Antigen or Antibody test within their EHR; they were diagnosed with COVID-19 based on the symptoms they displayed.
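
The three criteria above can be sketched as a simple check. This is a minimal illustration, not the Enclave's actual logic: the `PatientRecord` class and its field names are hypothetical and stand in for whatever flags are derived from the EHR.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    # Hypothetical flags; these are not real Enclave column names.
    positive_pcr_or_antigen: bool = False
    positive_antibody: bool = False
    has_u071_diagnosis: bool = False  # ICD-10 code U07.1 recorded at a visit


def is_covid_positive(p: PatientRecord) -> bool:
    """True if the patient meets any one of the three listed criteria."""
    return p.positive_pcr_or_antigen or p.positive_antibody or p.has_u071_diagnosis


def is_diagnosis_only(p: PatientRecord) -> bool:
    """True if the patient has a U07.1 diagnosis but no positive test on record."""
    return p.has_u071_diagnosis and not (
        p.positive_pcr_or_antigen or p.positive_antibody
    )
```

A patient with only `has_u071_diagnosis=True` counts as COVID-positive here and falls into the diagnosis-only group.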

Severity of COVID-19 is a calculation based on multiple events recorded in a patient's EHR during their medical visit. The severity score for each patient may be inaccurate due to missing information within the EHR. Patients will only be graded on Severity if they have a laboratory-confirmed positive PCR or Antigen test. Below are the definitions of each Severity Category.

  • Mild - The patient has no record of Emergency Room visits or hospitalization for COVID-19
  • ED Visit (not admitted) - The patient had an Emergency Room (ER) visit for COVID-19, but we have no record of hospitalization (Inpatient) for COVID-19
  • Moderate Hospitalized - The patient was hospitalized (Inpatient visit) for COVID-19 AND did not receive ECMO OR Invasive Ventilation
  • Mortality - The patient’s records show a date of death
  • Severe Ventilation/ECMO/AKI - The patient was hospitalized for COVID-19 AND received Extracorporeal Membrane Oxygenation (ECMO) OR received Invasive Ventilation
  • Unavailable - Patients who did not have a lab-confirmed positive PCR or Antigen COVID test. This includes patients who were diagnosed only based on the symptoms they displayed or patients who do not have any recorded COVID-19 diagnosis within the Enclave.
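
As a rough sketch, the category definitions above could be applied as follows. The order in which overlapping categories are resolved (for example, a hospitalized patient who also has a date of death) is not stated on this page, so the precedence used here is an assumption, as are the `SeverityInputs` class and its field names.

```python
from dataclasses import dataclass


@dataclass
class SeverityInputs:
    # Hypothetical per-patient flags; not real Enclave column names.
    lab_confirmed_pcr_or_antigen: bool
    has_death_date: bool = False
    hospitalized: bool = False            # inpatient visit for COVID-19
    ecmo_or_invasive_ventilation: bool = False
    ed_visit: bool = False                # emergency visit for COVID-19


def severity_category(e: SeverityInputs) -> str:
    # Only patients with a lab-confirmed PCR or Antigen test are graded.
    if not e.lab_confirmed_pcr_or_antigen:
        return "Unavailable"
    # ASSUMPTION: most severe outcome wins; the page does not state precedence.
    if e.has_death_date:
        return "Mortality"
    if e.hospitalized and e.ecmo_or_invasive_ventilation:
        return "Severe Ventilation/ECMO/AKI"
    if e.hospitalized:
        return "Moderate Hospitalized"
    if e.ed_visit:
        return "ED Visit (not admitted)"
    return "Mild"
```

Because missing EHR events simply leave flags unset, a patient with incomplete records would be graded as less severe than they actually were, which is the inaccuracy the paragraph above warns about.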

The age of each patient is calculated as of the date of the last data update.

  • If an age exceeds 89, it will be obscured using a date shift of +/- 10 years.
  • As of 7/15/22, July 1st is used as a placeholder date of birth when there are 0s or nulls in the OMOP person table to avoid biasing towards older age.
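
The placeholder rule can be illustrated with a short sketch. The function names are hypothetical, and treating a missing month or a missing day alike (both fall back to July 1st) is an assumption; the obscuring of ages above 89 is not modeled here.

```python
from datetime import date
from typing import Optional


def birth_date(year: int, month: Optional[int], day: Optional[int]) -> date:
    """Build a date of birth, using July 1st when month or day is 0 or null."""
    # ASSUMPTION: if either part is missing, the whole month/day is replaced.
    if not month or not day:
        return date(year, 7, 1)
    return date(year, month, day)


def age_on(dob: date, as_of: date) -> int:
    """Age in completed years as of the given date (e.g. the last data update)."""
    years = as_of.year - dob.year
    # Subtract one year if the birthday has not yet occurred in as_of's year.
    if (as_of.month, as_of.day) < (dob.month, dob.day):
        years -= 1
    return years
```

A mid-year placeholder keeps the expected error symmetric: defaulting to January 1st would make every such patient appear as old as possible for their birth year.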

The race and ethnicity of patients are adjusted to standard categories based on self-reported fields within the EHR.

  • Note that the EHRs do not always contain all of the information on race and ethnicity, and patients may not self-report a response that can be fully mapped into one of the standard categories. The patient would fall into the "Unknown" category in these cases.

The sex of patients is determined based on self-reported fields within the EHR.

  • Note that the EHRs do not always contain all of the information on sex; if a patient's EHR does not contain data on their sex, they will fall into the "Unknown" category.
  • If a patient records any response other than "Female" or "Male" they would be mapped into the "Other" category.
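
A minimal sketch of this mapping is shown below. The function name is hypothetical, and the whitespace trimming and case-insensitive matching are assumptions added for illustration.

```python
from typing import Optional


def sex_category(raw: Optional[str]) -> str:
    """Map a self-reported EHR value to Female, Male, Other, or Unknown."""
    # No data recorded in the EHR: Unknown.
    if raw is None or not raw.strip():
        return "Unknown"
    value = raw.strip().capitalize()  # ASSUMPTION: matching ignores case
    if value in ("Female", "Male"):
        return value
    # Any recorded response other than Female or Male: Other.
    return "Other"
```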

Mortality is defined as:

  • Any patient with a date of death in the Enclave
  • (or) Any patient from a mortality-linked PPRL (privacy-preserving record linkage) site who appears in one of the following external sources:
    • Government Mortality: Government data sourced from death certificates and person-reporting.
    • ObituaryData.com: Obituary data sourced from funeral homes, newspapers, and other online obituary sources, specifically from obituarydata.com (a private obituary aggregator).
    • Private Obituary: Obituary data from funeral homes, newspapers, and other online obituary sources, obtained through other private sources.

For external mortality sources:

  • Several mortality sources do not know the exact date of death for all reported deaths. If they know only the month of death, they will provide the date of death as the first day of that month, or if they know only the year of death, they will provide the date of death as the first day of that year. This means that an increased number of deaths will appear at those intervals.
  • Each source has its own reporting lag. On average, however, 90% of the deaths that will eventually be reported appear in the data within 28 days of the death.
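
Because month-only and year-only deaths are reported on the first day of the period, an analysis may want to flag those dates before plotting deaths over time. The helper below is a hypothetical sketch; it cannot tell a placeholder from a death that genuinely occurred on the first of the month, so it only returns a hint.

```python
from datetime import date


def death_date_precision_hint(d: date) -> str:
    """Flag dates that may be placeholders for a month-only or year-only report."""
    if d.month == 1 and d.day == 1:
        # First day of the year: could be a year-only or month-only placeholder.
        return "possibly year-only"
    if d.day == 1:
        return "possibly month-only"
    return "likely exact"
```

For example, `death_date_precision_hint(date(2022, 3, 1))` returns `"possibly month-only"`.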

Mortality data should not be considered representative of all deaths in the United States.

Note: This metric is distinct from the Mortality category associated with Severity, as it does not limit deaths to only those suspected to be caused by COVID-19.

Smoking status for this patient population was obtained from EHRs; however, patient EHRs are not always complete. Therefore, there is no guarantee that the reported smoking status of a patient is accurate. The following definitions were applied:

  • Never - The patient's EHR recorded "Never smoked tobacco"
  • Current or Former - The patient's EHR recorded any past or current smoking status
  • Unknown - The patient's EHR recorded "unknown" smoking status

General Enclave Limitations

  • “Sicker” patients will likely be overrepresented within the N3C Data Enclave, as sicker patients will more often seek out and receive care at clinical centers.
  • The N3C may have multiple contributors to data “missingness”. Clinical facts and events that occur in the real world may not be captured for reasons including:
    • The event was recorded at a clinical site that does not contribute data
    • Data is not yet linked across sites
    • Medical records are inherently incomplete
  • Some of the external datasets that have been used for analysis cannot be fully mapped due to issues such as missing measurement units.
  • All dates within the Enclave have been shifted by between -3 and 45 days to ensure that reidentification is not possible.
  • N3C data may not be representative of the entire US population.
    • N3C does NOT have a representative sample of any state, as data is contributed by only a few providers in each region (a region includes multiple states).
  • Cell sizes smaller than 20 people have been suppressed.
  • For COVID+ patients: A patient is only counted once in this data, even if they have multiple positive tests over time. Except in instances where dashboards focus on reinfection, only dates of first infection are utilized.
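
The small-cell suppression rule in the list above can be sketched as follows. The function name and the use of `None` to mark a suppressed cell are assumptions for illustration; the page does not say how suppressed cells are displayed or whether zero counts are treated differently.

```python
from typing import Dict, Optional

SUPPRESSION_THRESHOLD = 20  # cells smaller than this are suppressed


def suppress_small_cells(counts: Dict[str, int]) -> Dict[str, Optional[int]]:
    """Replace any count below the threshold with None (suppressed)."""
    # ASSUMPTION: a count of exactly 20 is kept; zero is suppressed like any
    # other small count.
    return {
        group: (n if n >= SUPPRESSION_THRESHOLD else None)
        for group, n in counts.items()
    }
```

For example, `suppress_small_cells({"Never": 1500, "Unknown": 12})` keeps the first count and returns `None` for the second.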