Synthetic Data Workstream

Mission

Derive synthetic data from the harmonized electronic health record (EHR) data, and use a variety of metrics to compare and validate the quality of the generated synthetic data. Access to synthetic data requires an N3C Data Enclave account, a signed Data Use Agreement (DUA), and submission of a data use request (DUR). The dataset is open to the broad research community, including domestic and international investigators and citizen scientists.

View workstream details in the GitHub Repository.

Register for Meetings
Tuesdays 2:30 pm PT/5:30 pm ET

Connect with Us:

  1. Onboard to N3C using the link below.
    • During onboarding, you will provide your email address, which we will add to the CD2H workspace.
  2. Go to our workstream Slack channel directly using the link provided below.
    • Log in with your Slack credentials.

FAQs

The N3C platform produces synthetic data from the limited dataset (LDS) that a site submits. Comparisons between source limited data and ensuing synthetic data are an essential component of the data quality assurance, verification, and validation processes used by N3C. Therefore, sites are required to submit an LDS to N3C in order to create a synthetic dataset. (See the NCATS webpage for more details on the levels of data access.)
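As an illustration of what a source-versus-synthetic comparison can look like, the sketch below computes two common similarity measures: a two-sample Kolmogorov-Smirnov statistic for a numeric field and a total variation distance for a categorical field. This is an assumption-laden example, not the method N3C uses; the function names, field names, and toy values are all hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def ks_statistic(source, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    empirical CDFs of the two samples (0 = identical, 1 = fully separated)."""
    a, b = sorted(source), sorted(synthetic)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def total_variation(source, synthetic):
    """Total variation distance between category proportions
    (0 = same proportions, 1 = no shared categories)."""
    if not source or not synthetic:
        raise ValueError("both samples must be non-empty")
    p, q = Counter(source), Counter(synthetic)
    categories = set(p) | set(q)
    return 0.5 * sum(
        abs(p[k] / len(source) - q[k] / len(synthetic)) for k in categories
    )


# Hypothetical toy values for illustration only; not N3C data.
source_ages = [34, 51, 47, 62, 29, 58]
synthetic_ages = [36, 49, 45, 65, 31, 55]
source_sex = ["F", "M", "F", "F", "M", "M"]
synthetic_sex = ["F", "M", "M", "F", "M", "M"]

print(f"KS statistic (age): {ks_statistic(source_ages, synthetic_ages):.3f}")
print(f"Total variation (sex): {total_variation(source_sex, synthetic_sex):.3f}")
```

Values near 0 suggest the synthetic field tracks the source field closely. A real validation would likely combine several such metrics across many fields and also assess privacy risk, which this sketch does not address.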

Leadership and Administration

Philip Payne
Washington University in St. Louis
Workstream Lead

Atul Butte
University of California, San Francisco
Workstream Lead

Tom Dillon
Washington University in St. Louis
Project Manager

Andrew Neumann, BA
Oregon State University
Project Manager

Synthetic Data Task Teams

Group Name | Mailing List | Mailing List Address | Drive/Notes Link
Synthetic Data Privacy | Join | n3c-tt-synthetic-privacy@googlegroups.com | Drive
Synthetic Data Validation | Join | n3c-tt-synthetic-validation@googlegroups.com | Drive
Synthetic Platform Architecture | Join | n3c-tt-synthetic-platform@googlegroups.com |