MDClone (Synthetic Data Platform)

MDClone is a free, secure, self-service platform for building queries and downloading computationally derived (“synthetic”) data from the institute’s research data core (RDC). Because the data contain no protected health information (PHI), their use is not classified as human subjects research.


Figure 1 from “Spot the Difference: Comparing Results of Analyses from Real Patient Data and Synthetic Derivatives” by Randi E. Foraker, Sean C. Yu, Aditi Gupta, et al., used under CC-BY (v4.0). doi.org/10.1093/jamiaopen/ooaa060

The Institute for Informatics (I2) implemented the MDClone platform in 2018 and conducted a landmark study validating that analyses of synthetic data produce the same results as analyses of the original data while maintaining data privacy. The institute’s significant effort and investment in securing this resource for our campus are part of a strategic goal to accelerate the pace of data-driven research at Washington University in St. Louis.

Our instance of MDClone is designed to meet the specific clinical and translational research needs of School of Medicine researchers.

The I2 scope of support includes:

MDClone data coverage includes:

  • Inpatient and outpatient data from across BJC network facilities.
  • Demographics, encounters, diagnoses, procedures, medication orders, lab results, vitals, allergies, surgeries, microbiology, social history, problem lists, practitioners, and insurance.
  • Data from 1997 to present.

I2 publications related to MDClone:

For more information, please contact us at i2help@wustl.edu.