MDClone (Synthetic Data Platform)

MDClone is a free, secure, self-service platform for building queries and downloading computationally derived (“synthetic”) data from the institute’s research data core (RDC). Since the data do not contain protected health information (PHI), their use is not classified as human subject research.

Request Help or Services

Figure 1 from Spot the Difference: Comparing Results of Analyses from Real Patient Data and Synthetic Derivatives by Randi E Foraker, Sean C Yu, Aditi Gupta, et al, used under CC-BY (v4.0).

The Institute for Informatics (I2) implemented the MDClone platform in 2018 and conducted a landmark study validating that synthetic data produce the same results as the original data while maintaining data privacy. The institute’s significant effort and investment in securing this resource for our campus is part of a strategic goal to accelerate the pace of data-driven research at Washington University in St. Louis.

Our instance of MDClone is designed to meet the specific clinical and translational research needs of researchers at the School of Medicine.

MDClone Login

We host MDClone virtual office hours on Tuesdays from 10 a.m. to noon. The office hours Zoom link can be found here. Contact Chris Sorensen with questions.

The I2 scope of support includes:

MDClone data coverage includes:

  • Inpatient and outpatient data from across BJC network facilities.
  • Demographics, encounters, diagnosis, procedure, medication order, lab results, vitals, allergies, surgery, microbiology, social history, problem list, practitioner, and insurance.
  • Data from 1997 to present.

For a more detailed breakdown of the data available in MDClone, please review the MDClone Data Inventory.

I2 publications related to MDClone:

For more information, please contact us.