MDClone is a free, secure, self-service platform for building queries and downloading computationally derived (“synthetic”) data from the institute’s research data core (RDC). Because the data do not contain protected health information (PHI), their use is not classified as human subjects research.
Figure 1 from “Spot the Difference: Comparing Results of Analyses from Real Patient Data and Synthetic Derivatives” by Randi E. Foraker, Sean C. Yu, Aditi Gupta, et al., used under CC-BY (v4.0). doi.org/10.1093/jamiaopen/ooaa060
The Institute for Informatics (I2) implemented the MDClone platform in 2018 and conducted a landmark study validating that synthetic data produce the same results as the original data while maintaining data privacy. The institute’s significant effort and investment in securing this resource for our campus are part of a strategic goal to accelerate the pace of data-driven research at Washington University in St. Louis.
Our instance of MDClone is designed to meet the specific clinical and translational research needs of researchers at the School of Medicine.
The I2 scope of support includes:
- Online training and onboarding to the platform.
- Demonstration videos.
- Beginner and advanced classes via Bernard Becker Medical Library.
- Data brokerage services for downloading the original data (requires IRB approval).
MDClone data coverage includes:
- Inpatient and outpatient data from across BJC network facilities.
- Demographics, encounters, diagnoses, procedures, medication orders, lab results, vitals, allergies, surgeries, microbiology, social history, problem lists, practitioners, and insurance.
- Data from 1997 to present.
For a more detailed breakdown of the data available in MDClone, please review the MDClone Data Inventory.
I2 publications related to MDClone:
- Spot the Difference: Comparing Results of Analyses from Real Patient Data and Synthetic Derivatives. JAMIA Open
- The Use of Synthetic Electronic Health Record Data and Deep Learning to Improve Timing of High-Risk Heart Failure Surgical Intervention by Predicting Proximity to Catastrophic Decompensation. Frontiers in Digital Health
- Predicting Mortality among Patients with Liver Cirrhosis in Electronic Health Records with Machine Learning. PLoS One
- The National COVID Cohort Collaborative: Analyses of Original and Computationally Derived Electronic Health Record Data. Journal of Medical Internet Research
You are invited to Becker Library webinars and office hours to learn more about accessing synthetic healthcare data via MDClone. A recording of the “Introduction to MDClone” webinar is available in Box. We host MDClone virtual office hours on Tuesdays from 10 a.m. to noon, where you can ask specific questions or watch demonstrations of tips and tricks for overcoming common issues in MDClone. The office hours Zoom link is also available in Box. You will need your WUSTL Key and password to access the Box links. Contact Chris Sorensen at firstname.lastname@example.org with questions.
For more information, please contact us at email@example.com.