Journal of Pathology Informatics


RESEARCH ARTICLE
Year : 2022  |  Volume : 13  |  Issue : 1  |  Page : 10

Prediction of tuberculosis using an automated machine learning platform for models trained on synthetic data


1 Department of Pathology and Laboratory Medicine, University of California, Davis, School of Medicine, Sacramento, California, United States of America
2 Amazon Web Services, Seattle, Washington, United States of America
3 UC Davis Health, Sacramento, California, United States of America

Correspondence Address:
Dr. Hooman H Rashidi
Dept. of Pathology and Laboratory Medicine, University of California Davis, 4400 V St., Sacramento 95817.
United States of America

Source of Support: None, Conflict of Interest: None


DOI: 10.4103/jpi.jpi_75_21


High-quality medical data is critical to the development and implementation of machine learning (ML) algorithms in healthcare; however, security and privacy concerns continue to limit access. We sought to determine the utility of “synthetic data” in training ML algorithms for the detection of tuberculosis (TB) from inflammatory biomarker profiles. A retrospective dataset (A) comprising 278 patients was used to generate synthetic datasets (B, C, and D) for training models prior to secondary validation on a generalization dataset. ML models trained and validated on Dataset A (real) demonstrated an accuracy of 90%, a sensitivity of 89% (95% CI, 83–94%), and a specificity of 100% (95% CI, 81–100%). Models trained using the optimal synthetic dataset B showed an accuracy of 91%, a sensitivity of 93% (95% CI, 87–96%), and a specificity of 77% (95% CI, 50–93%). Models trained on synthetic datasets C and D performed worse (accuracies of 71% and 54%, respectively). This pilot study highlights the promise of synthetic data as an expedited means for ML algorithm development.
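
For readers unfamiliar with the reported measures, the following minimal sketch shows how accuracy, sensitivity, and specificity with approximate 95% confidence intervals can be derived from a confusion matrix. It is an illustration only: the abstract does not state which interval method the authors used, so the Wilson score interval here is an assumption, and the function names and the counts in the usage example are hypothetical and are not taken from the study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 gives an approximate 95% interval. This method is an
    assumption for illustration; the abstract does not name the
    interval method used by the authors.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)


def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical counts chosen for illustration, NOT the study's data.
    metrics = diagnostic_metrics(tp=90, fn=10, tn=18, fp=2)
    for name, value in metrics.items():
        print(name, value)
```

Intervals for specificity are typically much wider than those for sensitivity when the number of disease-negative cases is small, which is consistent with the wide specificity intervals reported above.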

