ORIGINAL ARTICLE
Year: 2021 | Volume: 12 | Issue: 1 | Page: 54
Stress testing pathology models with generated artifacts
Nicholas Chandler Wang1, Jeremy Kaplan1, Joonsang Lee1, Jeffrey Hodgin2, Aaron Udager2, Arvind Rao1
1 Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
2 Department of Pathology, University of Michigan Medical School, Ann Arbor, MI, USA
Correspondence Address:
Dr. Arvind Rao 100 Washtenaw Ave, Room 2305, Ann Arbor, MI 48109 USA
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/jpi.jpi_6_21
Background: Machine learning models offer significant opportunities to improve health care, but their “black-box” nature poses many risks.
Methods: We built a custom Python module as part of a framework for generating artifacts that are tunable and describable, so that they can be adapted to future testing needs. Using a variety of generated artifacts, we tested their effects on a previously published digital pathology classification model and an internally developed kidney tissue segmentation model. The artifacts simulated were bubbles, tissue folds, uneven illumination, marker lines, uneven sectioning, altered staining, and tissue tears.
Results: We found some performance degradation on tiles with artifacts, particularly with altered stains but also with marker lines, tissue folds, and uneven sectioning. We also found that the response of deep learning models to artifacts could be nonlinear.
Conclusions: Generated artifacts can be a useful tool for testing machine learning models and for building trust in them by revealing where these models might fail.
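To make the idea of a tunable artifact concrete, the following is a minimal sketch of a severity sweep. It is not the authors' module: the function names, the `severity` parameter, the linear illumination ramp, and the threshold-based `toy_model` are all assumptions introduced here for illustration, and a real test would substitute an actual pathology model and image tiles.

```python
from typing import List, Sequence, Tuple

# A grayscale tile as rows of pixel intensities in [0, 1] (illustrative only).
Tile = List[List[float]]


def add_uneven_illumination(tile: Tile, severity: float) -> Tile:
    """Return a copy of the tile darkened progressively from left to right.

    Hypothetical parameterization: severity=0 leaves the tile unchanged,
    severity=1 scales the right-most column to zero.
    """
    if not 0.0 <= severity <= 1.0:
        raise ValueError("severity must be in [0, 1]")
    width = len(tile[0])
    out: Tile = []
    for row in tile:
        new_row = []
        for x, value in enumerate(row):
            # Position across the tile: 0.0 at the left edge, 1.0 at the right.
            position = x / (width - 1) if width > 1 else 0.0
            # Linear gain ramp from 1.0 down to (1 - severity).
            new_row.append(value * (1.0 - severity * position))
        out.append(new_row)
    return out


def toy_model(tile: Tile) -> bool:
    """Stand-in classifier (an assumption): 'bright' if mean intensity > 0.5."""
    pixels = [value for row in tile for value in row]
    return sum(pixels) / len(pixels) > 0.5


def sweep_severity(tile: Tile, severities: Sequence[float]) -> List[Tuple[float, bool]]:
    """Apply the artifact at each severity and record the model's prediction."""
    return [(s, toy_model(add_uneven_illumination(tile, s))) for s in severities]


if __name__ == "__main__":
    clean_tile = [[0.8] * 5 for _ in range(5)]
    for severity, prediction in sweep_severity(clean_tile, [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]):
        print(f"severity={severity:.1f} -> prediction={prediction}")
```

In this toy setting the prediction holds steady over the lower severities and then flips abruptly, which is one simple way a model's response to a smoothly varying artifact can be nonlinear. Sweeping a single describable parameter in this way also makes it possible to report the severity at which a model starts to fail.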