Our paper is out in the Journal of Clinical Oncology: Clinical Cancer Informatics
Purpose: Digital pathology (DP), the digitization of tissue slides, is beginning to change the landscape of clinical diagnostic workflows and has engendered active research within the area of computational pathology. One of the challenges in DP is the presence of artifacts and batch effects, unintentionally introduced during both routine slide preparation (e.g., staining, tissue folding) and digitization (e.g., blurriness, variations in contrast and hue). Manual review of glass and digital slides is laborious, qualitative, and subject to intra- and inter-reader variability. There is thus a critical need for a reproducible, automated approach for precisely localizing artifacts in order to identify slides that need to be reproduced or regions that should be avoided during computational analysis.
Methods: Here we present HistoQC, a tool for rapidly performing quality control to not only identify and delineate artifacts but also discover cohort-level “outliers” (e.g., slides stained darker/lighter than other slides in the cohort). This open-source tool employs a combination of image metrics (e.g., color histograms, brightness, contrast), features (e.g., edge detectors), and supervised classifiers (e.g., pen detection) to identify artifact-free regions on digitized slides. These regions and metrics are presented to the user via an interactive graphical user interface, facilitating artifact detection through real-time visualization and filtering. The same metrics allow users to explicitly define acceptable tolerances for their workflows.
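To make the idea of cohort-level outliers and user-defined tolerances more concrete, here is a minimal sketch in plain Python. It is not HistoQC's actual code: the function names (`slide_metrics`, `cohort_outliers`, `passes_tolerances`), the toy pixel values, and the use of a median-based modified z-score are all assumptions chosen for illustration, and a real slide would of course contain far more than four pixels.

```python
from statistics import mean, median, pstdev


def slide_metrics(pixels):
    """Brightness (mean) and contrast (std. dev.) of grayscale values in [0, 255]."""
    return {"brightness": mean(pixels), "contrast": pstdev(pixels)}


def cohort_outliers(metrics_by_slide, key, threshold=3.5):
    """Flag slides whose metric lies far from the cohort median.

    Uses a modified z-score based on the median absolute deviation (MAD),
    an illustrative choice rather than HistoQC's own method.
    """
    values = [m[key] for m in metrics_by_slide.values()]
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread in the cohort, so nothing can be ranked as an outlier
    return [
        name
        for name, m in metrics_by_slide.items()
        if 0.6745 * abs(m[key] - med) / mad > threshold
    ]


def passes_tolerances(metrics, tolerances):
    """Check each metric against a user-defined (low, high) range."""
    return all(low <= metrics[key] <= high for key, (low, high) in tolerances.items())


# Hypothetical toy cohort: each "slide" is a handful of grayscale pixel values.
slides = {
    "slide_a": [200, 210, 190, 205],
    "slide_b": [198, 207, 195, 202],
    "slide_c": [203, 212, 188, 199],
    "slide_d": [201, 206, 193, 204],
    "slide_e": [90, 100, 80, 95],  # stained noticeably darker than the rest
}

metrics = {name: slide_metrics(px) for name, px in slides.items()}
flagged = cohort_outliers(metrics, "brightness")

# Hypothetical tolerance chosen by the user for their own workflow.
tolerances = {"brightness": (150, 230)}
accepted = [name for name, m in metrics.items() if passes_tolerances(m, tolerances)]

print("cohort outliers:", flagged)
print("within tolerance:", accepted)
```

In this toy cohort only `slide_e` is flagged, and it is also the only slide rejected by the brightness tolerance. The two checks answer different questions: the outlier test is relative to the rest of the cohort, while the tolerance is an absolute range the user picks.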
Results: HistoQC’s output on n=450 slides from The Cancer Genome Atlas (TCGA) was reviewed by two pathologists and found to be suitable for computational analysis over 95% of the time.
Conclusion: These results suggest that HistoQC could provide an automated, quantifiable, quality control process for identifying artifacts and measuring slide quality, in turn helping to improve both the repeatability and robustness of DP workflows.
Manuscript available here: HistoQC_w_supplemental
Code available here: HistoQC Github repo
Wiki available here: HistoQC Wiki
More in-depth tutorial to follow!