In adding new features to HistoQC, I stumbled upon a very interesting insight that I thought I would take a moment to share: the amount of noise and artifacts in digital pathology (DP) whole slide images (WSI) is far more extensive than I had previously thought.
When looking at a WSI, our brains do a lot of work for us in terms of post-processing the signal which reaches our eyes into the representation of the WSI we have in our consciousness. When building machine learning (and in particular, deep learning) algorithms, it is important to realize that this post-processing work is not done for us automatically.
Let’s make this conversation a bit more concrete. If I were to show you this image, it would appear to be a typical immunohistochemically stained image, with hematoxylin and DAB staining for CD20. The background likely appears fairly homogeneous to the eye, with no extreme distortions or obvious anomalies.
On the other hand, if we take this same image, convert it from RGB to grayscale, and then apply a very rudimentary histogram equalization approach (available in skimage), which enhances the contrast in the image to make small details more apparent, we see a strikingly different story:
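For reference, this grayscale-plus-equalization step takes only a few lines with skimage. A minimal sketch, where the random array stands in for an actual WSI tile loaded from disk (skimage also offers `equalize_adapthist`, a contrast-limited adaptive variant, if the global version over-amplifies noise):

```python
import numpy as np
from skimage import color, exposure

# Stand-in for an RGB WSI tile; in practice, load with skimage.io.imread(...)
rng = np.random.default_rng(0)
img = rng.random((64, 64, 3))

# Convert RGB to grayscale (float values in [0, 1])
gray = color.rgb2gray(img)

# Global histogram equalization: spreads intensities to maximize contrast,
# making subtle background variation (tiling seams, dust) visible
equalized = exposure.equalize_hist(gray)
```

Note that equalization is monotonic in pixel rank, so it only redistributes intensities already present in the image, which is why it reveals rather than fabricates structure.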
Importantly, note that the algorithm did not “fabricate” this data; it simply enhanced data that is already present in the image. We can now more readily see the tiling artifact produced by the scanning process, as well as coverslip artifacts and dust on the slide.
For example, when looking closer at this region circled in red, we can see that the processed image is actually enhancing a spot which is barely visible on the original image:
These issues appear to be present everywhere, including, for example, some of my favorite images from the TCGA-BRCA cohort. Here is TCGA-BH-A0HO-01Z-00-DX1.D3D66547-F5D4-40F5-B737-2FECEEB35ACB.svs
There are two points I would like to make here:
- Although it may be difficult for us to visually perceive these anomalies, both because of the low contrast of the signal and because of our mental post-processing, this level of detail is much closer to what our machine learning classifiers are “seeing” and attempting to reconcile.
- Although the anomalies are significantly more visible in the background region of the slide, they are equally present in the tissue regions, where they are simply harder to see because of the overlying tissue.
Questions resulting from these points then become:
- How much do these subtle anomalies impact the performance of our algorithms?
- Is understanding how and why they are created important for experimental design, particularly in the context of batch effect detection?
- Can and should we make efforts to compensate for them?
These are questions I will continue to think about, especially when looking at unexpected output from our deep learning classifiers. Now I will begin to wonder more deeply what I’m not seeing which may ultimately be driving the result being presented to me.
HistoQC v2.1 can now produce these files for your own review. Simply enable the “LightDarkModule.saveEqualisedImage” step in your pipeline configuration!
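As a rough sketch, HistoQC pipelines are driven by an ini-style config file whose `[pipeline]` section lists the steps to run; the surrounding step names below are illustrative, and the exact layout may differ between versions, so verify against the example configs shipped with your install:

```ini
; hypothetical config.ini excerpt -- check against your HistoQC version
[pipeline]
steps = BasicModule.getBasicStats
    LightDarkModule.saveEqualisedImage
```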
I would be very happy and interested to hear any thoughts you may have!