Our paper is out in: Journal of Clinical Oncology: Clinical Cancer Informatics
Purpose: Digital pathology (DP), referring to the digitization of tissue slides, is beginning to change the landscape of clinical diagnostic workflows and has engendered active research within the area of computational pathology. One of the challenges in DP is the presence of artifacts and batch effects, unintentionally introduced during both routine slide preparation (e.g., staining, tissue folding) and digitization (e.g., blurriness, variations in contrast and hue). Manual review of glass and digital slides is laborious, qualitative, and subject to intra/inter-reader variability. There is thus a critical need for a reproducible automated approach for precisely localizing artifacts in order to identify slides that need to be reproduced or regions that should be avoided during computational analysis.
In this blog post, we discuss how to train a DenseNet-style deep learning classifier, using PyTorch, for differentiating between different types of lymphoma. This post and its code are based on the post discussing segmentation using U-Net, and are thus broken down into the same 4 components:
In this blog post, we discuss how to train a U-Net-style deep learning classifier, using PyTorch, for segmenting epithelium versus stroma regions. This post is broken down into 4 components, following other pipeline approaches we’ve discussed in the past:
This model relies solely on Python and freely available tools (i.e., no MATLAB).
This blog post assumes moderate knowledge of convolutional neural networks. Depending on the reader’s background, our JPI paper may be sufficient, or a more thorough resource such as Andrew Ng’s deep learning course may be needed.
Digital pathology image analysis requires high-quality input images. While there are a large number of images available in The Cancer Genome Atlas (TCGA), the ones currently available in the data portal are frozen specimens and are *not* suitable for computational analysis. This post discusses how to download the Formalin-Fixed Paraffin-Embedded (FFPE) slides for the corresponding patients.
Just wanted to take a moment and share some quick experimental results on stain normalization. We have a trained in-house nuclei segmentation model which works fairly well when the test images have similar stain presentation properties, but when new datasets arrive which are notably different, we tend to see decreased classifier performance.
Here we look at one of these images and ways of improving classifier robustness.
One of the common ways of increasing the size of a training set is to augment the original data with a set of modified patches. These modifications often include (a) rotations, (b) mirroring, (c) lighting adjustment, (d) affine transformations (shearing, etc.), (e) magnification modification, and (f) addition of noise. This blog post discusses how to do the most trivial modification, rotation, in real time using a Python layer through NVIDIA DIGITS. Given this code, it should be easy to add other desired augmentations.
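To make the idea concrete, here is a minimal sketch of random rotation augmentation. It is not the DIGITS Python layer itself: the function names `rotate90` and `random_rotation` are hypothetical, rotations are restricted to multiples of 90 degrees (an assumption that avoids interpolation), and patches are plain nested lists so the example has no dependencies.

```python
import random


def rotate90(patch, k=1):
    """Return a copy of a 2D patch (list of rows) rotated clockwise by k * 90 degrees."""
    rotated = [list(row) for row in patch]  # copy so the input is never modified
    for _ in range(k % 4):
        # Reversing the row order and then transposing is a 90-degree clockwise turn.
        rotated = [list(row) for row in zip(*rotated[::-1])]
    return rotated


def random_rotation(patch, rng=random):
    """Pick one of the four 90-degree rotations at random (hypothetical helper)."""
    return rotate90(patch, rng.randrange(4))


if __name__ == "__main__":
    patch = [[1, 2],
             [3, 4]]
    print(rotate90(patch))         # one clockwise quarter turn
    print(random_rotation(patch))  # a different orientation on each call
```

Because the rotation is drawn each time a patch is requested, the network can see a different orientation of the same patch in every epoch without any extra copies being stored on disk. With array-based patches, `numpy.rot90` does the same job (note that it rotates counterclockwise by default).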
One of the challenges of working in digital pathology is that the associated images can be excessively large: too large to load fully into memory, and too large to use in common pipelines. For example, an Aperio SVS file that we’ll look at today is 60,000 x 42,600 pixels. If we tried to load such an image uncompressed in RGB space, it would require ~7GB, making it too large to consider using in our deep learning pipelines, as there wouldn’t be enough RAM on the GPU for both the data and the filter activations.
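The ~7GB figure can be checked with a few lines of arithmetic. The sketch below assumes 8 bits per channel and three channels (RGB). The helper names and the 2,000-pixel tile size are illustrative assumptions, included only to show how many pieces such an image would break into if it were processed tile by tile.

```python
def uncompressed_bytes(width, height, channels=3, bytes_per_channel=1):
    """Memory needed to hold the full image uncompressed (assumes 8-bit RGB by default)."""
    return width * height * channels * bytes_per_channel


def tile_grid(width, height, tile_size):
    """Number of (columns, rows) of square tiles needed to cover the image."""
    cols = -(-width // tile_size)   # ceiling division
    rows = -(-height // tile_size)
    return cols, rows


if __name__ == "__main__":
    width, height = 60_000, 42_600  # dimensions quoted in the text
    size = uncompressed_bytes(width, height)
    print(f"{size:,} bytes = {size / 1e9:.2f} GB = {size / 2**30:.2f} GiB")
    cols, rows = tile_grid(width, height, tile_size=2_000)  # hypothetical tile size
    print(f"{cols} x {rows} = {cols * rows} tiles of 2,000 x 2,000 pixels")
```

Running this gives 7,668,000,000 bytes, i.e. roughly 7.7 GB (about 7.1 GiB), which matches the estimate above before any network activations are accounted for.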