Tag Archives: python

Approach for Easy Visual Comparison between ground-truth and predicted classes

Although classification metrics are good for summarizing a model’s performance on a dataset, they disconnect the user from the data itself. Similarly, a confusion matrix might tell us that performance is suffering because of false positives, but it obscures information about what patterns may have caused those misclassifications and what types of false positives there might be. 

One way to regain interpretability is to group sampled images by the category of their output (true negative, false negative, false positive, true positive) and display them in a PowerPoint file for quick review. Grouping the images this way makes it easy to spot patterns in the misclassified data that can be exploited to improve performance (e.g., hard negative mining or image-analysis-based filtering).
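
As a minimal sketch of the grouping step described above, the following Python assigns each image to one of the four outcome categories and samples a few images per category. It assumes binary labels with 1 as the positive class; the function names, the file names, and the `per_group` and `seed` parameters are hypothetical and are not taken from the post's actual code.

```python
import random
from collections import defaultdict


def confusion_category(y_true, y_pred, positive=1):
    """Return 'TP', 'FP', 'FN' or 'TN' for a single binary prediction."""
    if y_pred == positive:
        return "TP" if y_true == positive else "FP"
    return "FN" if y_true == positive else "TN"


def group_by_category(image_ids, y_true, y_pred, per_group=20, seed=0):
    """Bucket image ids by outcome, then sample up to per_group ids per bucket."""
    buckets = defaultdict(list)
    for image_id, t, p in zip(image_ids, y_true, y_pred):
        buckets[confusion_category(t, p)].append(image_id)
    rng = random.Random(seed)  # fixed seed so the sampled review set is reproducible
    return {
        category: rng.sample(ids, min(per_group, len(ids)))
        for category, ids in buckets.items()
    }


# Hypothetical file names and labels, for illustration only
ids = ["a.png", "b.png", "c.png", "d.png", "e.png"]
truth = [1, 0, 1, 0, 0]
preds = [1, 1, 0, 0, 0]
groups = group_by_category(ids, truth, preds)
```

Each list in `groups` could then be laid out on its own slide, so that, for example, all sampled false positives are reviewed side by side.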

This blog post describes and demonstrates a workflow that automatically produces such a PowerPoint slide deck for review.

Application of ICC profiles to digital pathology images

Background on Color Calibration

Digital whole slide image scanners take stained tissue on glass slides and digitize it into bytes for use in the digital world. This process does not produce a perfect digital equivalent of the original slide, because the hardware involved (LED/bulb, camera sensor, quantizer) can introduce biases during sampling. For example, different camera sensors may detect colors with different levels of specificity, accuracy, and density, resulting in similar but imperfect representations of the associated real-world subjects.

Concretely, there is often a difference between the color you perceive when viewing a slide under a microscope and the color you see when viewing the digital copy of the same slide. This blog post discusses how to correct for this discrepancy using ICC profiles.

Using Paquo to directly interact with QuPath project files for usage in digital pathology machine learning

This is an updated version of the previously described workflow for loading and classifying annotations/detections created in QuPath for use in downstream machine learning workflows. The original post used QuPath’s Groovy scripting language to export annotations/detections as GeoJSON from within QuPath, a Python script to classify them, and another Groovy script to reimport them. If you are not familiar with QuPath and/or its annotations, you should probably read the original post first; it provides context for both workflows and makes it easier to appreciate the more elegant approach taken here. If you are already using the original approach, you should be able to easily modify it to follow this newer one.

Converting an existing image into an OpenSlide compatible format

Many digital pathology tools (e.g., our quality control tool, HistoQC) employ OpenSlide, a library for reading whole slide images (WSI). OpenSlide provides a reliable abstraction over a number of proprietary WSI file formats, so that a single programmatic interface can be used to access both WSI metadata and image data.

Unfortunately, when smaller regions of interest, or new images, are saved in tif/png/jpg formats, they are no longer compatible with OpenSlide. This blog post discusses how to take any image and convert it into an OpenSlide-compatible WSI with embedded metadata.

Computationally creating a PowerPoint presentation of experimental results using Python

This post is an update of the previous post, which discussed how to create a PowerPoint slide deck of results using MATLAB. In the last couple of years, we have mostly transitioned to Python for our digital pathology image analysis, in particular for those tasks which employ deep learning. It thus makes sense to port our tools over as well. In this case, we’ll be looking at building PowerPoint slide decks using Python.

Let’s look at what we want as our final output:

Employing the albumentations library in PyTorch workflows. Bonus: Helper for selecting appropriate values!

This brief blog post sees a modified release of the previous segmentation and classification pipelines. These versions leverage an increasingly popular augmentation library called albumentations.

Image popups on mouse over in Jupyter Notebooks

Animation below speaks for itself : )

Finally put together a script which makes Jupyter notebook plots interactive, such that when hovering over a point in a scatter plot, the underlying image is displayed. See demo + code below:

Very useful when looking at e.g. embeddings.
If the dataset is too large to store in memory, line 70 can be replaced with a command that loads each image on demand.

[Animation: image popup on hover]

Code is available here: https://github.com/choosehappy/Snippets/blob/master/interactive_image_popup_on_hover.py

Digital Pathology Segmentation using Pytorch + Unet

In this blog post, we discuss how to train a U-Net style deep learning classifier, using PyTorch, for segmenting epithelium versus stroma regions. This post is broken down into four components, following the pipeline approaches we’ve discussed in the past:

  1. Making training/testing databases,
  2. Training a model,
  3. Visualizing results in the validation set,
  4. Generating output.

This workflow relies solely on Python and freely available tools (i.e., no MATLAB).

This blog post assumes moderate knowledge of convolutional neural networks. Depending on the reader’s background, our JPI paper may be sufficient, or a more thorough resource such as Andrew Ng’s deep learning course may be needed.

Using Matlab, Pytables (hdf5) and (a bit of) Pytorch

As we test migrating to new deep learning frameworks, one of the questions that remained was dataset interoperability. Essentially, we want to be able to create a dataset for training a deep learning model from as many applications as possible (Python, MATLAB, R, etc.), so that our students can use a language that is familiar to them, as well as leverage all of the existing in-house code we have for data manipulation.
