All posts by choosehappy

Ray: An Open-Source API For Easy, Scalable Distributed Computing In Python – Part 3 Intro to Serving Models

Through a series of 4 blog posts, we’ll discuss and provide working examples of how one can use the open-source library Ray to (a) scale computing locally (single machine), (b) distribute scaling remotely (multiple machines), and (c) serve deep learning models across a cluster (2 posts on this topic, basic/advanced). Please note that the blog posts in this series increase in difficulty!

This is the second-to-last blog post in the series (the first one here, second one here), where we will go into greater detail about how we can use Ray Serve to set up a server that waits to respond to our processing requests. These last two are the most complex blog posts in the series and require some understanding of how HTTP, REST, and web services work. You can find relevant prereading here.

Ray Serve is a scalable model serving library for building online inference APIs. Serve is framework agnostic, so you can use a single toolkit to serve everything from deep learning models built with frameworks like PyTorch, TensorFlow, and Keras, to Scikit-Learn models, to arbitrary Python business logic.

Ray: An Open-Source API For Easy, Scalable Distributed Computing In Python – Part 2 Distributed Scaling

Through a series of 4 blog posts, we’ll discuss and provide working examples of how one can use the open-source library Ray to (a) scale computing locally (single machine), (b) distribute scaling remotely (multiple machines), and (c) serve deep learning models across a cluster (basic/advanced). Please note that the blog posts in this series increase in difficulty!

This is the second blog post in the series (the first one here), where we will go into greater detail about how Ray Cluster creation works, the associated terminology, and the requirements for successful execution, and then extend our previous local-only example to a distributed environment.

Ray: An Open-Source API For Easy, Scalable Distributed Computing In Python – Part 1 Local Scaling

Through a series of 4 blog posts, we’ll discuss and provide working examples of how one can use the open-source library Ray to (a) scale computing locally (single machine), (b) distribute scaling remotely (multiple machines), and (c) serve deep learning models across a cluster (basic/advanced). Please note that the blog posts in this series increase in difficulty!

I am personally very excited by the opportunities afforded by Ray; it’s been a long-time desire to have such an easy-to-use library!

Okay, let’s start off by talking about scaling local computation with Ray!

Using QuPath To Help Identify An Optimal Threshold For A Deep Or Machine Learning Classifier

Digital pathology projects often require assigning a class to cells/objects. For example, you may have a segmentation of cells/glomeruli/tubules and want to identify the ones which are lymphocytes/sclerotic/distal. This classification process can be done using machine or deep learning classifiers by supplying the object in question and receiving an output score which indicates the likelihood that the object is of a given type.

This blog post will demonstrate an efficient way of using QuPath to help find the ideal likelihood threshold for your classifier.
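
To make the idea of a "likelihood threshold" concrete, here is a minimal sketch in plain Python, independent of QuPath. It sweeps candidate thresholds over classifier scores and keeps the one that maximizes Youden's J (sensitivity + specificity - 1). The choice of Youden's J, the function name `best_threshold`, and the example scores and labels are all assumptions made for illustration; they are not taken from the post.

```python
def best_threshold(scores, labels):
    """Return (threshold, youden_j) maximizing sensitivity + specificity - 1.

    An object is called positive when its score is >= threshold.
    Assumes labels contains at least one 0 and at least one 1.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_t, best_j = None, -1.0
    # Every distinct score is a candidate threshold.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # strict comparison keeps the lowest threshold on ties
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical classifier outputs and ground-truth labels (1 = target class).
scores = [0.05, 0.20, 0.35, 0.40, 0.62, 0.70, 0.81, 0.93]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

threshold, j = best_threshold(scores, labels)
print(f"best threshold: {threshold}, Youden's J: {j:.2f}")
```

Other criteria (for example, maximizing F1 or fixing a minimum sensitivity) may suit a given project better; the sweep structure stays the same.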

Application of ICC profiles to digital pathology images

Background on Color Calibration

Digital whole slide image scanners are designed to take stained tissue on glass slides and digitize it into bytes for use in the digital world. The process by which slide scanners perform this operation does not produce a perfect digital equivalent of the original slide, as the hardware involved (LED/bulb, camera sensor, quantizer) can introduce biases during the sampling process. For example, different camera sensors may detect colors with different levels of specificity/accuracy/density, resulting in similar but not perfect representations of the associated real-world subjects.

Concretely, there is often a difference between the color you perceive in the real world under a microscope and what you would see if you looked at the corresponding digital copy of the same slide. This blog post discusses how to correct for this discrepancy using ICC profiles.

Tutorial: Quick Annotator for Tubule Segmentation

The manual labeling of large numbers of objects is a frequent occurrence when training deep learning classifiers in the digital histopathology domain. Often this can become extremely tedious and potentially even insurmountable.

To aid people in this annotation process, we have developed and released Quick Annotator (QA), a tool which employs a deep learning backend to simultaneously learn and aid the user in the annotation process. A pre-print explaining this tool in more detail is available [here].

Transferring data FASTER to the GPU With Compression

Utilization of current GPUs is often limited by the ability to get data onto and off the device quickly. More precisely, taking data from host RAM and transferring it over the PCIe bus to GPU RAM is the bottleneck in many deep learning use cases.
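
As a rough back-of-the-envelope sketch of why compression can help here, the snippet below compresses a buffer on the host and compares the estimated transfer time of the raw versus compressed bytes. Everything in it is an assumption for illustration: `zlib` stands in for whatever codec is actually used, the 12 GB/s bandwidth figure is a placeholder rather than a measurement, the buffer is synthetic, and the cost of compressing on the CPU and decompressing on the GPU is ignored.

```python
import zlib

# Assumed effective host-to-device throughput in bytes per second (placeholder).
ASSUMED_PCIE_BYTES_PER_S = 12e9


def estimated_transfer_s(num_bytes, bytes_per_s=ASSUMED_PCIE_BYTES_PER_S):
    """Estimate transfer time as size divided by bandwidth (latency ignored)."""
    return num_bytes / bytes_per_s


# Synthetic 4 MiB buffer: a large uniform "background" region followed by a
# repeating pattern, so it compresses well. Real image data will compress less.
raw = bytes([255]) * (3 * 1024 * 1024) + bytes(range(256)) * 4096

# Fast compression level, since the CPU-side cost also matters in practice.
compressed = zlib.compress(raw, level=1)

ratio = len(raw) / len(compressed)
print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
print(f"compression ratio: {ratio:.1f}x")
print(f"estimated raw transfer:        {estimated_transfer_s(len(raw)) * 1e6:.1f} us")
print(f"estimated compressed transfer: {estimated_transfer_s(len(compressed)) * 1e6:.1f} us")
```

Whether this pays off in practice depends on the compression ratio achievable on real data and on how quickly the GPU can decompress, neither of which this sketch measures.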

The noise in our digital pathology slides

While adding new features to HistoQC, I stumbled upon a very interesting insight that I thought I would take a moment to share. The amount of noise and artifacts in digital pathology (DP) whole slide images (WSI) is far more extensive than I had previously thought.

Converting an existing image into an OpenSlide compatible format

Many digital pathology tools (e.g., our quality control tool, HistoQC) employ OpenSlide, a library for reading whole slide images (WSI). OpenSlide provides a reliable abstraction away from a number of proprietary WSI file formats, such that a single programmatic interface can be employed to access WSI metadata and image data.

Unfortunately, when smaller regions of interest, or new images, are created in tif/png/jpg formats, they no longer remain compatible with OpenSlide. This blog post discusses how to take any image and convert it into an OpenSlide-compatible WSI, with embedded metadata.
