Through a series of 4 blog posts, we’ll discuss and provide working examples of how one can use the open-source library Ray to (a) scale computing locally (single machine), (b) distribute scaling remotely (multiple machines), and (c) serve deep learning models across a cluster (basic/advanced). Please note that the blog posts in this series increase in difficulty!
This is the second blog post in the series (the first one here), where we will go into greater detail about how Ray Cluster creation works, the associated terminology, and the requirements for successful execution, and then extend our previous local-only example to a distributed environment.
Continue reading Ray: An Open-Source API For Easy, Scalable Distributed Computing In Python – Part 2 Distributed Scaling
Introduction: Another day, another
applications for jobs, grants and all manner of other reviews is a
continual process within the scientific world. Forms tend to ask for
specific, nuanced information leading to more of our precious time
being spent digging up decades-worth of buried events just to
evidence ‘A time I have communicated with a diverse audience’
than actually writing. Then, we have the doubt to contend with: What
if I missed something? Surely I have a better example! I remember
doing that – but when was it?
Given A) how short academic contracts can be and B) how many distinct
workplaces our generation tends to work in over the course of a
career, writing CVs can consume a considerable chunk of our adult
lives. The application process is not going anywhere in the near
future. We need to ask ourselves how we can make it as painless and
efficient as possible.
Well, there are a few ‘hacks’. Apply for a few jobs and you will start to notice themes in the application process and in the ‘winning’ CVs. Let’s go over these themes and learn to ‘hack’ not only our time but, more importantly, our success rate. In doing so, we can earn back so much more time to do the things we love – science!
Continue reading A masterclass in Scientific CV writing
Utilization of current GPUs is often limited by the ability to get data onto and off the device quickly. More precisely, taking data from host RAM and transferring it over the PCIe bus to GPU RAM is the bottleneck in many deep learning use cases.
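To make the bottleneck argument concrete, here is a rough back-of-the-envelope sketch of when compressing data before the transfer could pay off. It is not the method from the post: the function names are hypothetical, and the bandwidth and decompression figures are placeholder assumptions that should be replaced with measurements from your own hardware.

```python
import zlib


def transfer_seconds(num_bytes, bandwidth_bytes_per_s):
    """Time to move num_bytes over a link with the given sustained bandwidth."""
    return num_bytes / bandwidth_bytes_per_s


def compressed_transfer_seconds(num_bytes, ratio, bandwidth_bytes_per_s,
                                decompress_bytes_per_s):
    """Time to move the compressed data, then decompress it on the device.

    ratio is compressed_size / original_size (smaller is better);
    decompress_bytes_per_s counts bytes of decompressed output per second.
    """
    move = (num_bytes * ratio) / bandwidth_bytes_per_s
    unpack = num_bytes / decompress_bytes_per_s
    return move + unpack


def min_decompress_rate(ratio, bandwidth_bytes_per_s):
    """Decompression rate above which compressing beats a plain transfer."""
    return bandwidth_bytes_per_s / (1.0 - ratio)


# Measure a compression ratio on a toy, highly repetitive payload (1 MiB).
payload = bytes(range(256)) * 4096
ratio = len(zlib.compress(payload)) / len(payload)

# ASSUMED numbers, for illustration only.
bus_bandwidth = 12e9      # bytes/s over the bus (placeholder)
gpu_decompress = 50e9     # bytes/s of on-device decompression (placeholder)

batch = 2 * 1024 ** 3     # a 2 GiB batch
print("plain transfer (s):     ", transfer_seconds(batch, bus_bandwidth))
print("compressed transfer (s):",
      compressed_transfer_seconds(batch, ratio, bus_bandwidth, gpu_decompress))
```

Under this simple model, compression only helps when the device can decompress faster than `min_decompress_rate(ratio, bus_bandwidth)`; real data will compress far less well than the toy payload used here.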
Continue reading Transferring data FASTER to the GPU With Compression
Many digital pathology tools (e.g., our quality control tool, HistoQC) employ OpenSlide, a library for reading whole slide images (WSI). OpenSlide provides a reliable abstraction away from a number of proprietary WSI file formats, such that a single programmatic interface can be employed to access WSI metadata and image data.
Unfortunately, when smaller regions of interest, or new images, are created in tif/png/jpg formats, they are no longer compatible with OpenSlide. This blog post discusses how to take any image and convert it into an OpenSlide-compatible WSI, with embedded metadata.
Continue reading Converting an existing image into an Openslide compatible format
Update-Nov 2020: Code has now been placed on GitHub which enables the reading and writing of compressed GeoJSON files at all stages of the process described below. Compression reduces the file size by approximately 93% : )
QuPath is a digital pathology tool that has become especially popular because it is both easy to use and supports a large number of different whole slide image (WSI) file formats. QuPath is also able to perform a number of relevant analytical functions with a few mouse clicks. Of particular relevance to this blog post, the pathologists we tend to work with are either already familiar with QuPath or find it easier to learn than other tools. As a result, QuPath has become a go-to tool for us for both the creation and review of annotations and outputs created by our algorithms.
Here we introduce a robust method using GeoJSON for exporting annotations (or cell objects) from QuPath, importing them into Python as shapely objects, operating upon them, and then re-importing a modified version of them back into QuPath for downstream usage or review. As an example use case we will be looking at computationally identifying lymphocytes in WSIs of melanoma metastases using a deep learning classifier.
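As a minimal sketch of the round trip described above, the snippet below reads a (optionally gzip-compressed) GeoJSON file, modifies the objects, and writes them back out. It uses only the standard library rather than shapely, and a simple area threshold stands in for the deep learning classifier. The helper names are hypothetical, and the `classification` property layout is an assumption that may differ between QuPath versions.

```python
import gzip
import json


def _opener(path):
    """Pick gzip.open for *.gz paths, the built-in open otherwise."""
    return gzip.open if path.endswith(".gz") else open


def load_features(path):
    """Read a GeoJSON file and return its list of features."""
    with _opener(path)(path, "rt", encoding="utf-8") as f:
        data = json.load(f)
    # Accept either a FeatureCollection or a bare list of features.
    return data["features"] if isinstance(data, dict) else data


def polygon_area(ring):
    """Shoelace area of a closed ring given as [[x, y], ...]."""
    twice_area = sum(p[0] * q[1] - q[0] * p[1] for p, q in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0


def label_features(features, min_area, name):
    """Tag polygons of at least min_area with a class name.

    The area rule is only a placeholder for a real classifier, and the
    'classification' property layout is an ASSUMPTION about the schema.
    """
    for feat in features:
        geom = feat["geometry"]
        if geom["type"] == "Polygon":
            outer_ring = geom["coordinates"][0]
            if polygon_area(outer_ring) >= min_area:
                feat.setdefault("properties", {})["classification"] = {"name": name}
    return features


def save_features(features, path):
    """Write features back out as a GeoJSON FeatureCollection."""
    collection = {"type": "FeatureCollection", "features": features}
    with _opener(path)(path, "wt", encoding="utf-8") as f:
        json.dump(collection, f)
```

A typical use would be `save_features(label_features(load_features("cells.geojson.gz"), 50, "lymphocyte"), "cells_labelled.geojson.gz")`, where both file names are placeholders.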
Continue reading Exporting and re-importing annotations from QuPath for usage in machine learning
Animation below speaks for itself : )
Finally put together a script which makes Jupyter notebook plots interactive, such that hovering over a point in a scatter plot displays the underlying image; see demo + code below:
Very useful when looking at e.g. embeddings.
If the dataset is too large to store in memory, line 70 can be replaced with a real-time load command
Code is available here: https://github.com/choosehappy/Snippets/blob/master/interactive_image_popup_on_hover.py
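For the too-large-for-memory case mentioned above, one possible shape for a real-time load is sketched below. This is not the code from the linked script: `load_image_bytes` and `image_for_point` are hypothetical names, and decoding the returned bytes into an array is left to whichever imaging library you already use.

```python
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=64)
def load_image_bytes(path):
    """Read one image file from disk on demand.

    The 64 most recently used files stay cached, so hovering back and
    forth between nearby points does not hit the disk every time.
    """
    return Path(path).read_bytes()


def image_for_point(index, image_paths):
    """Return the raw bytes of the image behind scatter point `index`."""
    return load_image_bytes(str(image_paths[index]))
```

Only the list of file paths needs to live in memory; the cache size is an arbitrary choice and can be tuned to the available RAM.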