This blog post demonstrates how to use QuPath and Python to de-array a tissue microarray (TMA) for computational analysis. It was co-authored by Fan Fan (email@example.com).
A tissue microarray (TMA) is essentially a single slide containing small circular pieces of tissue taken from potentially thousands of different cases (or multiple samples from the same cases). TMAs enable high-throughput concurrent analysis of multiple specimens, while reducing overall reagent costs, space, and manipulation time. Since the first TMAs were introduced in 1986, they have become a mainstay in the field of pathology.
From a computational perspective, TMAs are useful because they require less scanning time and overall data storage. If important regions are carefully curated to create the TMA, one can have an excellent view of various disease processes without having to download and compute upon hundreds of individual slides.
That said, digital TMAs are still a form of whole slide image and thus suffer from similar limitations: they require special libraries (e.g., openslide) to access subsets of the slide which can fit into memory.
A common way of processing TMAs is to first “de-array” them, in which a TMA is split into its constituent “spots” or “disks” and saved as separate image files, usually in a more easily manipulated flat file format (e.g., tif, png).
This de-array process is often non-trivial, as the intended grid that the spots fall on may not be entirely straight, or some spots may be missing or damaged and thus difficult to detect. It remains critical to assign each spot to its expected location, so that it can be matched with the associated master clinical spreadsheet (e.g., spot at location A-15 is of a responding cancer patient, while spot at A-14 is of a non-responding patient).
Thankfully, to aid in this process, QuPath provides a very intuitive user interface which allows for the automatic detection and manual refinement of TMA spot locations. This blog post discusses the usage of this tool to produce the x,y coordinates of the respective spots, and a subsequent Python script which can be used to either directly compute upon these spots or extract and save the images for external use.
We begin with this TMA image:
We first open this svs image in QuPath, and then use the TMA de-array tool to extract each micro spot.
After inputting appropriate parameters (expected number of columns and rows along with the measured TMA core diameter), we can see the initial overlaid grid where some adjustments can be made. For example, here we can see a TMA spot which is absent and we can right click on it and note it as “missing”.
After making the necessary adjustments, we click ‘Measure’ -> ‘Show TMA measurements’ -> ‘Save’ to save the .txt file.
The txt file contains the following information:
- The original image name (e.g., 98263.svs).
- The TMA spot name (e.g., A-1, where A means row, and 1 represents the column).
- Whether the spot is missing (e.g., True or False).
- The coordinates of each spot’s center (e.g., 1518.8 / 1618.4). Note that these are in microns.
Now we have all the information we need to extract each spot using Python, with the code available here.
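To make the file layout above concrete, here is a minimal parsing sketch. It is not the script from this post. It assumes the export is tab-delimited, starts with a single header row, and stores the five fields in the order listed above; check these assumptions against your own file. The `Spot` and `read_spots` names are hypothetical, and the second sample row is made up for illustration.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Spot(NamedTuple):
    image: str     # original image name, e.g. 98263.svs
    name: str      # grid label, e.g. A-1
    missing: bool  # True if the spot was flagged as missing
    x_um: float    # center x, in microns
    y_um: float    # center y, in microns


def read_spots(handle: TextIO) -> Iterator[Spot]:
    """Yield one Spot per data row of a tab-delimited TMA measurement file."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # assumption: the first row is a header
    for row in reader:
        if len(row) < 5:
            continue  # skip blank or truncated rows
        image, name, missing, x_um, y_um = row[:5]
        yield Spot(image, name, missing.strip().lower() == "true",
                   float(x_um), float(y_um))


# Illustrative sample; the second row is invented for this example.
sample = (
    "Image\tName\tMissing\tCentroid X\tCentroid Y\n"
    "98263.svs\tA-1\tFalse\t1518.8\t1618.4\n"
    "98263.svs\tA-2\tTrue\t3100.0\t1620.1\n"
)
spots = list(read_spots(io.StringIO(sample)))
present = [s for s in spots if not s.missing]
```

With a real export, `open(txt_filename, newline="")` would take the place of the `io.StringIO` sample.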
1. Import all the packages we need.
2. Set the necessary arguments, including the prefix name of the WSI file, the txt file, and the spot size (in pixels). In this case, we set the spot size to 6000 x 6000. The target output directory is where we store all the TMA spots.
3. Note that QuPath reports all sizes in micrometers, so we need to convert from micrometers to pixels. ‘ratio_x’ and ‘ratio_y’ represent the pixels per micrometer, derived from the ‘openslide.mpp-x’ and ‘openslide.mpp-y’ properties of OpenSlide (which report microns per pixel).
4. We traverse each row and column from the de-array result and use the center coordinates and the spot size to extract each spot. Note that if a spot is marked as missing (‘True’), we ignore it.
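Steps 3 and 4 can be sketched roughly as follows. This is an illustration under stated assumptions rather than the linked script: `spot_bounds` and `iter_spot_images` are hypothetical names, each spot is assumed to be a `(name, missing, x_um, y_um)` tuple, and the centers are assumed to be measured from the top-left corner of the full-resolution (level 0) image. The `slide` argument is expected to behave like an `openslide.OpenSlide` object, of which only `properties` and `read_region(location, level, size)` are used.

```python
from typing import Any, Iterable, Iterator, Tuple


def spot_bounds(cx_um: float, cy_um: float, mpp_x: float, mpp_y: float,
                spot_size: int) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) in level-0 pixels for one spot."""
    ratio_x = 1.0 / mpp_x  # pixels per micron along x
    ratio_y = 1.0 / mpp_y  # pixels per micron along y
    left = int(round(cx_um * ratio_x - spot_size / 2))
    top = int(round(cy_um * ratio_y - spot_size / 2))
    return left, top, spot_size, spot_size


def iter_spot_images(slide: Any,
                     spots: Iterable[Tuple[str, bool, float, float]],
                     spot_size: int = 6000) -> Iterator[Tuple[str, Any]]:
    """Yield (spot name, image) for every spot not flagged as missing."""
    mpp_x = float(slide.properties["openslide.mpp-x"])
    mpp_y = float(slide.properties["openslide.mpp-y"])
    for name, missing, cx_um, cy_um in spots:
        if missing:
            continue  # nothing to extract for a missing spot
        left, top, width, height = spot_bounds(cx_um, cy_um, mpp_x, mpp_y,
                                               spot_size)
        # read_region takes the level-0 top-left corner, the level, and the size
        yield name, slide.read_region((left, top), 0, (width, height))
```

With OpenSlide installed, `slide = openslide.OpenSlide(wsi_filename)` can be passed in directly. `read_region` returns an RGBA PIL image, so something like `image.convert("RGB").save(...)` would write each spot to the output directory.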
Note here as well that if we wanted to compute upon the spot instead of writing it to disk, we could simply replace the code on line 46 with the associated algorithms.
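As a purely hypothetical stand-in for “the associated algorithms”, the sketch below estimates how much of a spot is tissue by counting pixels that are not near-white. The function name `tissue_fraction` and the threshold of 220 are assumptions chosen for illustration; they are not part of the original script.

```python
from typing import Iterable, Sequence


def tissue_fraction(pixels: Iterable[Sequence[int]],
                    white_threshold: int = 220) -> float:
    """Fraction of pixels with at least one RGB channel below the threshold."""
    total = 0
    tissue = 0
    for px in pixels:
        total += 1
        # A pixel counts as background only if R, G and B are all bright.
        if min(px[0], px[1], px[2]) < white_threshold:
            tissue += 1
    return tissue / total if total else 0.0


# Two of these four pixels are darker than the threshold.
example = [(255, 255, 255), (200, 100, 150), (230, 230, 230), (10, 10, 10)]
fraction = tissue_fraction(example)
```

The function accepts any iterable of (R, G, B) or (R, G, B, A) tuples. A pure-Python loop over a full 6000 x 6000 spot would be slow, so in practice one would probably run it on a downsampled copy of the spot or use a vectorized equivalent.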
In the end, we are left with a bunch of high-resolution spots! Here are the TMA de-array result and one example of the TMA spot.
That’s all, thanks! Again, the code is available here.
8 thoughts on “De-array a Tissue Microarray (TMA) using QuPath and Python”
Clear and useful as always!
Thank you for the script!
Hello, Thanks for the script.
I have a problem with the parse function. Can you explain what you mean by “wsi_filename, txt_filename, outdire”?
Try not to overthink it : ) wsi_filename is the name of the WSI file, txt_filename is the output from QuPath, and outdir is the output directory where you want the spots to be saved.
Hi, I have a general question: what is a WSI file, and where do I generate such a file?
A WSI is a whole slide image; they are generated by digital slide scanners. Alternatively, you can download them from the TCGA as described here: Download TCGA Digital Pathology Images (FFPE)
Thanks for the great tutorial! Is there a way to do this and keep the annotations (cells + subcellular spots detected) in the .png files?
I don’t have any code for that, but if you work with this, it should do what you want: http://www.andrewjanowczyk.com/using-paquo-to-directly-interact-with-qupath-project-files-for-usage-in-digital-pathology-machine-learning/ Essentially, you just need to iterate over your TMA spots, find the annotations which fall within each one (you can use shapely intersections), and then write them to the associated image in a QuPath project after translating them by the upper left of the spot. All in all, you can probably do it within 40 lines of code.