Tutorial: Quick Annotator for Tubule Segmentation

The manual labeling of large numbers of objects is a frequent occurrence when training deep learning classifiers in the digital histopathology domain. Often this can become extremely tedious and potentially even insurmountable.

To aid people in this annotation process, we have developed and released Quick Annotator (QA), a tool which employs a deep learning backend that simultaneously learns from and assists the user during annotation. A pre-print explaining this tool in more detail is available [here].

It is worth briefly showing the result table from this manuscript, which demonstrates a 100x speed improvement (yes, two orders of magnitude!) when annotating individual cells versus a common manual approach:

A few links for reference:

Installation instructions (including docker/singularity): quickannotator.com

Associated Wiki: https://github.com/choosehappy/QuickAnnotator/wiki

Along with Frequently Asked Questions: https://github.com/choosehappy/QuickAnnotator/wiki/Frequently-Asked-Questions

And away we go!

Introduction

Here we will provide a brief step-by-step tutorial to demonstrate how QA works to segment objects, in this case tubules. For convenience, we have extracted a few regions of interest from The Cancer Genome Atlas Colon Adenocarcinoma data (TCGA-COAD), which can be downloaded here.

Preparation Steps

1. Download Quick Annotator

Begin by cloning QA from its GitHub URL:

git clone https://github.com/choosehappy/QuickAnnotator.git
2. Verify CUDA version

One can check which CUDA version is installed on their system by running the following on the command line:

    nvidia-smi

This shows the associated CUDA version in the upper right-hand corner:

3. Install required packages

Depending on the CUDA version installed in your environment (see README.md for more information), install the Python dependencies associated with either CUDA 10 or CUDA 11:

    cd QuickAnnotator    
    cd cuda_11 *OR* cd cuda_10
    pip3 install -r requirements.txt
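
After installation, it can be worth confirming that the installed PyTorch build matches the CUDA version reported by nvidia-smi and can actually see the GPU. A minimal check, assuming the torch package from the requirements file is now importable, might look like:

    import torch

    # True if PyTorch can see at least one CUDA-capable GPU
    print(torch.cuda.is_available())
    # CUDA version this PyTorch build was compiled against (should match nvidia-smi)
    print(torch.version.cuda)
    # Name of the first CUDA device (device 0), which QA will use by default
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))
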
4. Download sample images

For this tutorial, we’ll be using the provided tubule images available here:

These images were originally downloaded as Whole Slide Images and divided into smaller tiles. Details can be found here and here.
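
The exact tiling pipeline is described in the links above; purely as a rough illustration, the following sketch shows how a whole slide image could be split into tiles using openslide-python (the filename and tile size here are hypothetical, not the settings used for the provided data):

    import openslide

    TILE_SIZE = 2000  # hypothetical tile edge length in pixels
    slide = openslide.OpenSlide("TCGA-COAD-example.svs")  # hypothetical filename
    width, height = slide.dimensions  # level-0 (full resolution) dimensions

    for x in range(0, width, TILE_SIZE):
        for y in range(0, height, TILE_SIZE):
            # read_region returns an RGBA PIL image at the requested level-0 location
            tile = slide.read_region((x, y), 0, (TILE_SIZE, TILE_SIZE)).convert("RGB")
            tile.save(f"tile_{x}_{y}.png")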

5. Modify the configuration file

The configuration file for QA is located here:  QuickAnnotator\config\config.ini

A few brief points worth discussing are noted below:

  • One should pay attention to which gpuid is assigned. In our case, the default of 0 implies that the first available GPU in the system will be used. This parameter is passed directly to PyTorch, so similar attributes apply (e.g., here)
    [cuda]
    gpuid = 0
  • You have the option to modify the numepochs, num_epochs_earlystop, and num_min_epochs parameters, which control how long model training runs. In general, the defaults are acceptable for small projects, including this tutorial. Two different processes take advantage of these properties and are noted in their respective sections: train_ae, which trains the baseline autoencoder, and train_tl, which performs supervised transfer learning using user-generated annotations.
    [train_ae]
    numepochs = 1000
    num_epochs_earlystop = 100
    num_min_epochs = 300
    numimages = 32
    batchsize = 32 
    patchsize =  ${common:patchsize}
    #-1 implies either 0 for windows, or the number of cpus for linux
    numworkers = 0
    
    [train_tl]
    numepochs = 1000
    num_epochs_earlystop = 100
    num_min_epochs = 300
    batchsize = 8
    patchsize =  ${common:patchsize}
    edgeweight = 2
    #-1 implies compute from images, default is .5
    pclass_weight = -1
    #-1 implies either 0 for windows, or the number of cpus for linux
    numworkers = 0
    fillbatch = True
  • You may need to change the port on which QA runs if it is already occupied by another process; this is especially common when running QA on a shared server (a quick way to confirm the configured port is shown after this list):
    [flask]
    debug = False
    port = 5555
    clear_stale_jobs_at_start = True
    log_level = DEBUG
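
Since the config file uses ${section:option} style references (as seen with patchsize above), the effective settings, including the port QA will listen on, can be read back with Python's standard configparser. This is only a minimal sketch, assuming it is run from the QuickAnnotator directory:

    from configparser import ConfigParser, ExtendedInterpolation

    # QA's config uses ${section:option} references, hence ExtendedInterpolation
    config = ConfigParser(interpolation=ExtendedInterpolation())
    config.read("config/config.ini")

    print(config.get("flask", "port"))   # port QA will listen on
    print(config.get("cuda", "gpuid"))   # GPU id used for training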

More advanced usage can be found in the GitHub Wiki at https://github.com/choosehappy/QuickAnnotator/wiki.

Starting QA and Uploading Sample Images

6. Open a command terminal and, from the QA directory, start QA:
    python QA.py

Note: this assumes QA was installed into the base operating system. Details regarding environment or Docker setup can be found at https://github.com/choosehappy/QuickAnnotator/blob/main/README.md.

7. Open Chrome and navigate to:
    http://localhost:5555

where 5555 is the port you defined above in the config.ini file.

Note: the protocol should be HTTP and not HTTPS.

8. Create a new project and add images to it

Annotating and Training Steps

9. Follow the instructions on the page by clicking the buttons in order

Begin by clicking on Make Patches (red arrow); when it completes, (Re)train Model 0 (yellow arrow) will become enabled:

After the (Re)train Model process is completed, click on Embed Patches followed by View Embedding, which redirects the user to the Embedding Page.

10. On the Embedding Page, hovering over a dot shows an ROI and clicking on the dot redirects you to the annotation page of that ROI

In this embedding plot, patches of size 256×256 are represented as dots. The color of each dot corresponds to the image from which the patch came. When hovering over a dot, a preview of the patch is shown, and subsequently clicking on the dot takes the user to the [annotation page], centered around that patch. A more detailed discussion of how this functionality can help the user train a robust classifier more efficiently is available in the [Best Practice] section of the wiki.

11. Make annotations and upload them to train/test sets

To train QA’s deep learning backend model, a training set is needed. At the same time, to evaluate its performance and prevent poor results (e.g., overfitting), a testing set is needed.

As such, the user is required to annotate at least 1 training set and 1 testing set ROI, with preferably 2 or more training set ROIs.

While we intend to make this process fully automated in the future, one should typically aim for a training-to-testing ratio of 8:2 for smaller ROI counts, which can shift toward 9:1 for larger ROI counts.

After annotating an ROI, click on Upload completed annotation (red circle) and then select whether this annotation should be added to the training (blue arrow) or testing (green arrow) set.

Below is an animated example of making annotations and uploading them to the testing set.

Note that all annotations are saved as PNG image files in the directory:

QuickAnnotator\projects\{project_name}\mask 
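
If you want to inspect these masks outside of QA, they can be opened like any other PNG. Below is a minimal sketch using Pillow and NumPy; the filename is hypothetical and the exact channel semantics of QA's masks may differ, so treat it only as a starting point:

    import numpy as np
    from PIL import Image

    # Load one saved annotation mask as grayscale (hypothetical filename)
    mask = np.array(Image.open("roi_1_mask.png").convert("L"))

    print(mask.shape, mask.dtype)  # tile dimensions and bit depth
    print("fraction of annotated pixels:", (mask > 0).mean())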

12. Training a classifier and viewing its predictions

After making annotations and uploading them to the respective training and testing sets, you may click on Retrain DL From base (red arrow) to train a new model.

While training is taking place, the prediction result icon is red (blue arrow), indicating that the prediction results, or “layer”, are not yet available because model training is still in progress.

The Navigation Bar (green rectangle) consists of function buttons, layer switches, and status indicators, which help the user understand what is currently available to them. In-depth details of each item are available [here].

13. Import DL predictions as a starting point for modification

When the DL model is ready, you may generate a prediction mask based on your latest deep learning model. Since there is likely only a very limited amount of annotated training data, the DL model is unlikely to generate a perfect prediction result. However, it is very likely that this output can act as a starting point to greatly improve the efficiency of the annotation process.

With the prediction layer visible, click on the import button (red circle) to import the mask into the user-modifiable space. Essentially, this “converts” the white overlay layer into the same annotation layer you used to provide your previous annotations.

Here is an animated example of that process:

More details regarding the annotation window can be found [here].

14. Iterate as needed to train a good model (Repeat 10-13)

The main purpose of QA is to quickly generate histologic annotations with minimal training data and time. Selecting regions where the DL model is performing poorly is an excellent way to improve the model during the next training step.

For example, in our first round of results, the prediction provided good results on the outside boundaries of the tubules, but contained undesired holes in the middle (red arrows). Hence, the user should focus on providing new regions where such errors occur to improve the model.

During this phase, it is very common for users to switch between viewing Original images, Annotation layers, and Prediction layers. Users can do this quickly and easily by using pre-designated shortcut keys rather than clicking on the buttons in the Navigation bar.

You can use these layer switches to decide what to view on the annotation page. Each switch has a related shortcut key, listed in parentheses:

  • Image Information (I): the switch for the information box.
  • Annotation (Q): the switch for the annotation layer.
  • Prediction (W): the switch for the prediction layer.
15. Largely Accept Predictions

When the model is sufficiently trained, you should be able to begin largely accepting the prediction results, potentially saving orders of magnitude of time. As shown in the animation below, the prediction result is ready to upload to the training/testing dataset with minimal modification. A larger crop size can be used to further improve the efficiency of this process.

16. Download DL model, output, or annotations

QA also allows you to download various components so that they can be applied in downstream applications. You may be interested in downloading the annotated image (blue circle), the prediction mask (yellow circle), or the trained model (red circle).
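
As a small example of downstream use, a downloaded prediction mask could be turned into per-object counts with SciPy. This is only a sketch, with a hypothetical filename and a simple threshold that may need tuning for your masks:

    import numpy as np
    from PIL import Image
    from scipy import ndimage

    # Load a downloaded prediction mask (hypothetical filename) and binarize it
    pred = np.array(Image.open("roi_1_pred.png").convert("L"))
    binary = pred > 127

    # Label connected components; each component is one predicted tubule
    labeled, num_objects = ndimage.label(binary)
    print("predicted tubule count:", num_objects)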

This concludes our brief tutorial on how to use Quick Annotator to annotate large quantities of digital pathology images. Again, the source code, installation instructions, and full wiki are available at quickannotator.com.

Of course, if you have any questions or comments, we would love to hear about them! Please submit them to the GitHub issue tracker to help us stay organized.
