Computationally creating a PowerPoint presentation of experimental results using Python

This post is an update of the previous post, which discussed how to create a PowerPoint slide deck with results using Matlab. In the last couple of years, we have mostly transitioned to Python for our digital pathology image analysis, in particular for those tasks which employ deep learning. It thus makes sense to port our tools over as well. In this case, we’ll be looking at building PowerPoint slide decks using Python.

Let’s look at what we want as our final output:

Essentially, we can see that there are 5 pieces of information that we would like to display:

  1. Original input image
  2. Ground truth (GT)
  3. Algorithmic output (AO)
  4. Overlay of Algorithmic output with ground truth (white match, pink GT only, green AO only)
  5. Metadata indicating filename, and potentially other metrics of interest (accuracy, f-score, etc)

A consolidated PowerPoint presentation is an excellent way to view and share results with collaborators, as it pulls all relevant images and metadata into a single location using a common tool people are comfortable with. Oftentimes, our students forget that they are much closer to the data than others and inherently “know” what the original image and GT are, while other folks would benefit from having all that information present.

The “Click to add notes” section further allows for easy commenting on particular images without additional effort or stand-alone tools.

To generate these decks, we’ll be employing the “python-pptx” library (imported as “pptx”), whose full manual is available here: https://python-pptx.readthedocs.io/en/latest/

Installation is super simple using pip:

pip install python-pptx

And then we’re ready to go.

The code to produce the above output is available here as a Python script written in jupytext format, for easy usage either via the command line or via a Jupyter notebook. As a use case, we will again be using the epithelium/stroma segmentation example presented here. This tutorial assumes that you have output created with its result generation script, but it can easily be modified for any other set of images.

Let’s take a quick walk through the relevant parts of the code.

First we import the pptx library as well as “Inches”, which will give us precise control over where in the slide we would like to place our images:

    from pptx import Presentation
    from pptx.util import Inches

I like to add a first slide with some experimental data, essentially, name, date, use case, and other notes which may be of interest to put the results into a better context. We place this information near the top of the file so it is easy to find and change as different versions of results are generated.

    from datetime import datetime

    # -- Set meta data which will appear on first slide
    title = "Epi/stroma segmentation"
    date = datetime.today()
    author = "Andrew Janowczyk"
    comments = "data and code taken from blog andrewjanowczyk.com "
    pptxfname = "epistroma_results.pptx"

Next we need to create the presentation object which will hold all of our slides. Since the slides are rectangular to begin with (10 x 7.5 inches in the default template), we modify their shape to be 10 inches square for better presentation. We then add a first slide to our presentation, using a layout that has a title and a bulleted content box:

    prs = Presentation()
    prs.slide_width = Inches(10)
    prs.slide_height = Inches(10)

    # despite the variable name, layout 1 of the default template is "Title and Content"
    blank_slide_layout = prs.slide_layouts[1]
    slide = prs.slides.add_slide(blank_slide_layout)

And then finally place the experimental metadata we discussed before on this new first slide:

    # make first slide with our metadata
    slide.placeholders[0].text = title

    tf = slide.placeholders[1].text_frame
    tf.text = f'Date: {date}\n'
    tf.text += f"Author: {author}\n"
    tf.text += f"Comments: {comments}\n"

Here we can see that we access the placeholders (blank boxes) that typically appear in a new PowerPoint slide, and simply add our text to them. As for the metadata, since this placeholder is a bulleted list, we can add newline characters to force each additional line to appear as a new bullet. Simple as that!

Next, we define two useful helper functions. The first blends two images:

    import numpy as np
    from skimage import color

    # helper function to blend two images
    def blend2Images(img, mask):
        if (img.ndim == 3):
            img = color.rgb2gray(img)
        if (mask.ndim == 3):
            mask = color.rgb2gray(mask)
        img = img[:, :, None] * 1.0  # can't use boolean
        mask = mask[:, :, None] * 1.0
        out = np.concatenate((mask, img, mask), 2) * 255
        return out.astype('uint8')

This creates an overlay of two images by placing the second image in the red and blue channels and the first image in the green channel. When we use this function to overlay the algorithm output (first argument) and the ground truth (second argument), we see white where they agree, pink (magenta) where the pixel is positive only in the ground truth, and green where the pixel is positive only in the algorithm output.
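To make the color coding concrete, here is a small sketch that applies the same channel arrangement to a four-pixel example. The function blend_binary is a hypothetical, simplified stand-in that only handles 2-D binary arrays; it is not the helper from the script.

```python
import numpy as np

def blend_binary(first, second):
    """Simplified stand-in: second image -> red and blue, first image -> green."""
    first = first[:, :, None] * 1.0    # add a channel axis, convert bool to float
    second = second[:, :, None] * 1.0
    out = np.concatenate((second, first, second), axis=2) * 255
    return out.astype("uint8")

# one row of four pixels: both positive, output only, ground truth only, neither
algo_output = np.array([[True, True, False, False]])
ground_truth = np.array([[True, False, True, False]])

overlay = blend_binary(algo_output, ground_truth)
print(overlay[0].tolist())
# [[255, 255, 255], [0, 255, 0], [255, 0, 255], [0, 0, 0]]
```

The four RGB triplets are white (agreement), green (algorithm output only), magenta (ground truth only), and black (both negative).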

The second helper function we define is one which allows us to add images easily to slides:

    from io import BytesIO
    import cv2
    from PIL import Image

    # wrapper function to add an image as a byte stream to a slide
    # note that this is in place of having to save output directly to disk, and can be used in dynamic settings as well
    def addimagetoslide(slide, img, left, top, height, width, resize=.1):
        # since the images are going to be small, we can resize them to prevent the final pptx file from being large for no reason
        res = cv2.resize(img, None, fx=resize, fy=resize, interpolation=cv2.INTER_CUBIC)
        image_stream = BytesIO()
        Image.fromarray(res).save(image_stream, format="PNG")

        # add_picture expects (image, left, top, width, height), so pass the size by keyword
        pic = slide.shapes.add_picture(image_stream, left, top, width=width, height=height)
        image_stream.close()

The easiest way to include images in a PowerPoint is to save the image directly to disk and then provide the filename to the pptx library. This is awkward for images we create on the fly, since each one would first have to be written to disk. This function works around the issue by accepting an image as a 3-channel matrix and creating a PNG-compressed byte stream from it, which can be added to the slide directly.

Note as well that we have the option to resize the image. Usually, our region-of-interest images are quite large (e.g., 2,000 x 2,000), and as such it doesn’t make sense to add them in full size to a smaller PowerPoint slide. As a result, we can resize the image so that it more closely matches the resolution of the PowerPoint slide, drastically reducing our final file size.
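One way to pick the resize value, rather than relying on the default of 0.1, is to work backwards from the size of the box on the slide. The sketch below is an illustration only: the function resize_factor and the 96 pixels-per-inch target are assumptions, not part of the original script.

```python
def resize_factor(image_px, box_inches, dpi=96):
    """Scale factor that shrinks image_px pixels to roughly box_inches at dpi."""
    target_px = box_inches * dpi
    return min(1.0, target_px / image_px)  # cap at 1.0 so we never upscale

# a 2,000 pixel wide image shown in a 5 inch wide quadrant
factor = resize_factor(2000, 5)
print(f"{factor:.2f}")  # 0.24
```

A higher assumed dpi keeps more detail at the cost of a larger file, so the value is worth tuning to how closely collaborators will zoom in.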

Now for each result, we simply create a new slide, derive the file names of the original image and the algorithm output from the mask file name, load the images, and add each one to its respective quadrant of the slide:

    from tqdm import tqdm

    # mask_files is assumed to be a list of ground truth mask paths of the form ./masks/<name>_mask.png
    for mask_fname in tqdm(mask_files):

        # add a new slide for this set of images
        # layout 0 of the default template is the title layout; index 6 is the fully blank one
        blank_slide_layout = prs.slide_layouts[0]
        slide = prs.slides.add_slide(blank_slide_layout)

        # compute the associated filenames that we'll need
        orig_fname = mask_fname.replace("./masks", "./imgs").replace("_mask.png", ".tif")
        output_fname = mask_fname.replace("./masks", "./output").replace("_mask.png", "_class.png")

        #------- orig - load and add to slide
        img = cv2.cvtColor(cv2.imread(orig_fname), cv2.COLOR_BGR2RGB)
        addimagetoslide(slide, img, Inches(0), Inches(0), Inches(5), Inches(5))

        #------ mask - load and add to slide
        mask = cv2.cvtColor(cv2.imread(mask_fname), cv2.COLOR_BGR2RGB)
        addimagetoslide(slide, mask, Inches(5), Inches(0), Inches(5), Inches(5))

        #------ output - load and add to slide
        output = cv2.cvtColor(cv2.imread(output_fname), cv2.COLOR_BGR2RGB)
        addimagetoslide(slide, output, Inches(5), Inches(5), Inches(5), Inches(5))

        #------ Fuse - blend output with mask and add to slide
        addimagetoslide(slide, blend2Images(output, mask), Inches(0), Inches(5), Inches(5), Inches(5))
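The bookkeeping in this loop, namely the file name substitutions and the quadrant positions, can be checked in isolation without loading any images. The following sketch assumes the same directory and suffix conventions as above; related_filenames, QUADRANTS, and the example path are hypothetical names introduced only for illustration.

```python
def related_filenames(mask_fname):
    """Map a ground truth mask path to its original image and algorithm output paths."""
    orig_fname = mask_fname.replace("./masks", "./imgs").replace("_mask.png", ".tif")
    output_fname = mask_fname.replace("./masks", "./output").replace("_mask.png", "_class.png")
    return orig_fname, output_fname

# (left, top) of each 5 x 5 inch panel on the 10 x 10 inch slide, in inches
QUADRANTS = {
    "original": (0, 0),      # top left
    "ground_truth": (5, 0),  # top right
    "overlay": (0, 5),       # bottom left
    "output": (5, 5),        # bottom right
}

orig_fname, output_fname = related_filenames("./masks/example_mask.png")
print(orig_fname)    # ./imgs/example.tif
print(output_fname)  # ./output/example_class.png
```

Adapting the script to another dataset then mostly comes down to changing the substitutions in one place.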

We also likely want to know what image we’re looking at, so we can add this information as metadata on the side of the slide. Further, since we have the images loaded, we may also want to compute various metrics and display those values as well, to help put our results into a quantitative context. In this case, I’ve simply opted for true/false positives/negatives:

        #------ Lastly we can also add some metrics/results/values if we would like
        # here we do simple FP/TP/TN/FN rates, each conditioned on the ground truth (mask)
        txBox = slide.shapes.add_textbox(Inches(10), Inches(0), Inches(4), Inches(4))
        tf = txBox.text_frame
        tf.text = f"{orig_fname}\n"
        tf.text += f"Overall Pixel Agreement: {(output==mask).mean():.4f}\n"
        tf.text += f"True Positive Rate: {(output[mask>0]>0).sum()/(mask>0).sum():.4f}\n"
        tf.text += f"False Positive Rate: {(output[mask==0]>0).sum()/(mask==0).sum():.4f}\n"
        tf.text += f"True Negative Rate: {(output[mask==0]==0).sum()/(mask==0).sum():.4f}\n"
        tf.text += f"False Negative Rate: {(output[mask>0]==0).sum()/(mask>0).sum():.4f}\n"

Again, we see that simply adding a newline character forces each item to appear on a new line in the text box.
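When reporting rates like these, it is worth sanity-checking them on a tiny example where the answer can be counted by hand. The sketch below assumes the standard confusion-matrix definitions, in which each rate is normalized by the ground truth counts; the function name rates and the toy arrays are made up for illustration.

```python
import numpy as np

def rates(gt, pred):
    """Confusion-matrix rates, normalized by ground truth positives/negatives."""
    gt = np.asarray(gt) > 0
    pred = np.asarray(pred) > 0
    tp = np.sum(gt & pred)    # positive in both
    fn = np.sum(gt & ~pred)   # missed by the algorithm
    fp = np.sum(~gt & pred)   # wrongly flagged by the algorithm
    tn = np.sum(~gt & ~pred)  # negative in both
    # note: divides by zero if the ground truth has no positives or no negatives
    return {
        "TPR": tp / (tp + fn),
        "FNR": fn / (tp + fn),
        "FPR": fp / (fp + tn),
        "TNR": tn / (fp + tn),
    }

gt = [1, 1, 1, 0, 0, 0, 0, 0]    # 3 positives, 5 negatives
pred = [1, 1, 0, 1, 0, 0, 0, 0]  # 2 hits, 1 miss, 1 false alarm

result = rates(gt, pred)
print({name: round(float(value), 4) for name, value in result.items()})
```

Here the true positive rate is 2/3 and the false positive rate is 1/5, and each pair (TPR with FNR, FPR with TNR) sums to one, which is a quick check that the denominators are right.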

Lastly, and most importantly, we need to save our presentation to disk:

    prs.save(pptxfname)

And that’s it!

Compressing the presentation

Note that in this case, upon saving, the file size is a whopping 28MB, likely too large to send via email. As stated in the previous post, we can compress this significantly within PowerPoint or any other Office product (e.g., MS Word). Instructions are reproduced here for convenience:

PowerPoint can compress images to various sizes. To do so, click on an image, click the Format menu option, and then choose Compress Pictures:

From there, select the appropriate compression options. It usually makes sense to de-select “apply only to this picture” and to select the “E-mail” option to really shrink things down for sharing.

Afterward, our file goes from 28MB to 5MB (an 82% reduction) without noticeably affecting the presentation of the images.

You can find the associated source code here and an example of the PowerPoint presentation produced here.
