Extract Annotations From ImageViewer Bigtiff xml into Matlab

Previously we looked at extracting annotations from Aperio SVS files, but there are other image formats and annotation tools. Another commonly used tool in digital histology is ImageViewer, which can display multi-page BigTIFF image files.

In this case, we’ll assume that each annotated region of interest (ROI) is outlined with a black rectangle. We had a pathologist annotate lymphocytes (blue), stroma (green) and tumor (red):

[Figure: the annotated image in ImageViewer]

First, we’ll load the XML file and parse it, as before:

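    % the annotation XML is expected next to the image, with the same base name
    xml_file = strrep(bigtiff_file, '.tif', '.xml');
    xDoc = xmlread(xml_file);
    mkdir('subs'); % output directory for the extracted ROIs and masks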
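To make the parsing steps easier to follow, here is a minimal sketch that builds a toy XML file and reads it back with xmlread. The toy file contains only the pieces the code in this post actually reads (Annotation tags with a LineColor attribute, and Vertex tags with X and Y attributes). The root tag name, the coordinates and the scratch file name are assumptions made for illustration; a real ImageViewer file will contain more than this.

```matlab
% Toy XML with only the tags and attributes used in this post.
% The <Annotations> root and all coordinate values are made up.
xml_text = ['<Annotations>' ...
    '<Annotation LineColor="0">' ...
    '<Vertex X="100" Y="200"/><Vertex X="400" Y="600"/>' ...
    '</Annotation>' ...
    '<Annotation LineColor="255">' ...
    '<Vertex X="150" Y="250"/><Vertex X="180" Y="250"/><Vertex X="165" Y="300"/>' ...
    '</Annotation>' ...
    '</Annotations>'];

toy_file = [tempname '.xml'];   % hypothetical scratch file
fid = fopen(toy_file, 'w');
fprintf(fid, '%s', xml_text);
fclose(fid);

toyDoc = xmlread(toy_file);
toy_annotations = toyDoc.getElementsByTagName('Annotation');
n_annotations = toy_annotations.getLength;   % DOM lists are 0-indexed below

% LineColor of the first annotation (the black rectangle in this toy file)
first_color = str2double(char(toy_annotations.item(0).getAttribute('LineColor')));

% number of vertices in the second annotation (a triangle in this toy file)
n_vertices = toy_annotations.item(1).getElementsByTagName('Vertex').getLength;

delete(toy_file);
```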

Then we’ll go through all of the annotations and look for the ones with a black line color (a LineColor of 0). These are the ROI rectangles, and each one is stored as an upper-left and a lower-right corner.

    % find all rectangular ROIs and store them in a struct array:
    % Rois(k).lxlyrxry = [ulx uly lrx lry]
    Rois = [];
    Regions = xDoc.getElementsByTagName('Annotation'); % get a list of all the annotation tags
    for regioni = 0:Regions.getLength-1 % DOM node lists are 0-indexed
        Region = Regions.item(regioni);
        if (str2double(Region.getAttribute('LineColor')) == 0) % black line: ROI
            % get a list of all the vertices (which are in order)
            vertices = Region.getElementsByTagName('Vertex');
            ulx = str2double(vertices.item(0).getAttribute('X'));
            uly = str2double(vertices.item(0).getAttribute('Y')); % upper left

            lrx = str2double(vertices.item(1).getAttribute('X'));
            lry = str2double(vertices.item(1).getAttribute('Y')); % lower right

            Rois(end+1).lxlyrxry = [ulx uly lrx lry];
        end
    end
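The loop above takes the first vertex as the upper-left corner and the second as the lower-right one. Whether ImageViewer always writes the corners in that order is not something this post establishes. If it does not, a defensive option is to sort the two corners with min and max before storing them. The sketch below uses made-up corner values and is not part of the original code.

```matlab
% Two opposite corners of a rectangle in arbitrary order (toy values, [x y]).
corner_a = [400 200];
corner_b = [100 600];

% Smallest x/y is the upper-left corner, largest x/y the lower-right one
% (image coordinates: y grows downwards).
ulx = min(corner_a(1), corner_b(1));
uly = min(corner_a(2), corner_b(2));
lrx = max(corner_a(1), corner_b(1));
lry = max(corner_a(2), corner_b(2));

lxlyrxry = [ulx uly lrx lry];   % same layout as Rois(k).lxlyrxry
```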

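Now that we know where all of the ROIs are, we can iterate through the remaining annotations and determine which ROIs each one belongs to. An annotation is assigned to an ROI if at least one of its vertices falls inside that ROI's rectangle.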

    num_roi = length(Rois);
    if (isempty(Rois))
        return
    end
    % loop through all remaining annotations; if any vertex of an annotation
    % falls inside an ROI, append it to Rois(..).(c<linecolor>){end+1}

    Regions = xDoc.getElementsByTagName('Annotation'); % get a list of all the annotation tags
    for regioni = 0:Regions.getLength-1
        Region = Regions.item(regioni);
        linecolor = str2double(Region.getAttribute('LineColor'));
        if (linecolor ~= 0) % not an ROI
            % get a list of all the vertices (which are in order)
            vertices = Region.getElementsByTagName('Vertex');
            xy = zeros(vertices.getLength, 2); % allocate one row per vertex
            for vertexi = 0:vertices.getLength-1 % iterate through all vertices

                % get the x value of that vertex
                x = str2double(vertices.item(vertexi).getAttribute('X'));

                % get the y value of that vertex
                y = str2double(vertices.item(vertexi).getAttribute('Y'));
                xy(vertexi+1,:) = [x, y]; % finally save them into the array
            end

            % find which ROI(s) it belongs to
            field = sprintf('c%d', linecolor);
            for roii = 1:num_roi
                if (any((xy(:,1) > Rois(roii).lxlyrxry(1)) & ...
                        (xy(:,1) < Rois(roii).lxlyrxry(3)) & ...
                        (xy(:,2) > Rois(roii).lxlyrxry(2)) & ...
                        (xy(:,2) < Rois(roii).lxlyrxry(4))))
                    % found
                    if (~isfield(Rois, field))
                        Rois(roii).(field) = {};
                    end
                    Rois(roii).(field){end+1} = xy;
                end
            end
        end
    end
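The membership test is easier to check in isolation. Here is a small sketch with made-up coordinates: an annotation counts as belonging to an ROI as soon as one of its vertices lies strictly inside the rectangle. The helper name in_roi and all values are hypothetical.

```matlab
roi = [100 200 400 600];   % [ulx uly lrx lry], toy values

% one [x y] vertex per row
xy_inside  = [150 250; 180 250; 165 300];   % fully inside the ROI
xy_partial = [390 590; 500 700; 450 650];   % only the first vertex is inside
xy_outside = [900 900; 950 900; 925 950];   % fully outside the ROI

% true if at least one vertex lies strictly inside the rectangle
in_roi = @(xy, r) any(xy(:,1) > r(1) & xy(:,1) < r(3) & ...
                      xy(:,2) > r(2) & xy(:,2) < r(4));

hit_inside  = in_roi(xy_inside, roi);    % true
hit_partial = in_roi(xy_partial, roi);   % true, one vertex is enough
hit_outside = in_roi(xy_outside, roi);   % false
```

Note that an annotation straddling the ROI border is still assigned to the ROI, and the part that falls outside is simply not rasterized later.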

Finally, we iterate through all ROIs, extract them from the base level of the BigTIFF, and create a separate binary mask for each annotation color: in this case, blue for lymphocytes, green for stroma and red for tumor.

    % for each ROI: extract the image, save it, subtract the ROI corner from
    % all annotation points, and make a single mask for each color

    color_fields = fieldnames(Rois(1));
    % keep only the c<linecolor> fields (drops lxlyrxry)
    color_fields(~cellfun(@(x) x(1)=='c', color_fields)) = [];

    for roii = 1:length(Rois)

        Rows = [Rois(roii).lxlyrxry(2) Rois(roii).lxlyrxry(4)];
        Cols = [Rois(roii).lxlyrxry(1) Rois(roii).lxlyrxry(3)];

        % read only the ROI pixels from page 3 of the BigTIFF
        io = imread(bigtiff_file, 'Index', 3, 'PixelRegion', {Rows, Cols});
        [nrow, ncol, ndim] = size(io);
        imwrite(io, sprintf('subs/%s_%d_%d.tif', bigtiff_file(1:end-4), ...
            Rois(roii).lxlyrxry(2), Rois(roii).lxlyrxry(1)));

        for colors = 1:length(color_fields)
            annotations = Rois(roii).(color_fields{colors});

            if (isempty(annotations))
                continue
            end

            mask = zeros(nrow, ncol);
            for ai = 1:length(annotations)
                % rasterize the polygon in ROI-local coordinates and add it to the mask
                mask = mask + poly2mask(annotations{ai}(:,1) - Rois(roii).lxlyrxry(1), ...
                    annotations{ai}(:,2) - Rois(roii).lxlyrxry(2), nrow, ncol);
            end

            imwrite(mask, sprintf('subs/%s_%d_%d_%s.png', bigtiff_file(1:end-4), ...
                Rois(roii).lxlyrxry(2), Rois(roii).lxlyrxry(1), color_fields{colors}));
        end

    end
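The mask-building step can be tried on its own with toy polygons. The sketch below shifts two overlapping squares into ROI-local coordinates and rasterizes them with poly2mask (Image Processing Toolbox). One difference from the code above: it combines the polygons with a logical OR rather than a sum, so overlapping regions stay at 1 instead of adding up. That is an alternative shown for illustration, not the original approach, and all values are made up.

```matlab
% Requires the Image Processing Toolbox (poly2mask).
roi_corner = [100 200];   % [ulx uly] of the ROI, toy values
nrow = 50;                % ROI height in pixels
ncol = 60;                % ROI width in pixels

% Two overlapping toy polygons in whole-image coordinates, one [x y] per row.
polys = {[110 210; 130 210; 130 230; 110 230], ...
         [120 220; 150 220; 150 240; 120 240]};

mask = false(nrow, ncol);
for k = 1:numel(polys)
    % shift the polygon so the ROI corner becomes the origin
    local_x = polys{k}(:,1) - roi_corner(1);
    local_y = polys{k}(:,2) - roi_corner(2);
    % OR keeps the mask binary where polygons overlap
    mask = mask | poly2mask(local_x, local_y, nrow, ncol);
end
```

A logical mask written with imwrite produces a 1-bit PNG, whereas the double mask used above is written as 8-bit, so downstream code should expect whichever type it was written with.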

In the end, we’ve created three masks, one per annotation color, which can then be used further down the pipeline:

PT 1_201501052111_12253_74605_c16711680
PT 1_201501052111_12253_74605_c65280
PT 1_201501052111_12253_74605_c255
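The numeric suffix in each mask name is the raw LineColor value from the XML. As a rough sketch, each value can be split into three bytes to see which color channel it encodes. Which byte corresponds to red and which to blue depends on the byte order ImageViewer uses, which is not confirmed here, so the bytes are labeled only by position.

```matlab
line_colors = [16711680 65280 255];   % LineColor values from the mask names

% one row per color, columns are the [high middle low] bytes
channels = zeros(numel(line_colors), 3);
for k = 1:numel(line_colors)
    c = line_colors(k);
    channels(k, :) = [bitand(bitshift(c, -16), 255), ...
                      bitand(bitshift(c, -8), 255), ...
                      bitand(c, 255)];
end
% channels is [255 0 0; 0 255 0; 0 0 255]
```

Each of the three values sets exactly one byte to 255, which matches the three pure annotation colors used by the pathologist.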

Source available here

4 thoughts on “Extract Annotations From ImageViewer Bigtiff xml into Matlab”

    1. there is really nothing matlab “specific” about this code or approach, in the sense that using available python open source libraries it could easily be re-implemented, but i don’t believe i have that code implemented at the moment. although these days i’m using almost entirely python workflows, this is one of the components that we keep matlab around for since the code is already built and debugged and only needs to be used once per dataset, it hasn’t warranted the effort in porting it over yet. if you manage to do it, i’d be very interested in the result!
