Consensus Contouring Inter-Observer Variability in Image Segmentation

Inter-Observer Variability in Image Segmentation: Prostate MR Images of 15 Patients + 5 Segments/Image

This dataset was introduced in the following paper:

Fast Barcode Retrieval for Consensus Contouring
H.R. Tizhoosh, G.J. Czarnota, arXiv:1709.10197v1 [cs.CV], 28 Sep 2017

The MR images used in this study were derived from an online database. The database contains T2-weighted MR volume datasets provided by Brigham and Women’s Hospital, the National Center for Image-guided Therapy, and Harvard Medical School. The images are T2-weighted MR images (T2W-MR) acquired with endorectal coils. The pulse-sequence groups in the DICOM headers of most of the T2-weighted images were marked fast-spin echo (FSE), although some were marked fast-relaxation fast-spin echo-accelerated (FRFSE-XL). Slice thickness ranges from 2.5 mm to 4.0 mm, and contrast levels and signal-to-noise characteristics vary across the dataset. All images were captured at a depth of 16 bits and range in size from 256×256 to 512×512 pixels.

We randomly selected 15 patients (out of more than 100) with a total of 558 slices, of which 145 slices were contoured by all 5 oncologists, resulting in a total of 725 segments (145 slices × 5 segments per slice). All DICOM images and their manual segmentations were provided by Segasist Technologies, Waterloo, ON, Canada. As in the validation using simulated images, we first ran STAPLE on all user segments to generate a consensus for each slice. (Note that this is a “regular” consensus, insofar as all experts were available to mark the same image.) Once the regular consensus is available, we can measure the agreement of each user with it. This agreement indicates the extent to which each user contributed to the consensus for that image.
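The per-slice procedure described above (build a consensus from the five segments, then score each user against it) can be illustrated with a minimal sketch. Two things here are assumptions and not taken from the dataset description: the consensus is a simple per-pixel majority vote, used only as a stand-in for STAPLE (it is not STAPLE), and agreement is measured with the Dice coefficient, since the text above does not name the agreement measure. The function names and the toy masks are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = inside the contour, 0 = background


def majority_vote(masks: List[Mask]) -> Mask:
    """Per-pixel majority vote. ASSUMPTION: a simple stand-in for STAPLE."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if 2 * sum(m[r][c] for m in masks) > n else 0 for c in range(cols)]
        for r in range(rows)
    ]


def dice(a: Mask, b: Mask) -> float:
    """Dice overlap 2|A and B| / (|A| + |B|); 1.0 when both masks are empty."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    total = sum(map(sum, a)) + sum(map(sum, b))
    return 1.0 if total == 0 else 2.0 * inter / total


def agreement_with_consensus(masks: List[Mask]) -> List[float]:
    """Score every user's segment against the consensus of all segments."""
    consensus = majority_vote(masks)
    return [dice(m, consensus) for m in masks]


# Toy slice: five hypothetical user segments on a 2x3 pixel grid.
user_masks = [
    [[1, 1, 0], [0, 0, 0]],
    [[1, 1, 0], [0, 0, 0]],
    [[1, 1, 1], [0, 0, 0]],
    [[1, 0, 0], [0, 0, 0]],
    [[1, 1, 0], [1, 0, 0]],
]
consensus = majority_vote(user_masks)
scores = agreement_with_consensus(user_masks)
```

On this toy slice, the first two users match the consensus exactly and score 1.0, while the other three score lower because they over- or under-segment relative to it. For the real data, the masks would first be rasterized from the five manual contours of each of the 145 slices, and the majority vote would be replaced by an actual STAPLE implementation.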
