Barcodes for Medical Image Retrieval using Autoencoded Radon Transform

Dec 2016

The paper discusses the problem of content-based image retrieval (CBIR), which involves finding similar images in a database based on the content of a query image. The traditional approach involves using descriptive texts or keywords, but the paper focuses on searching within the image itself. The authors propose using Radon barcodes, which are binary descriptors generated from Radon projections, as a compact and efficient representation of image content. However, they identify a limitation in the binarization process of Radon projections, which leads to information loss and retrieval errors.

To address this issue, the authors propose training autoencoders to generate Radon barcodes. Autoencoders are neural network models that can learn to compress and reconstruct data. By binarizing the compressed version of Radon projections using autoencoders, they aim to reduce information loss and improve retrieval accuracy. Since Radon projections from neighboring angles are highly redundant, the proposed approach also serves as a neural approach to redundancy reduction. The paper proceeds with a brief review of relevant literature in Section 2, highlighting the shift towards binary descriptors in CBIR methods. In Section 3, the authors describe the proposed method of using autoencoders for generating Radon barcodes. They discuss the dataset, error measurements, settings of the autoencoders, and present the experimental results in Section 4.

Section 2 of the paper provides a brief review of the relevant literature on image retrieval, autoencoders, and Radon barcodes.

Image Retrieval: The authors discuss the shift towards binary descriptors in recent annotation or tagging methods for image retrieval. They mention BRIEF, a binary feature point descriptor proposed by Calonder et al. [2], and ORB, a rotation-invariant and noise-resistant binary descriptor based on BRIEF, proposed by Rublee et al. [14]. They also mention BRISK, a binary method for image key point detection and matching, introduced by Leutenegger et al. [12], which demonstrates good performance at low computational cost. The authors highlight the introduction of Radon barcodes based on the Radon transform for medical image retrieval on the IRMA dataset [16]. They also note the increasing use of autoencoders for image retrieval tasks [5, 3], mentioning the successful application of deep convolutional neural networks by Krizhevsky et al. in the ImageNet LSVRC-2010 challenge [7].

Fig. 1. All projections (P1, P2, P3, P4) generated by the Radon transform are thresholded to generate code fragments C1, C2, C3, C4, resulting in a barcode [C1 C2 C3 C4] (source: [16]).

Autoencoders: The paper explains that autoencoders are artificial neural networks trained to encode an input into a representation from which the input can be reconstructed. The input x is transformed into a hidden representation y by a deterministic mapping y = s(Wx + b), where s is a non-linear function (e.g., the sigmoid), W is a weight matrix, and b is an offset vector. The original input is then reconstructed by mapping the hidden representation back with a second mapping of the same form. The reconstruction is not an exact replica of the input but rather represents the parameters of a distribution that can generate the input with high probability. The paper uses the squared error between input and reconstruction as the quantity to minimize.
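The encode and decode mappings described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the layer sizes, the random weights, and the names `encode`, `decode`, and `squared_error` are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(a):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-a))

def encode(x, W, b):
    """Hidden representation y = s(W x + b)."""
    return sigmoid(W @ x + b)

def decode(y, W_out, b_out):
    """Reconstruction z = s(W' y + b') of the original input."""
    return sigmoid(W_out @ y + b_out)

def squared_error(x, z):
    """Squared reconstruction error between input x and reconstruction z."""
    return float(np.sum((x - z) ** 2))

# Toy example: 8 inputs compressed to 4 hidden units (sizes are illustrative).
rng = np.random.default_rng(0)
x = rng.random(8)
W, b = rng.normal(size=(4, 8)), rng.normal(size=4)
W_out, b_out = rng.normal(size=(8, 4)), rng.normal(size=8)

y = encode(x, W, b)
z = decode(y, W_out, b_out)
err = squared_error(x, z)
```

With untrained random weights the error is large; training adjusts the weights and offsets to reduce it.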

Radon Barcodes: The authors describe the concept of Radon barcodes (RBC), which involves applying the Radon transform to an image and binarizing the resulting projections. The Radon transform projects the image along various projection angles, generating a new image R(ρ,θ) where ρ = x cos θ + y sin θ. Using the Dirac delta function δ, the transform of an image f(x,y) can be written as R(ρ,θ) = ∫∫ f(x,y) δ(ρ − x cos θ − y sin θ) dx dy. The paper mentions the use of a local threshold for each projection angle to generate a barcode of thresholded projections. A typical value, such as the median of the non-zero values of each projection, is proposed as the threshold. The paper presents Algorithm 1 to describe the generation of Radon barcodes. The authors highlight that their work differs from the original work on Radon barcodes [16] by using the compressing capability of autoencoders to reduce redundancies in Radon projections, leading to improved retrieval results.
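As a rough illustration of barcode generation with local median thresholds, the sketch below approximates each projection by binning pixels according to ρ = x cos θ + y sin θ. It is not the paper's Algorithm 1: the nearest-bin Radon approximation, the `>=` comparison, and the function names are assumptions, and a production implementation would normally use an interpolating Radon transform from an image-processing library, so barcode lengths will differ from those in the paper.

```python
import numpy as np

def radon_projections(image, n_angles):
    """Approximate Radon projections by summing pixels into bins of
    rho = x*cos(theta) + y*sin(theta), measured from the image centre."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs = xs - (w - 1) / 2.0
    ys = ys - (h - 1) / 2.0
    offset = int(np.ceil(np.hypot(h, w) / 2.0))  # shifts rho to non-negative bins
    n_bins = 2 * offset + 1
    projections = []
    for theta in np.arange(n_angles) * np.pi / n_angles:
        rho = xs * np.cos(theta) + ys * np.sin(theta)
        idx = np.rint(rho).astype(int) + offset
        proj = np.bincount(idx.ravel(), weights=image.ravel(), minlength=n_bins)
        projections.append(proj)
    return np.array(projections)

def radon_barcode(image, n_angles=4):
    """Threshold each projection at the median of its non-zero values
    and concatenate the resulting bit fragments into one barcode."""
    fragments = []
    for proj in radon_projections(image, n_angles):
        nonzero = proj[proj > 0]
        if nonzero.size == 0:
            fragments.append(np.zeros(proj.size, dtype=np.uint8))
            continue
        threshold = np.median(nonzero)
        fragments.append((proj >= threshold).astype(np.uint8))
    return np.concatenate(fragments)

# Toy 32x32 image with a bright rectangle.
image = np.zeros((32, 32))
image[8:24, 12:20] = 1.0
projections = radon_projections(image, n_angles=4)
barcode = radon_barcode(image, n_angles=4)
```

Each projection here preserves the total image intensity, which is a quick sanity check on the binning step.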

The Proposed Method: In this section, the paper explains the preprocessing and autoencoding steps involved in generating the autoencoded Radon barcodes.

Pre-Processing of Images: The images in the training dataset are preprocessed by resizing them to fixed dimensions, such as 32×32, to ensure inputs of the same length for the autoencoder. The Radon transform is applied with a selected number of projection angles (nθ) to extract the Radon features or projections from each image. The Radon features are normalized to a range between 0 and 1 by dividing each element in the feature vector by the maximum value of that projection angle. The normalized Radon features are then ready to be fed into the autoencoder.
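The per-angle normalization step can be sketched as follows. The resizing step is omitted, the function name `normalize_projections` is hypothetical, and the guard for an all-zero projection is an assumption added so the example never divides by zero.

```python
import numpy as np

def normalize_projections(projections):
    """Scale each projection (one row per angle) to [0, 1] by dividing by
    that projection's maximum, then flatten into one feature vector."""
    projections = np.asarray(projections, dtype=float)
    max_per_angle = projections.max(axis=1, keepdims=True)
    # Assumption: leave all-zero projections unchanged instead of dividing by 0.
    safe_max = np.where(max_per_angle > 0, max_per_angle, 1.0)
    return (projections / safe_max).ravel()

# Two toy projections with different value ranges.
projections = np.array([[0.0, 2.0, 4.0],
                        [1.0, 5.0, 10.0]])
features = normalize_projections(projections)
```

Because each angle is scaled independently, projections with very different intensity ranges contribute comparably to the autoencoder input.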

Autoencoded Radon Barcodes: Once the autoencoder is trained, barcode generation begins. Each normalized Radon feature vector is fed to the autoencoder, and the output of each hidden layer (the sigmoid value of each neuron) is examined. A value greater than 0.5 becomes a “1” in the barcode, and a value less than 0.5 becomes a “0”. A barcode can be generated for each hidden layer of the autoencoder. The paper includes Figure 2, which illustrates the proposed method, and Figure 3, which displays sample images from the IRMA dataset along with their Radon barcodes (with local thresholding) and autoencoded Radon barcodes.
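The thresholding of hidden-layer activations can be sketched as below, assuming the trained hidden layers are available as a list of (W, b) pairs. The function name `autoencoded_barcodes` and the toy weights are illustrative, and mapping a value of exactly 0.5 to "0" is an assumption, since the text only covers the strict cases.

```python
import numpy as np

def sigmoid(a):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-a))

def autoencoded_barcodes(x, hidden_layers):
    """Propagate x through the hidden layers and binarize each layer's
    sigmoid activations at 0.5, giving one barcode per hidden layer."""
    barcodes = []
    activation = np.asarray(x, dtype=float)
    for W, b in hidden_layers:
        activation = sigmoid(W @ activation + b)
        barcodes.append((activation > 0.5).astype(np.uint8))
    return barcodes

# Hand-checkable case: identity weights and zero offsets.
simple = autoencoded_barcodes([1.0, -1.0], [(np.eye(2), np.zeros(2))])

# Two hidden layers (8 -> 4 -> 2) with random stand-in weights.
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(4, 8)), rng.normal(size=4)),
          (rng.normal(size=(2, 4)), rng.normal(size=2))]
codes = autoencoded_barcodes(rng.random(8), layers)
```

In the hand-checkable case, sigmoid(1) is above 0.5 and sigmoid(-1) is below it, so the resulting barcode is [1, 0].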

Autoencoder Setting: The authors use a traditional autoencoder rather than an optimized one; as a result, it trains slowly and does not provide the most accurate results compared to more advanced autoencoders. They adopt mini-batch stochastic gradient descent, a standard learning algorithm for neural networks, and do not implement any regularization techniques. The weights of the autoencoder are initialized using a Gaussian distribution, and the biases are initialized with a Gaussian distribution with mean 0 and standard deviation 1. The mean squared error is used as the cost function. Other parameters, such as learning rate, mini-batch size, and training epochs, are empirically set and described in the following section.

Fig. 2. Schematic illustration of the proposed method. The Radon projections (P1, P2, ...) of the preprocessed image are autoencoded. For n hidden layers, n barcodes can be generated.

IRMA Dataset: The Image Retrieval in Medical Applications (IRMA) database, containing over 14,000 x-ray images, is described. The images are classified into 193 categories based on the IRMA code, a string of 13 characters representing imaging modality, body orientations, body region, and biological system examined. The dataset consists of 12,677 training images and 1,733 testing images. Due to imbalanced class distribution, IRMA x-ray images pose a challenging benchmark.

Error Measurements: The paper introduces the evaluation scheme for computing the difference between the IRMA codes of testing images and the first retrieved hit using the proposed approach. The total error for all test images is calculated based on the structure and characters of the IRMA code.
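Finding the "first retrieved hit" requires a nearest-neighbour search over the stored barcodes. The sketch below assumes Hamming distance as the similarity measure, which is a natural choice for binary codes but is not stated in this summary. The function name `first_hit` and the toy codes are hypothetical, and the IRMA code comparison itself is not reproduced here.

```python
import numpy as np

def first_hit(query_code, database_codes):
    """Return the index of the database barcode closest to the query in
    Hamming distance, together with all distances."""
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    return int(np.argmin(distances)), distances

# Three stored 4-bit barcodes and one query barcode.
database_codes = np.array([[0, 1, 1, 0],
                           [1, 1, 1, 1],
                           [0, 0, 0, 0]], dtype=np.uint8)
query_code = np.array([1, 1, 0, 1], dtype=np.uint8)

best_index, distances = first_hit(query_code, database_codes)
```

The IRMA code of the image at `best_index` would then be compared with the IRMA code of the query image to obtain the error for that query.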

Parameters for Autoencoder: The autoencoder is trained with the mean squared error as a quadratic cost function. Stochastic gradient descent is employed with a mini-batch size of 10 images over 300 epochs. The learning rate is set to 0.5, and no regularization techniques are used. Both a shallow architecture (a single hidden layer) and a deep architecture (multiple hidden layers) are tested; each hidden layer reduces or increases the number of neurons by a factor of 2 relative to the preceding layer.
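A minimal single-hidden-layer training loop with these settings might look as follows. This is a sketch under stated assumptions, not the authors' implementation: the weight standard deviation of 1/sqrt(fan-in) is an assumption (the text only says Gaussian), the data are random stand-ins for normalized Radon features, and the demonstration runs 30 epochs rather than 300 to keep it fast.

```python
import numpy as np

def sigmoid(a):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-a))

def train_autoencoder(X, n_hidden, epochs=300, batch_size=10, lr=0.5, seed=0):
    """Mini-batch SGD on a quadratic cost, without regularization."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    # Assumed weight scale; biases use mean 0 and standard deviation 1.
    W1 = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_hidden))
    b1 = rng.normal(0.0, 1.0, size=n_hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(n_hidden), size=(n_hidden, n_in))
    b2 = rng.normal(0.0, 1.0, size=n_in)

    def mse():
        recon = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
        return float(np.mean((recon - X) ** 2))

    history = [mse()]  # error before any training
    for _ in range(epochs):
        order = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):
            batch = X[order[start:start + batch_size]]
            H = sigmoid(batch @ W1 + b1)   # hidden activations
            Z = sigmoid(H @ W2 + b2)       # reconstructions
            # Backpropagate the cost 0.5 * ||Z - batch||^2, averaged over the batch.
            dZ = (Z - batch) * Z * (1.0 - Z)
            dH = (dZ @ W2.T) * H * (1.0 - H)
            m = len(batch)
            W2 -= lr * (H.T @ dZ) / m
            b2 -= lr * dZ.mean(axis=0)
            W1 -= lr * (batch.T @ dH) / m
            b1 -= lr * dH.mean(axis=0)
        history.append(mse())
    return (W1, b1, W2, b2), history

# Random stand-in data: 100 feature vectors of length 16, hidden layer of 8.
X = np.random.default_rng(1).random((100, 16))
params, history = train_autoencoder(X, n_hidden=8, epochs=30)
```

Here the hidden layer halves the input length from 16 to 8, matching the factor-of-2 rule described above; `history` records the reconstruction error before training and after each epoch.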

Performance: The proposed method is evaluated by generating both Radon barcodes and autoencoded Radon barcodes for comparison. Error scores for both are computed with the IRMA error measure. The experiments cover different normalized image sizes and numbers of projection angles. The results show that autoencoded Radon barcodes achieve better performance than regular Radon barcodes. Increasing the image size and the number of projections generally improves the results, although it leads to longer barcodes.

Comparison with SURF and BRISK: The performance of the proposed approach is compared with the SURF and BRISK methods. Locality-sensitive hashing (LSH) is used for feature hashing and retrieval. The results demonstrate that both SURF and BRISK have limitations in extracting descriptors for some images, with BRISK having a high error rate. Although SURF performs better than Radon barcodes, it has non-binary descriptors that require higher storage. Autoencoded Radon barcodes exhibit the lowest error rate and lower storage requirements.

The paper provides detailed analysis and experimental results to support the effectiveness of the proposed approach for medical image retrieval compared to existing methods.

Fig. 3. Visual comparison of Radon barcode (top) and autoencoded Radon barcode (bottom) for sample x-ray images from the IRMA dataset. Images are normalized to 32×32 with 16 projections.

The paper discusses the effectiveness of binary descriptors for image retrieval in the context of big image data. Radon barcodes are introduced as a promising framework for generating content-based binary features, particularly suitable for tagging medical images. The paper focuses on investigating different aspects of Radon barcodes, including the method for binarizing Radon projections, which significantly affects the descriptiveness of the barcode. In this study, the authors propose the use of autoencoded Radon barcodes. By employing an autoencoder with either one or three hidden layers, Radon barcodes are generated by thresholding compressed projections obtained from the output of the hidden layers. The researchers conducted experiments using the IRMA dataset, which consists of 14,410 x-ray images, to evaluate the performance of the proposed approach. The experimental results demonstrate that thresholding via autoencoders outperforms local thresholding based on the median of non-zero projection values. Furthermore, the findings indicate that deeper autoencoders produce barcodes with better performance compared to shallow networks. The paper highlights the potential of autoencoded Radon barcodes as an effective method for image tagging and retrieval, particularly in the domain of medical images.
