Lymph Node Detection in T2 MRI with Transformers

Identification of lymph nodes (LN) in T2 Magnetic Resonance Imaging (MRI) is an important step performed by radiologists during the assessment of lymphoproliferative diseases. The size of the nodes play a crucial role in their staging, and radiologists sometimes use an additional contrast sequence such as diffusion weighted imaging (DWI) for confirmation. However, lymph nodes have diverse appearances in T2 MRI scans, making it tough to stage for metastasis. Furthermore, radiologists often miss smaller metastatic lymph nodes over the course of a busy day. To deal with these issues, we propose to use the DEtection TRansformer (DETR) network to localize suspicious metastatic lymph nodes for staging in challenging T2 MRI scans acquired by different scanners and exam protocols. False positives (FP) were reduced through a bounding box fusion technique, and a precision of 65.41% and sensitivity of 91.66% at 4 FP per image was achieved. To the best of our knowledge, our results improve upon the current state-of-the-art for lymph node detection in T2 MRI scans.


Universal Lymph Node Detection in T2 MRI using Neural Networks

Purpose: Identification of abdominal Lymph Nodes (LN) that are suspiciou...

Splenomegaly Segmentation using Global Convolutional Kernels and Conditional Generative Adversarial Networks

Spleen volume estimation using automated image segmentation technique ma...

Deep Predictive Motion Tracking in Magnetic Resonance Imaging: Application to Fetal Imaging

Fetal magnetic resonance imaging (MRI) is challenged by uncontrollable, ...

Experimenting with Knowledge Distillation techniques for performing Brain Tumor Segmentation

Multi-modal magnetic resonance imaging (MRI) is a crucial method for ana...

Automatic detection of acute ischemic stroke using non-contrast computed tomography and two-stage deep learning model

Background and Purpose: We aimed to develop and evaluate an automatic ac...

Autotuning Plug-and-Play Algorithms for MRI

For magnetic resonance imaging (MRI), recently proposed "plug-and-play" ...

Active Selection of Classification Features

Some data analysis applications comprise datasets, where explanatory var...

1 Introduction

Lymph nodes (LN) are a part of the lymphatic system that helps the body fight infection by removing foreign substances. Enlarged and metastatic LN require accurate identification especially when they are present at sites that do not correspond to the first site of lymphatic spread, as it indicates distant metastasis [1]. These enlarged LN can be sized according to the AJCC guidelines [1], which cover the management of cancer and lymphoproliferative diseases. LN are usually imaged with multi-parametric MRI, e.g. T2 and Diffusion Weighted Imaging (DWI), but the type of scanners used and exam protocols preferred span the gamut across different institutions. Moreover, their diverse appearance, irregular shapes, and varying anatomical location make it difficult to stage them. Nodal size is usually measured with two orthogonal lines: the long and short axis diameters (LAD and SAD), and both are usually required for charting the course of therapy; however, these guidelines also vary by institution. Due to these imaging and workflow issues, there is a need for automated LN detection in T2 MRI images for sizing.

There is limited prior on the LN detection task in MRI scans [15, 4] whereas the majority of LN detection research has been focused towards CT scans [8, 11, 10]. Typically, T2 MRI and DWI are preferred in clinical practice for LN staging, but for algorithmic development, registration of DWI to T2 scans is needed [15]. In this work, we focus only on identifying LN in challenging T2 MRI scans acquired with different scanners and exam protocols, quantify the detection performance of state-of-the-art one-stage detection networks [13, 6, 14], and surpass them with the recently published DEtection TRansformer (DETR) network. Our intent is centered on the belief that the reliable detection of LN in T2 MRI scans can be supplemented by DWI scans later. We also boost the detection performance with bounding box fusion techniques [12] to decrease the false positive rate. Our results exceed the capabilities of the previous lymph node detection approaches with a precision of 65.41% and sensitivity of 91.66% at 4 FP per image.

2 Methods

Anchor-Free One-Stage Detectors. We quantified the performance of state-of-the-art one-stage anchor-free object detectors on the LN detection task in T2 MRI: 1) FCOS [13], 2) FoveaBox [6], and 3) VFNet [14]

. These detectors are superior to anchor-based detectors (e.g. RetinaNet) and two-stage detectors (e.g Faster RCNN) because they skip the region proposal stage, and directly predict the bounding box coordinates and class probabilities for different categories in a single pass. FCOS

[13] employs multi-level prediction of feature maps for object detection inside a Fully Convolutional Network (FCN). It also computes a centerness score to reduce the FP, which tend to be far away from the target object center. VFNet combines FCOS (without centerness branch) with efficient sample selection, integrates a classification score based on the IoU value between the ground truth and the prediction into a novel IoU-aware Varifocal loss, and refines the bounding box predictions. FoveaBox [6]

has a ResNet50 backbone to compute input image features that is used by a fovea head network, which estimates the object occurrence possibility through per-pixel classification on the backbone’s output.

DEtection Transformer (DETR). DETR [2] uses the bipartite set matching loss and parallel decoding to detect LN. It uses a ResNet50 backbone to compute image features and adds spatial positional encoding to them, feeds the features into an encoder-decoder architecture with self-attention modules, and finally uses feed-forward network heads that uniquely assign predictions to the ground truth in parallel. Different from the original implementation [2], we replaced cross entropy loss as the classification loss in the bounding box matching cost with the focal loss [7] to overcome the class imbalance problem and weighted it with .

Weighted Boxes Fusion.

Five epochs from each detector with the lowest validation loss were used for prediction, thereby generating multiple bounding box predictions (with confidence scores) at the nodal location. Some of these predictions clustered together in common regions in the image, and Weighted Boxes Fusion

[12] was employed to combine the clusters and yield precise predictions (as seen in Fig. 1).

Figure 1: Columns (a) - (c) show the lymph node detection results of the one-stage detectors (FCOS, FoveaBox, VFNet) on four different T2 MRI images. Column (d) displays the result of our proposed DETR transformer following Weighted Boxes Fusion. Green boxes: ground truth, yellow: true positives, and red: false positives.

3 Experiments and Results

3.1 Data

The lymph node dataset contained abdominal MRI scans downloaded from the National Institutes of Health (NIH) Picture Archiving and Communication System (PACS), and were acquired between January 2015 and September 2019. Initially, 584 T2-weighted MRI scans and associated radiology reports from different patients (=584) were downloaded. The nodal extent and size measurements were extracted from the reports using NLP [9]. An experienced radiologist checked the collected data, and removed incorrect annotations and scans containing only one LN annotation (either LAD or SAD measures). This resulted in a total of 376 T2 scans with 520 distinct LN that had both the LAD and SAD measures. The voxels in the scans were normalized to the [1%, 99%] of their intensity range [5] so as to boost contrast between bright and dark structures, and finally histogram equalized. The final dataset was then randomly divided into training (60%, 225 scans), validation (20%, 76 scans), and test (20%, 75 scans) splits at the patient-level. The resulting scans had dimensions in the range from (256 640) (192 640) (18 60) voxels.

3.2 Implementation

When reading scans, radiologists scroll through 1-3 slices of the volume, determine the extent of the lymph nodes, and verify the finding using another sequence (e.g. DWI). To mimic their workflow, we use 3-slice T2 MRI images with the center slice containing the radiologist annotated LN as the input to a detection network framework [3]

. The backbone for the one-stage detectors (FCOS, FoveaBox, VFNet) was ResNet50 (pre-trained with MS COCO weights) and standard data augmentation was performed: random flips, crops, shifts and rotations in the range of [0, 32] pixels, and [0, 10] degrees respectively. To train the one-stage detectors, we used a batch size of 2, learning rate of 1e-3, and trained each model for 24 epochs. The 5 epochs with the lowest validation loss were ensembled together and used for LN detection. For DETR, we used a learning rate of 1.25e-05 and trained the model for 150 epochs. All experiments were run on a workstation running Ubuntu 16.04LTS and containing a NVIDIA Tesla V100 GPU. Our results are shown in Table


3.3 Results

Among the various one-stage detectors, it was difficult to distinguish the best at LN detection. VFNet had the highest mAP of 63.91% over FoveaBox and FCOS, but FCOS had the highest sensitivity of 88.09% at 4 FP per image. In contrast, the propsed DETR transformer had the highest mAP of 65.41% and sensitivity of 91.66% at 4 FP per image. This shows the importance of the self-attention module and spatial positional encoding in the DETR network. We also compared our results against other T2 MRI-based work; compared against Zhao et al. [15], our mAP is 65.41% vs 64.5%, and recall is 91.66% vs 62.6% at 8 FP. Contrasted with Debats et al. [4], our sensitivity is 91.66% vs 80% at 8 FP per image. We also compared our results against CT-based work; against Liu et al.[8], Seff et al.[11], and Roth et al.[10], we obtain sensitivities of 91.66% at 4 FP vs 88% at 4 FP, 89% at 6 FP, and 90% at 6 FP respectively.

We also split our dataset according to the size of the lymph node and evaluated our performance. For LN with SAD 10mm, the DETR shows a moderate performance of 53.57% mAP and sensitivity of 87.09% at 4 FP per image. As shown in Fig. 1, we attribute the lower mAP to the limitation of the DETR network in detecting small objects [2]. Our values are lower than Zhao et al[15] at 65% recall, but we still achieve a much higher recall without the use of DWI sequences. But on LN with SAD 10mm, DETR achieves a mAP of 70.84% and sensitivity of 94.33% at 4 FP per image. These results are again consistent with past literature [15, 4], yet we outperform their results with a 4% increase in sensitivity. Our proposed DETR executes in 253ms/20s per image/volume vs. 218ms/17s from FCOS, 134ms/10s from FoveaBox, and 276ms/22s from VFNet respectively.

max width= Method mAP S@0.5 S@1 S@2 S@4 S@6 S@8 S@16 FCOS [13] 60.09 61.90 77.38 83.33 88.09 89.28 89.28 89.28 FoveaBox [6] 61.67 61.90 76.19 79.76 84.52 88.09 89.28 89.28 VFNet [14] 63.91 67.85 75 80.95 83.33 83.33 83.33 83.33 Proposed DETR 65.41 65.47 76.19 88.09 91.66 91.66 91.66 91.66 Proposed DETR (SAD 10mm) 53.57 58.06 67.77 80.64 87.09 87.09 87.09 87.09 Proposed DETR (SAD 10mm) 70.84 69.81 79.24 92.45 94.33 94.33 94.33 94.33 Zhao 2020 [15] (MRI) 64.5 62.6 Debats 2019 [4] (MRI) 80 Liu 2016 [8] (CT) 88 Seff 2015 [11] (CT) 89 Roth 2014 [10] (CT) 90

Table 1: Detection performance of various detectors and our proposed DETR transformer. ‘S" stands for Sensitivity @[0.5, 1, 2, 4, 6, 8, 16] FP. – indicates unavailable metric values.

4 New Work

We proposed a novel change to the DETR transformer architecture for the challenging task of LN detection in T2 MRI, and merged predictions with weighted boxes fusion to reduce the false positive rate. We also quantified the performance of state-of-the-art detection networks on the LN detection task. Our detection network achieves a clinically acceptable 65.41 mAP and 91.66 recall at 4 FP per image. Our results outperform previously published LN detection methods on T2 MRI scans.

5 Conclusions

In this work, we proposed a novel change to the DETR architecture to enable improved LN detection and applied weighted boxes fusion to merge the predicted bounding box clusters. We also quantified the performance of three state-of-the-art one-stage detectors on the LN detection task. Our model was able to detect LN with 65.41 mAP and 91.66 recall at 4 FP, and surpassed the prior work in T2 MRI.

6 Acknowledgements.

This work was supported by the Intramural Research Programs of the NIH Clinical Center and NIH National Library of Medicine. We also thank Jaclyn Burge for the helpful comments and suggestions.


  • [1] M. Amin and et al. (2017) AJCC cancer staging manual: continuing to build a bridge from a population-based to a more “personalized” approach to cancer staging. CA: Canc. J. Clin. 67 (2), pp. 93–99. External Links: Document Cited by: §1.
  • [2] N. Carion and et al. (2020) End-to-end object detection with transformers. CoRR abs/2005.12872. External Links: Link, 2005.12872 Cited by: §2, §3.3.
  • [3] K. Chen and et al. (2019) MMDetection: open mmlab detection toolbox and benchmark. arXiv preprint arXiv:1906.07155. Cited by: §3.2.
  • [4] O. D. et al (2019)

    Lymph node detection in mr lymphography: false positive reduction using multi-view convolutional neural networks

    PeerJ (), pp. . External Links: Document Cited by: §1, §3.3, §3.3, Table 1.
  • [5] M. e. a. Kociołek (2020) Does image normalization and intensity resolution impact texture classification?. Computerized Medical Imaging and Graphics 81, pp. 101716. External Links: ISSN 0895-6111, Document, Link Cited by: §3.1.
  • [6] T. Kong and et al. (2019) FoveaBox: beyond anchor-based object detector. CoRR abs/1904.03797. External Links: Link, 1904.03797 Cited by: §1, §2, Table 1.
  • [7] T. Lin and et al. (2017) Focal loss for dense object detection. In ICCV, Cited by: §2.
  • [8] J. Liu and et al (2016)

    Mediastinal lymph node detection and station mapping on chest ct using spatial priors and random forest

    Medical Physics 43 (7), pp. 4362–4374. External Links: Document, Link, Cited by: §1, §3.3, Table 1.
  • [9] Y. Peng and et al. (2020) Automatic recognition of abdominal lymph nodes from clinical text. In Clin. Nat. Lang. Proc., Online, pp. 101–110. External Links: Link, Document Cited by: §3.1.
  • [10] H. R. e. a. Roth (2014)

    A new 2.5d representation for lymph node detection using random sets of deep convolutional neural network observations

    In MICCAI, pp. 520–527. External Links: ISBN 978-3-319-10404-1 Cited by: §1, §3.3, Table 1.
  • [11] A. e. a. Seff (2015) Leveraging mid-level semantic boundary cues for automated lymph node detection. In MICCAI, pp. 53–61. External Links: ISBN 978-3-319-24571-3 Cited by: §1, §3.3, Table 1.
  • [12] R. Solovyev and et al. (2021) Weighted boxes fusion: ensembling boxes from different object detection models. Image and Vision Computing 107, pp. 104117. External Links: ISSN 0262-8856, Document, Link Cited by: §1, §2.
  • [13] Z. Tian and et al. (2019) FCOS: fully convolutional one-stage object detection. In ICCV, Cited by: §1, §2, Table 1.
  • [14] H. Zhang and et al. (2021) VarifocalNet: an iou-aware dense object detector. In CVPR, Cited by: §1, §2, Table 1.
  • [15] X. e. al. Zhao (2020) Deep learning–based fully automated detection and segmentation of lymph nodes on multiparametric-mri for rectal cancer: a multicentre study. EBioMedicine 56 (102780), pp. . External Links: Document Cited by: §1, §3.3, §3.3, Table 1.