Token Imbalance Adaptation for Radiology Report Generation

04/18/2023
by Yuexin Wu, et al.

Imbalanced token distributions naturally exist in text documents, leading neural language models to overfit on frequent tokens. This token imbalance may dampen the robustness of radiology report generators, as complex medical terms appear less frequently but carry more medical information. In this study, we demonstrate how current state-of-the-art models fail to generate infrequent tokens on two standard benchmark datasets (IU X-RAY and MIMIC-CXR) for radiology report generation. To address this challenge, we propose the Token Imbalance Adapter (TIMER), which aims to improve generation robustness on infrequent tokens. The model automatically accounts for token imbalance via an unlikelihood loss and dynamically optimizes the generation process to augment infrequent tokens. We compare our approach with multiple state-of-the-art methods on the two benchmarks. Experiments demonstrate the effectiveness of our approach in enhancing model robustness both overall and on infrequent tokens. Our ablation analysis shows that our reinforcement learning method plays a major role in adapting to token imbalance for radiology report generation.
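The abstract mentions an unlikelihood loss for discouraging over-frequent tokens. As a rough illustration of that general idea (Welleck et al., 2019), here is a minimal PyTorch sketch; the tensor shapes, the `negative_mask` construction, and the `alpha` weighting are assumptions for illustration, not TIMER's actual formulation:

```python
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, targets, negative_mask, alpha=1.0):
    """Maximum-likelihood term on gold tokens plus an unlikelihood term
    that penalizes probability mass on negative candidates (e.g., tokens
    flagged as over-frequent).

    logits:        (batch, seq_len, vocab) raw model scores
    targets:       (batch, seq_len) gold token ids
    negative_mask: (batch, seq_len, vocab) 1.0 where a token is a
                   negative candidate to be discouraged, else 0.0
    alpha:         weight on the unlikelihood term (hypothetical default)
    """
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()

    # Standard negative log-likelihood on the gold tokens.
    nll = F.nll_loss(
        log_probs.view(-1, log_probs.size(-1)),
        targets.view(-1),
        reduction="mean",
    )

    # Unlikelihood term: maximize log(1 - p) for negative candidates.
    # The clamp keeps the log finite when p approaches 1.
    ul = -(torch.log((1.0 - probs).clamp(min=1e-6)) * negative_mask).sum()
    ul = ul / negative_mask.sum().clamp(min=1.0)

    return nll + alpha * ul
```

In this sketch, raising `alpha` trades likelihood on gold tokens against stronger suppression of the masked (frequent) candidates; how TIMER actually balances the two terms, and how it selects negative candidates, is described in the full paper.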


