Noise Pollution in Hospital Readmission Prediction: Long Document Classification with Reinforcement Learning

05/04/2020
by Liyan Xu, et al.

This paper presents a reinforcement learning approach to extract noise from long clinical documents for the task of readmission prediction after kidney transplant. We face the challenge of developing robust models on a small dataset in which each document may contain over 10K tokens and is full of noise, including tabular text and task-irrelevant sentences. We first experiment with four types of encoders to empirically decide the best document representation, and then apply reinforcement learning to remove noisy text from the long documents, modeling the noise extraction process as a sequential decision problem. Our results show that the old bag-of-words encoder outperforms deep learning-based encoders on this task, and that reinforcement learning is able to improve upon the baseline while pruning out 25% of text segments. Our analysis shows that reinforcement learning is able to identify both typical noisy tokens and task-specific noisy text.
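
To make the "sequential decision problem" framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes a logistic keep/prune policy over a few hand-made segment features, a bag-of-words representation built from the kept segments, and a REINFORCE-style weight update. All names (`segment_features`, `keep_probability`, `run_episode`, `bag_of_words`, `reinforce_update`), the features, and the scalar reward are hypothetical; in practice the reward would presumably come from a downstream readmission classifier.

```python
import math
import random
from collections import Counter


def segment_features(segment):
    """Hypothetical features: bias, fraction of numeric tokens, scaled length."""
    tokens = segment.split()
    n = max(len(tokens), 1)
    digit_frac = sum(t.replace(".", "", 1).isdigit() for t in tokens) / n
    return [1.0, digit_frac, min(n / 50.0, 1.0)]


def keep_probability(weights, features):
    """Logistic policy: probability of keeping a segment."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def run_episode(segments, weights, rng):
    """Visit segments in order and sample keep (1) or prune (0) for each."""
    actions, grads = [], []
    for seg in segments:
        x = segment_features(seg)
        p = keep_probability(weights, x)
        a = 1 if rng.random() < p else 0
        # Gradient of log pi(a | x) for a Bernoulli logistic policy: (a - p) * x
        grads.append([(a - p) * xi for xi in x])
        actions.append(a)
    return actions, grads


def bag_of_words(segments, actions):
    """Count tokens from the kept segments only."""
    counts = Counter()
    for seg, a in zip(segments, actions):
        if a:
            counts.update(seg.lower().split())
    return counts


def reinforce_update(weights, grads, reward, lr=0.1):
    """One REINFORCE step: move weights along reward * sum of log-prob gradients."""
    return [
        w + lr * reward * sum(g[i] for g in grads)
        for i, w in enumerate(weights)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    doc = ["Patient reports mild fatigue", "7.2 140 4.1 98 1.3", "Follow up in two weeks"]
    weights = [0.0, 0.0, 0.0]
    actions, grads = run_episode(doc, weights, rng)
    features = bag_of_words(doc, actions)
    reward = 1.0  # placeholder; assumed to come from a classifier in practice
    weights = reinforce_update(weights, grads, reward)
    print(actions, dict(features), weights)
```

The policy here is deliberately tiny so the control flow stays visible: one action per segment, one reward per document, and a single gradient step per episode.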


