Improving Text-based Early Prediction by Distillation from Privileged Time-Series Text

01/26/2023
by   Jinghui Liu, et al.

Modeling text-based time series to make predictions about a future event or outcome is an important task with a wide range of applications. The standard approach is to train and test the model using the same input window, but this neglects the data collected in longer input windows between the prediction time and the final outcome, which are often available during training. In this study, we propose to treat this neglected text as privileged information available during training to enhance early prediction modeling through knowledge distillation, presented as Learning using Privileged tIme-sEries Text (LuPIET). We evaluate the method on clinical and social media text, with four clinical prediction tasks based on clinical notes and two mental health prediction tasks based on social media posts. Our results show that LuPIET is effective in enhancing text-based early predictions, though achieving optimal performance may require choosing an appropriate text representation and appropriate windows for the privileged text. Compared to two other methods using transfer learning and mixed training, LuPIET offers more stable improvements over the baseline of standard training. To the best of our knowledge, this is the first study to examine learning using privileged information for time series in the NLP context.
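The abstract does not spell out the training objective, so the following is only a minimal sketch of a generic soft-target distillation loss, assuming a teacher that has seen the longer (privileged) text window and a student that sees only the early window. The function names, the weighting `alpha`, and the `temperature` value are illustrative assumptions, not details taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Hypothetical combined loss for one example.

    student_logits: outputs of the model given the early window only.
    teacher_logits: outputs of a model given the longer, privileged window.
    label: index of the true class.
    alpha, temperature: assumed hyperparameters, not from the paper.
    """
    # Hard-label term: cross-entropy of the student against the true label.
    hard = -math.log(softmax(student_logits)[label])

    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(soft_teacher, soft_student) if t > 0)

    # The temperature**2 factor is the usual scaling for softened targets.
    return alpha * hard + (1 - alpha) * temperature ** 2 * kl
```

At test time only the student would be used, since the privileged window is not available when the early prediction has to be made.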


research
10/28/2021

Using Time-Series Privileged Information for Provably Efficient Learning of Prediction Models

We study prediction of future outcomes with supervised models that use p...
research
11/29/2018

Leveraging Clinical Time-Series Data for Prediction: A Cautionary Tale

In healthcare, patient risk stratification models are often learned usin...
research
05/30/2023

LonXplain: Lonesomeness as a Consequence of Mental Disturbance in Reddit Posts

Social media is a potential source of information that infers latent men...
research
03/28/2022

Integrating Physiological Time Series and Clinical Notes with Transformer for Early Prediction of Sepsis

Sepsis is a leading cause of death in the Intensive Care Units (ICU). Ea...
research
10/25/2019

Textual Data for Time Series Forecasting

While ubiquitous, textual sources of information such as company reports...
research
05/03/2021

Explaining Outcomes of Multi-Party Dialogues using Causal Learning

Multi-party dialogues are common in enterprise social media on technical...
research
11/09/2021

American Hate Crime Trends Prediction with Event Extraction

Social media platforms may provide potential space for discourses that c...
