t-SS3: a text classifier with dynamic n-grams for early risk detection over text streams

11/11/2019
by Sergio G. Burdisso, et al.

A recently introduced classifier, called SS3, has been shown to be well suited to early risk detection (ERD) problems on text streams. It obtained state-of-the-art performance on early depression and anorexia detection on Reddit in CLEF's eRisk open tasks. SS3 was designed to deal naturally with ERD problems: it supports incremental training and classification over text streams, and it can visually explain its rationale. However, SS3 processes the input using a bag-of-words model, which lacks the ability to recognize important word sequences. This can hurt classification performance and also reduces the descriptiveness of the visual explanations. In standard document classification, word n-grams are commonly used to overcome some of these limitations. Unfortunately, using n-grams on text streams is not trivial, since the system must learn and recognize which n-grams are important “on the fly”. This paper introduces t-SS3, a variation of SS3 that extends the model to dynamically recognize useful patterns over text streams. We evaluated our model on the eRisk 2017 and 2018 tasks on early depression and anorexia detection. Experimental results show that t-SS3 improves both the existing results and the richness of the visual explanations.
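
To make the streaming difficulty concrete, the following minimal sketch counts word n-grams incrementally as text chunks arrive, carrying the last few words across chunk boundaries so that sequences split between two chunks are still seen. It is an illustration only and is not the t-SS3 algorithm: the class name `StreamingNgramCounter`, the parameters `max_n`, `min_count`, and `min_len`, and the simple frequency threshold are all assumptions introduced here.

```python
from collections import Counter, deque


class StreamingNgramCounter:
    """Hypothetical helper: counts word n-grams (1..max_n) over a text stream."""

    def __init__(self, max_n=3):
        self.max_n = max_n
        self.counts = Counter()
        # Last (max_n - 1) words seen so far, kept so that n-grams spanning
        # a chunk boundary are still counted.
        self._tail = deque(maxlen=max_n - 1)

    def update(self, chunk):
        """Consume one chunk of the stream (naive lowercase/whitespace tokenizer)."""
        for word in chunk.lower().split():
            context = list(self._tail)
            # Count every n-gram that ends at the current word.
            for n in range(1, self.max_n + 1):
                if n - 1 > len(context):
                    break  # not enough preceding words yet
                ngram = tuple(context[len(context) - (n - 1):] + [word])
                self.counts[ngram] += 1
            self._tail.append(word)

    def frequent(self, min_count=2, min_len=2):
        """Return n-grams of at least min_len words seen at least min_count times."""
        return {
            gram: count
            for gram, count in self.counts.items()
            if len(gram) >= min_len and count >= min_count
        }


# Usage: two chunks arriving one after the other.
counter = StreamingNgramCounter(max_n=2)
counter.update("i feel")
counter.update("sad i feel sad")
print(counter.frequent(min_count=2))
```

A plain frequency threshold is the simplest possible stand-in for "importance"; a bag-of-words model would only ever see the single-word counts and could not distinguish "feel sad" from the same two words appearing far apart.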
