Benchmarking Automatic Detection of Psycholinguistic Characteristics for Better Human-Computer Interaction

12/17/2020
by Sanja Štajner, et al.

When two people pay attention to each other and are interested in what the other has to say or write, they almost instantly adapt their writing or speaking style to match the other's. For a successful interaction with a user, chatbots and dialog systems should be able to do the same. We propose a framework consisting of five psycholinguistic textual characteristics for better human-computer interaction. We describe the annotation processes used for collecting the data, and benchmark five binary classification tasks, experimenting with different training sizes and model architectures. We perform experiments in English, Spanish, German, Chinese, and Arabic. The best architectures noticeably outperform several baselines and achieve macro-averaged F1-scores between 72 [...] achieved even with a small amount of training data. The proposed framework proved fairly easy to model for various languages, even with a small amount of manually annotated data, if the right architectures are used. At the same time, it showed potential for improving user satisfaction if applied in existing commercial chatbots.
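The abstract reports results as macro-averaged F1-scores on binary classification tasks. As a minimal sketch of how that metric is generally computed (this is not the authors' evaluation code; the function names `f1_for_class` and `macro_f1` and the toy labels are hypothetical), the per-class F1-scores are computed separately and then averaged with equal weight:

```python
from typing import Sequence


def f1_for_class(gold: Sequence[int], pred: Sequence[int], label: int) -> float:
    """F1-score treating `label` as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0.0 when the class never occurs
    # in either the gold labels or the predictions.
    return 2 * tp / denom if denom else 0.0


def macro_f1(gold: Sequence[int], pred: Sequence[int],
             labels: Sequence[int] = (0, 1)) -> float:
    """Unweighted mean of the per-class F1-scores."""
    return sum(f1_for_class(gold, pred, label) for label in labels) / len(labels)


# Toy data (invented for illustration): three positive items, one negative.
gold = [1, 1, 1, 0]
model_pred = [1, 1, 0, 0]
majority_pred = [1, 1, 1, 1]  # a majority-class baseline

print(macro_f1(gold, model_pred))
print(macro_f1(gold, majority_pred))
```

Because each class contributes equally to the average, a baseline that always predicts the majority class scores poorly on the minority class, which is one reason macro-averaging is a common choice for imbalanced binary tasks.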


