Knowledge-Rich BERT Embeddings for Readability Assessment

06/15/2021
by Joseph Marvin Imperial, et al.

Automatic readability assessment (ARA) is the task of evaluating how easy or difficult a text document is for a target audience. One of the many open problems in the field is making models trained for the task effective even for low-resource languages. In this study, we propose an alternative way of using the information-rich embeddings of BERT models: a joint-learning method that combines them with handcrafted linguistic features for readability assessment. Results show that the proposed method outperforms classical approaches to readability assessment on English and Filipino datasets, with improvements as high as 12.4%. We also show that the knowledge encoded in BERT embeddings can serve as a substitute feature set for low-resource languages like Filipino, which have limited semantic and syntactic NLP tools for explicitly extracting feature values for the task.
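
The abstract does not spell out how the embeddings and the handcrafted features are joined. As a rough illustration only, the sketch below assumes the simplest possible reading: the two are concatenated into one feature vector per document. The function names, the three surface features, and the four-dimensional placeholder embedding are all hypothetical and are not taken from the paper; a real BERT-base embedding would have 768 dimensions.

```python
import re
from typing import List, Sequence


def handcrafted_features(text: str) -> List[float]:
    """Compute three simple surface features (illustrative, not the paper's feature set)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        return [0.0, 0.0, 0.0]
    avg_sentence_len = len(words) / len(sentences)          # words per sentence
    avg_word_len = sum(len(w) for w in words) / len(words)  # characters per word
    type_token_ratio = len(set(words)) / len(words)         # vocabulary diversity
    return [avg_sentence_len, avg_word_len, type_token_ratio]


def combine(embedding: Sequence[float], features: Sequence[float]) -> List[float]:
    """Concatenate a document embedding with handcrafted features (assumed joining step)."""
    return list(embedding) + list(features)


# Placeholder standing in for a BERT document embedding (hypothetical values).
fake_embedding = [0.1, -0.2, 0.3, 0.0]
text = "The cat sat. The cat ran away."
vector = combine(fake_embedding, handcrafted_features(text))
print(len(vector))
```

The combined vector could then be passed to any standard classifier. For a language without tools to extract linguistic features, the same pipeline would, under this assumption, run on the embedding alone.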


