A Unified Neural Network Model for Readability Assessment with Feature Projection and Length-Balanced Loss

10/19/2022
by Wenbiao Li, et al.

For readability assessment, traditional methods mainly employ machine learning classifiers with hundreds of linguistic features. Although deep learning models have become the prominent approach for almost all NLP tasks, they remain less explored for readability assessment. In this paper, we propose a BERT-based model with feature projection and length-balanced loss (BERT-FP-LBL) for readability assessment. Specifically, we present a new difficulty-knowledge-guided semi-supervised method to extract topic features that complement the traditional linguistic features. From the linguistic features, we employ projection filtering to extract orthogonal features that supplement the BERT representations. Furthermore, we design a new length-balanced loss to handle the greatly varying length distribution of the data. Our model achieves state-of-the-art performance on two English benchmark datasets and one dataset of Chinese textbooks, and reaches near-perfect accuracy of 99% on one English dataset. Moreover, our proposed model obtains results comparable to those of human experts in a consistency test.
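The projection filtering the abstract mentions is not spelled out here, but the standard formulation removes from a linguistic feature vector the component parallel to the BERT representation, keeping only the orthogonal remainder as genuinely complementary information. A minimal sketch of that idea (the function name and the single-vector setting are assumptions, not the paper's implementation):

```python
import numpy as np

def orthogonal_projection(features, bert_repr):
    """Hypothetical projection filtering: subtract the component of the
    linguistic feature vector that lies along the BERT representation,
    so the remainder is orthogonal to (i.e. not redundant with) it."""
    b = bert_repr / np.linalg.norm(bert_repr)  # unit vector along BERT repr
    parallel = np.dot(features, b) * b         # component along bert_repr
    return features - parallel                 # orthogonal remainder

# Example: the output has zero dot product with the BERT representation.
f = np.array([3.0, 1.0, 2.0])
h = np.array([1.0, 0.0, 0.0])
orth = orthogonal_projection(f, h)  # → [0., 1., 2.]
```

The orthogonal part can then be concatenated with the BERT representation before classification, so the classifier sees only the information the pretrained encoder did not already capture.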
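The abstract does not detail the length-balanced loss, but a common way to handle a skewed length distribution is to bucket examples by text length and weight each example's loss by the inverse frequency of its bucket, so over-represented lengths do not dominate training. A sketch under that assumption (bucket edges and normalization are illustrative choices, not taken from the paper):

```python
import numpy as np

def length_balanced_weights(lengths, bins):
    """Hypothetical length-balancing: weight each example by the inverse
    size of its length bin, normalized so the mean weight is 1. These
    weights would multiply the per-example cross-entropy loss."""
    idx = np.digitize(lengths, bins)               # bin index per example
    counts = np.bincount(idx, minlength=len(bins) + 1)
    inv = 1.0 / np.maximum(counts, 1)              # inverse bin frequency
    w = inv[idx]
    return w * len(lengths) / w.sum()              # normalize: mean(w) == 1

# Three short texts share one bin; the single long text gets a larger weight.
lengths = np.array([50, 60, 55, 500])
weights = length_balanced_weights(lengths, bins=[100, 300])
```

With inverse-frequency weights, each length bucket contributes roughly equally to the total loss regardless of how many examples it holds.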

Related research
06/15/2021

Knowledge-Rich BERT Embeddings for Readability Assessment

Automatic readability assessment (ARA) is the task of evaluating the lev...
07/09/2021

Learning Syntactic Dense Embedding with Correlation Graph for Automatic Readability Assessment

Deep learning models for automatic readability assessment generally disc...
09/25/2021

Pushing on Text Readability Assessment: A Transformer Meets Handcrafted Linguistic Features

We report two essential improvements in readability assessment: 1. three...
05/26/2020

Comparing BERT against traditional machine learning text classification

The BERT model has arisen as a popular state-of-the-art machine learning...
04/26/2022

Disambiguation of morpho-syntactic features of African American English – the case of habitual be

Recent research has highlighted that natural language processing (NLP) s...
10/24/2022

Explaining Translationese: why are Neural Classifiers Better and what do they Learn?

Recent work has shown that neural feature- and representation-learning, ...
08/17/2022

Boosting Distributed Training Performance of the Unpadded BERT Model

Pre-training models are an important tool in Natural Language Processing...