Learning Syntactic Dense Embedding with Correlation Graph for Automatic Readability Assessment

by Xinying Qiu, et al.

Deep learning models for automatic readability assessment generally discard the linguistic features traditionally used in machine learning models for the task. We propose to incorporate linguistic features into neural network models by learning syntactic dense embeddings based on those features. To account for the relationships between features, we form a correlation graph among them and use it to learn their embeddings, so that similar features are represented by similar embeddings. Experiments with six data sets of two proficiency levels demonstrate that our proposed methodology can complement a BERT-only model to achieve significantly better performance for automatic readability assessment.
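The idea of a correlation graph among features can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how edges are chosen or how embeddings are learned, so the Pearson-correlation threshold, the spectral (Laplacian eigenvector) embedding, and the function names `correlation_graph` and `spectral_embeddings` are all assumptions made for illustration. The sketch only shows the stated goal, that features linked in the graph end up with similar embeddings.

```python
import numpy as np


def correlation_graph(X, threshold=0.5):
    """Build an adjacency matrix over feature columns of X.

    Assumption: two features are linked when the absolute Pearson
    correlation of their columns is at least `threshold`.
    """
    corr = np.corrcoef(X, rowvar=False)          # shape: (n_features, n_features)
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)                   # drop self-loops
    return adj


def spectral_embeddings(adj, dim=2):
    """Embed each graph node with eigenvectors of the normalized Laplacian.

    Assumption: a spectral embedding stands in for whatever graph-based
    learner the paper actually uses.
    """
    deg = adj.sum(axis=1)
    # Guard against isolated nodes (degree 0) when taking 1/sqrt(deg).
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    lap = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(lap)             # eigenvalues in ascending order
    return eigvecs[:, :dim]                      # one row per feature


if __name__ == "__main__":
    # Synthetic stand-in for linguistic features: two groups of three
    # columns, each group driven by its own latent factor.
    rng = np.random.default_rng(0)
    f1, f2 = rng.normal(size=(2, 500, 1))
    X = np.hstack([f1 + 0.1 * rng.normal(size=(500, 3)),
                   f2 + 0.1 * rng.normal(size=(500, 3))])
    emb = spectral_embeddings(correlation_graph(X), dim=2)
    print(emb.round(3))
```

In this toy setting, features in the same correlated group receive near-identical rows in `emb`, while features from different groups are separated. Such per-feature vectors could then be combined with a text encoder's output, though how the paper does that is not described in the abstract.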



Knowledge-Rich BERT Embeddings for Readability Assessment

Automatic readability assessment (ARA) is the task of evaluating the lev...

Linguistic Features for Readability Assessment

Readability assessment aims to automatically classify text by the level ...

A Baseline Readability Model for Cebuano

In this study, we developed the first baseline readability model for the...

Visualizing and Measuring the Geometry of BERT

Transformer architectures show significant promise for natural language ...

Syntax Helps ELMo Understand Semantics: Is Syntax Still Relevant in a Deep Neural Architecture for SRL?

Do unsupervised methods for learning rich, contextualized token represen...

Under the Microscope: Interpreting Readability Assessment Models for Filipino

Readability assessment is the process of identifying the level of ease o...

Learning Embedded Representation of the Stock Correlation Matrix using Graph Machine Learning

Understanding non-linear relationships among financial instruments has v...