Under the Microscope: Interpreting Readability Assessment Models for Filipino

by Joseph Marvin Imperial, et al.

Readability assessment is the process of identifying the level of ease or difficulty of a piece of text for its intended audience. Approaches have evolved from arithmetic formulas to more complex pattern-recognizing models trained with machine learning algorithms. While these approaches provide competitive results, limited work has been done on quantitatively analyzing how linguistic variables affect model inference. In this work, we dissect machine learning-based readability assessment models for Filipino by performing global and local model interpretation to understand the contributions of varying linguistic features, and we discuss their implications in the context of the Filipino language. Results show that a model trained with the top features from global interpretation obtained higher performance than models using features selected by Spearman correlation. We also empirically observed local feature weight boundaries for discriminating reading difficulty at an extremely fine-grained level, along with the corresponding effects when feature values are perturbed.
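The Spearman-correlation baseline mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so everything here is an assumption made for illustration: the feature names, the toy values, the grade labels, and the helper functions are all hypothetical. The sketch ranks each feature by the absolute Spearman correlation between its values and the reading level, then keeps the top k.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


def top_k_by_spearman(features, labels, k):
    """Return the k feature names with the largest |Spearman| against labels."""
    scored = {name: abs(spearman(col, labels)) for name, col in features.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]


# Hypothetical toy data: invented for illustration, not taken from the paper.
features = {
    "avg_sentence_length": [5, 7, 9, 12, 15, 18],
    "avg_word_length": [4.1, 3.9, 4.5, 4.4, 5.0, 5.2],
    "noise": [3, 1, 2, 3, 1, 2],
}
grade_levels = [1, 1, 2, 2, 3, 3]

print(top_k_by_spearman(features, grade_levels, k=2))
```

A correlation-based ranking of this kind scores each feature on its own, without reference to any trained model. That may be one reason a selection based on global model interpretation can behave differently, although the abstract itself does not state the cause.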







Diverse Linguistic Features for Assessing Reading Difficulty of Educational Filipino Texts

In order to ensure quality and effective learning, fluency, and comprehe...

Knowledge-Rich BERT Embeddings for Readability Assessment

Automatic readability assessment (ARA) is the task of evaluating the lev...

A Readable Read: Automatic Assessment of Language Learning Materials based on Linguistic Complexity

Corpora and web texts can become a rich language learning resource if we...

Firm-based relatedness using machine learning

The relatedness between an economic actor (for instance a country, or a ...

Learning Syntactic Dense Embedding with Correlation Graph for Automatic Readability Assessment

Deep learning models for automatic readability assessment generally disc...

Predicting the top and bottom ranks of billboard songs using Machine Learning

The music industry is a 130 billion industry. Predicting whether a song ...

Pushing on Text Readability Assessment: A Transformer Meets Handcrafted Linguistic Features

We report two essential improvements in readability assessment: 1. three...