Application of Lexical Features Towards Improvement of Filipino Readability Identification of Children's Literature

01/22/2021
by Joseph Marvin Imperial, et al.

Proper identification of the grade level of children's reading materials is an important step towards effective learning. Recent studies in readability assessment for English have applied modern natural language processing (NLP) approaches, such as machine learning (ML) techniques, to automate the process. Modeling readability also requires extracting the right linguistic features. For the Filipino language, limited work has been done [1, 2], especially in treating the language's lexical complexity as the main features. In this paper, we explore the use of lexical features to improve readability identification of children's books written in Filipino. Results show that combining lexical features (LEX), consisting of type-token ratio, lexical density, lexical variation, and foreign word count, with the traditional features (TRAD) used in previous works, such as sentence length, average syllable length, polysyllabic words, and word, sentence, and phrase counts, increased the performance of readability models by almost 5%. The most important features were also examined to identify which contribute the most to reading complexity.
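To make the feature families concrete, here is a minimal Python sketch of how one lexical feature (type-token ratio) and a few traditional features (word count, sentence count, average sentence length) might be computed from raw text. The function names, the regex tokenizer, and the sample sentence are illustrative assumptions and are not taken from the paper, whose exact feature definitions are not given in the abstract. Lexical density, lexical variation, and foreign word count are left out because they need part-of-speech tags or a lexicon.

```python
import re


def tokenize(text):
    # Assumed tokenizer: lowercase alphabetic words, keeping hyphenated
    # forms such as "mag-aral" as a single token.
    return re.findall(r"[a-zñ]+(?:-[a-zñ]+)*", text.lower())


def split_sentences(text):
    # Naive split on sentence-final punctuation (an assumption, not the
    # paper's method); empty fragments are discarded.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def type_token_ratio(tokens):
    # Unique word forms divided by total word count.
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def extract_features(text):
    tokens = tokenize(text)
    sentences = split_sentences(text)
    return {
        "word_count": len(tokens),
        "sentence_count": len(sentences),
        "avg_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
        "type_token_ratio": type_token_ratio(tokens),
    }


# Hypothetical sample text, used only to exercise the functions.
sample = "Ang bata ay masaya. Ang bata ay naglalaro sa parke."
features = extract_features(sample)
print(features)
```

A feature dictionary like this could be computed per book and passed to any standard classifier. Combining LEX and TRAD features then amounts to merging the two dictionaries into one feature vector.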


