A Dataset for measuring reading levels in India at scale

11/27/2019
by Dolly Agarwal, et al.

One in four children in India leaves grade eight without basic reading skills. Measuring reading levels across a country as vast as India poses significant hurdles. Recent advances in machine learning open up the possibility of automating this task. However, the available datasets are primarily in English. To solve this assessment problem and advance deep learning research in regional Indian languages, we present the ASER dataset, collected from children aged 6 to 14. The dataset comprises 81,658 labeled audio clips from 5,300 subjects in Hindi, Marathi and English. The labels represent expert opinions on the ability of the child to read at a specified level. Using this dataset, we built a simple ASR-based classifier. Early results indicate that we can achieve a prediction accuracy of 86 percent for the English language. Because the ASER survey spans half a million subjects, the dataset has the potential to grow to that scale.

research
04/10/2017

Automatic Classification of the Complexity of Nonfiction Texts in Portuguese for Early School Years

Recent research shows that most Brazilian students have serious problems...
research
03/09/2021

Attention-driven read-aloud technology increases reading comprehension in children with reading disabilities

The paper presents the design of an assistive reading tool that integrat...
research
02/06/2020

A Neural Approach to Ordinal Regression for the Preventive Assessment of Developmental Dyslexia

Developmental Dyslexia (DD) is a learning disability related to the acqu...
research
06/06/2023

Automatic Assessment of Oral Reading Accuracy for Reading Diagnostics

Automatic assessment of reading fluency using automatic speech recogniti...
research
06/07/2023

An ASR-Based Tutor for Learning to Read: How to Optimize Feedback to First Graders

The interest in employing automatic speech recognition (ASR) in applicat...
research
08/29/2022

The Impact of Attending a Remedial Support Program on Syrian Children's Reading Skills: Using BART for Causal Inference

This article estimates, for a sample of 1,777 Syrian refugee children, t...
research
04/12/2021

Deep Learning for Prominence Detection in Children's Read Speech

Expressive reading, considered the defining attribute of oral reading fl...
