InfantNet: A Deep Neural Network for Analyzing Infant Vocalizations

05/25/2020
by Mohammad K. Ebrahimpour, et al.

Acoustic analyses of infant vocalizations are valuable for research on speech development as well as for applications in sound classification. Previous studies have focused on acoustic features motivated by theories of speech processing, such as spectral and cepstrum-based analyses. More recently, end-to-end deep learning models have been developed that take raw speech signals (acoustic waveforms) as inputs and use convolutional neural network layers to learn representations of speech sounds from classification tasks. We applied a recent end-to-end sound classification model to analyze a large-scale database of labeled infant and adult vocalizations recorded in natural settings outside the lab, with no control over recording conditions. The model learned basic classifications such as infant versus adult vocalizations, infant speech-related versus non-speech vocalizations, and canonical versus non-canonical babbling. The model was trained on recordings of infants ranging from 3 to 18 months of age, and classification accuracy changed with age as speech became more distinct and babbling became more speech-like. Further work is needed to validate and explore the model and dataset, but our results show how deep learning can be used to measure and investigate speech acquisition and development, with potential applications in speech pathology and infant monitoring.
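To make the end-to-end idea concrete, the following is a minimal sketch of the kind of model the abstract describes: a 1D convolutional network that consumes raw waveforms and outputs class scores (e.g., infant versus adult vocalization). It is not the authors' InfantNet architecture; the class name, layer sizes, and the assumed 16 kHz sample rate are illustrative assumptions.

# Illustrative sketch only; layer sizes and the 16 kHz sample rate are assumptions,
# not the architecture reported in the paper.
import torch
import torch.nn as nn

class RawWaveformClassifier(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Stacked Conv1d blocks learn filters directly on the waveform,
        # playing the role that hand-crafted spectral/cepstral features
        # (e.g., MFCCs) play in traditional pipelines.
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=4), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=32, stride=2), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=16, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # global pooling -> fixed-size embedding
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, 1, num_samples) raw audio, e.g., ~1 s at 16 kHz
        x = self.features(waveform).squeeze(-1)
        return self.classifier(x)

if __name__ == "__main__":
    model = RawWaveformClassifier(num_classes=2)
    dummy = torch.randn(8, 1, 16000)  # batch of 8 one-second clips (assumed 16 kHz)
    logits = model(dummy)
    print(logits.shape)  # torch.Size([8, 2])

Trained with a standard cross-entropy loss on labeled clips, a model of this form learns its acoustic representations directly from the classification task, which is the sense in which the approach is "end-to-end" rather than relying on predefined spectral or cepstral features.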


