BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models

06/02/2023
by Marvin Lavechin, et al.

Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels. In order to fully realize the potential of these approaches and further our understanding of how infants learn language, simulations must closely emulate real-life situations by training on developmentally plausible corpora and benchmarking against appropriate test sets. To this end, we propose a language-acquisition-friendly benchmark to probe spoken language models at the lexical and syntactic levels, both of which are compatible with the vocabulary typical of children's language experiences. This paper introduces the benchmark and summarizes a range of experiments showing its usefulness. In addition, we highlight two exciting challenges that need to be addressed for further progress: bridging the gap between text and speech and between clean speech and in-the-wild speech.


