Fine-Grained Classroom Activity Detection from Audio with Neural Networks

07/29/2021
by Eric Slyman, et al.

Instructors are increasingly incorporating student-centered learning techniques in their classrooms to improve learning outcomes. In addition to lecture, these class sessions involve forms of individual and group work and greater rates of student-instructor interaction. Quantifying classroom activity is key to accelerating the evaluation and refinement of innovative teaching practices, but manual annotation does not scale. In this manuscript, we present advances in the young application area of automatic classroom activity detection from audio. Using a university classroom corpus with nine activity labels (e.g., "lecture," "group work," "student question"), we propose and evaluate deep fully connected, convolutional, and recurrent neural network architectures, comparing the performance of mel-filterbank, OpenSmile, and self-supervised acoustic features. We compare 9-way classification performance with 5-way and 4-way simplifications of the task and assess two types of generalization: (1) new class sessions from previously seen instructors, and (2) previously unseen instructors. We obtain strong results on the new fine-grained task and state-of-the-art results on the 4-way task: our best model obtains frame-level error rates of 6.2% [...] unseen instructors for the 4-way, 5-way, and 9-way classification tasks, respectively (relative reductions of 35.4% [...] baseline). When estimating the aggregate time spent on classroom activities, our average root mean squared error is 1.64 minutes per class session, a 54.9% relative reduction over the baseline.
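The three quantities reported in the abstract (frame-level error rate, relative reduction over a baseline, and root mean squared error of aggregate activity time) can be illustrated with a minimal sketch. The function names, the 60-second frame length, and the toy label sequences below are hypothetical and chosen only to keep the arithmetic easy to follow. In particular, averaging the squared error over activity labels within a session is an assumption; the abstract does not state how the paper aggregates it.

```python
import math
from collections import Counter


def frame_error_rate(reference, predicted):
    """Fraction of frames whose predicted label differs from the reference."""
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")
    if not reference:
        raise ValueError("label sequences must be non-empty")
    errors = sum(r != p for r, p in zip(reference, predicted))
    return errors / len(reference)


def relative_reduction(baseline, model):
    """Relative reduction of an error metric with respect to a baseline."""
    return (baseline - model) / baseline


def minutes_per_activity(labels, frame_seconds):
    """Total time per activity label, in minutes, for one class session."""
    counts = Counter(labels)
    return {label: n * frame_seconds / 60.0 for label, n in counts.items()}


def aggregate_time_rmse(reference, predicted, frame_seconds, label_set):
    """RMSE (minutes) between reference and predicted time per activity.

    Assumption: the squared error is averaged over the activity labels
    of a single session.
    """
    ref = minutes_per_activity(reference, frame_seconds)
    hyp = minutes_per_activity(predicted, frame_seconds)
    squared = [(ref.get(l, 0.0) - hyp.get(l, 0.0)) ** 2 for l in label_set]
    return math.sqrt(sum(squared) / len(squared))


# Toy session: ten hypothetical one-minute frames, two activity labels.
reference = ["lecture"] * 6 + ["group work"] * 4
predicted = ["lecture"] * 7 + ["group work"] * 3
labels = ["lecture", "group work"]

fer = frame_error_rate(reference, predicted)
rmse = aggregate_time_rmse(reference, predicted, 60.0, labels)
```

In this toy session one frame out of ten is mislabeled, so the frame error rate is 0.1, and each activity's total time is off by one minute, so the RMSE is 1.0 minute. A relative reduction is read the same way as in the abstract: `relative_reduction(0.2, 0.1)` is 0.5, meaning the model halves the baseline error.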

