Unobtrusive and Multimodal Approach for Behavioral Engagement Detection of Students

by Nese Alyuz, et al.

We propose a multimodal approach for detecting students' behavioral engagement states (i.e., On-Task vs. Off-Task), based on three unobtrusive modalities: Appearance, Context-Performance, and Mouse. Final behavioral engagement states are obtained by fusing modality-specific classifiers at the decision level. Various experiments were conducted on a student dataset collected in an authentic classroom.


1 Introduction

Student engagement in learning is critical to achieving positive learning outcomes [4]. Fredricks et al. [7] framed student engagement in three dimensions: behavioral, emotional, and cognitive. In this work, we focus on behavioral engagement, where we aim to detect whether a student is On-Task or Off-Task [10, 11] at any point during the learning task. To this end, we propose a multimodal approach for detecting students’ behavioral engagement states (i.e., On-Task vs. Off-Task), based on three unobtrusive modalities: Appearance, Context-Performance, and Mouse. The final behavioral engagement states are obtained by fusing modality-specific classifiers at the decision level.

2 Methodology

The proposed detection scheme incorporates data collected from three unobtrusive modalities: (1) Appearance: upper-body video captured using a camera; (2) Context-Performance: students’ interaction and performance data related to the learning content; (3) Mouse: data related to mouse movements during the learning process. Because the available modality data differ between learning tasks, we analyzed the results separately for the two task types: (1) Instructional, where students watch videos; and (2) Assessment, where students solve related questions.

Modality-specific data are fed into dedicated feature extractors [3, 9, 6], and the features are then classified by the respective uni-modal classifiers (i.e., Random Forest classifiers). The decisions of the separate classifiers are fused to output a final behavioral engagement state. For fusion, we propose to form a decision pool from all decision trees of the modality-specific random forests and apply majority voting over it. This is equivalent to summing the modality-specific confidence values and selecting the label with the highest total confidence. Further details of the modalities, extracted features, and the various fusion approaches we explored can be found in the full version of this paper.
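The pooled-tree fusion described above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the vote data, and the forest size of five trees are hypothetical, and a forest's confidence is assumed to be the fraction of its trees voting for a label. Under that assumption, and with equally sized forests, pooled majority voting and summed confidences select the same label.

```python
from collections import Counter


def pooled_majority_vote(tree_votes_by_modality):
    """Pool every tree's vote across all modalities and return the majority label."""
    pool = [vote for votes in tree_votes_by_modality.values() for vote in votes]
    return Counter(pool).most_common(1)[0][0]


def summed_confidence(tree_votes_by_modality):
    """Sum per-modality confidences (fraction of trees per label) and return the top label."""
    totals = Counter()
    for votes in tree_votes_by_modality.values():
        for label, count in Counter(votes).items():
            totals[label] += count / len(votes)
    return max(totals, key=totals.get)


# Hypothetical votes from three forests of five trees each (illustrative data only).
votes = {
    "appearance": ["On-Task", "On-Task", "On-Task", "Off-Task", "Off-Task"],
    "context_performance": ["Off-Task", "Off-Task", "Off-Task", "Off-Task", "On-Task"],
    "mouse": ["On-Task", "On-Task", "On-Task", "On-Task", "Off-Task"],
}

print(pooled_majority_vote(votes))  # 8 On-Task votes vs. 7 Off-Task votes
print(summed_confidence(votes))     # On-Task total 1.6 vs. Off-Task total 1.4
```

If the forests had different numbers of trees, the two rules could disagree unless the confidences were weighted by forest size.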


3 Experimental Results

Data were collected through authentic classroom pilots while students consumed digital learning materials for Math on laptops equipped with cameras. In total, 113 hours of data were collected from 17 ninth graders over 13 sessions (40 minutes each), covering the three unobtrusive modalities. For feature extraction, a sliding window of 8 seconds with a 4-second overlap was used for each modality, as in [8]. The collected data were labeled by three educational experts using HELP [2]. To obtain the final ground-truth labels, the same windowing was applied to the three label sets, followed by majority voting and validity filtering.
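As a rough sketch of the windowing and label-voting steps, assuming times are in whole seconds and only full windows are kept: the window length and overlap come from the text, while the function names, the `min_agreement` threshold, and the "N/A" label are hypothetical. The text does not spell out the validity filtering rule, so the agreement check below is only a stand-in for it.

```python
from collections import Counter

WINDOW_S = 8   # window length in seconds (from the text)
OVERLAP_S = 4  # overlap between consecutive windows (from the text)


def sliding_windows(duration_s, window_s=WINDOW_S, overlap_s=OVERLAP_S):
    """Return (start, end) pairs in seconds for every full window in a session."""
    step = window_s - overlap_s
    return [(s, s + window_s) for s in range(0, duration_s - window_s + 1, step)]


def window_label(expert_labels, min_agreement=2):
    """Majority-vote one window's expert labels; return None when agreement is too low.

    The min_agreement check is an assumed stand-in for the validity filtering.
    """
    label, count = Counter(expert_labels).most_common(1)[0]
    return label if count >= min_agreement else None


print(sliding_windows(20))
print(window_label(["On-Task", "On-Task", "Off-Task"]))
print(window_label(["On-Task", "Off-Task", "N/A"]))  # no two experts agree
```

With a step of 4 seconds, each instant away from the session boundaries falls into two consecutive windows.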

For the classification experiments, we divided each student’s data into 80% and 20% partitions for training and testing, respectively. To reduce the effect of overfitting, we conducted leave-one-subject-out cross-validation and applied 10-fold random selection to balance the training sets. The uni-modal and fusion results for the different learning tasks (averaged over all runs and all students) are summarized in Table 1. As these results indicate, for Instructional sections, the Appearance modality performs best (0.74), since these sections lack the interactions that the other modalities rely on. For Assessment sections, fusing all modalities yields the best performance (0.89).
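One possible reading of the evaluation protocol is sketched below. It assumes that "10-fold random selection" means drawing ten class-balanced training subsets by random undersampling of the larger class, which the text does not state explicitly. All function names, subject IDs, and sample data are hypothetical.

```python
import random


def leave_one_subject_out(subject_ids):
    """Yield (training_subjects, held_out_subject) pairs, one per subject."""
    for held_out in subject_ids:
        yield [s for s in subject_ids if s != held_out], held_out


def balanced_subsets(samples, n_draws=10, seed=0):
    """Draw n_draws class-balanced subsets by randomly undersampling each class.

    samples is a list of (features, label) pairs; every class is reduced to the
    size of the smallest class in each draw.
    """
    rng = random.Random(seed)
    by_label = {}
    for sample in samples:
        by_label.setdefault(sample[1], []).append(sample)
    n_min = min(len(group) for group in by_label.values())
    subsets = []
    for _ in range(n_draws):
        subset = []
        for group in by_label.values():
            subset.extend(rng.sample(group, n_min))
        subsets.append(subset)
    return subsets


# Illustrative imbalanced data: 8 On-Task windows and 3 Off-Task windows.
samples = [(i, "On-Task") for i in range(8)] + [(i, "Off-Task") for i in range(3)]
subsets = balanced_subsets(samples)
splits = list(leave_one_subject_out(["s1", "s2", "s3"]))
```

Averaging a classifier's scores over the ten draws and over all held-out subjects would then correspond to the "averaged over all runs and all students" figures.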

Section Type   Class      Appr   CP     Ms     FUSION
INSTR.         On-Task    0.78   0.66   0.60   0.75
INSTR.         Off-Task   0.71   0.50   0.51   0.64
INSTR.         OVERALL    0.74   0.59   0.54   0.70
ASSESS.        On-Task    0.87   0.86   0.87   0.93
ASSESS.        Off-Task   0.66   0.22   0.64   0.73
ASSESS.        OVERALL    0.81   0.74   0.80   0.89

Table 1: F1-measures for uni-modal models (Appr: Appearance, CP: Context-Performance, Ms: Mouse) and fusion (INSTR.: Instructional, ASSESS.: Assessment).
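For reference, a per-class F1-measure such as the On-Task and Off-Task rows can be computed as the harmonic mean of precision and recall for that class. The sketch below uses made-up labels and a hypothetical function name. How the OVERALL rows aggregate the two classes is not stated in the text, so no aggregation is shown.

```python
def f1_for_class(y_true, y_pred, positive):
    """Return the F1-measure for one class, treating it as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up ground truth and predictions for five windows (illustrative only).
y_true = ["On-Task", "On-Task", "On-Task", "Off-Task", "Off-Task"]
y_pred = ["On-Task", "On-Task", "Off-Task", "Off-Task", "On-Task"]

f1_on = f1_for_class(y_true, y_pred, "On-Task")    # precision 2/3, recall 2/3
f1_off = f1_for_class(y_true, y_pred, "Off-Task")  # precision 1/2, recall 1/2
```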

4 Conclusion

In summary, for behavioral engagement detection, the Appearance modality alone achieves the best results in Instructional sections (overall F1 of 0.74), whereas fusing all modalities yields the best results in Assessment sections (0.89). The experiments also showed that it is beneficial to have context-dependent classification pipelines for different section types (i.e., Instructional and Assessment). In light of these results, context plays an important role even when different tasks in the same vertical (i.e., learning) are considered.


  • Alyuz et al. [2017] N. Alyuz, E. Okur, U. Genc, S. Aslan, C. Tanriover, and A. A. Esme. An unobtrusive and multimodal approach for behavioral engagement detection of students. In Proceedings of the 1st ACM SIGCHI International Workshop on Multimodal Interaction for Education, MIE 2017, pages 26–32, New York, NY, USA, 2017. ACM. ISBN 978-1-4503-5557-5. doi: 10.1145/3139513.3139521.
  • Aslan et al. [2017] S. Aslan, S. E. Mete, E. Okur, E. Oktay, N. Alyuz, U. E. Genc, D. Stanhill, and A. A. Esme. Human expert labeling process (HELP): Towards a reliable higher-order user state labeling process and tool to assess student engagement. Educational Technology, 57(1):53–59, 2017. ISSN 00131962.
  • Bradski and Kaehler [2000] G. Bradski and A. Kaehler. OpenCV. Dr. Dobb’s Journal of Software Tools, 3, 2000.
  • Carini et al. [2006] R. M. Carini, G. D. Kuh, and S. P. Klein. Student engagement and student learning: Testing the linkages. Research in Higher Education, 47(1):1–32, Feb 2006. ISSN 1573-188X. doi: 10.1007/s11162-005-8150-9.
  • Chen et al. [2004] C. Chen, A. Liaw, and L. Breiman. Using random forest to learn imbalanced data. University of California, Berkeley, 110:1–12, 2004.
  • Christ et al. [2016] M. Christ, A. W. Kempa-Liehr, and M. Feindt. Distributed and parallel time series feature extraction for industrial big data applications. CoRR, abs/1610.07717, 2016.
  • Fredricks et al. [2004] J. A. Fredricks, P. C. Blumenfeld, and A. H. Paris. School engagement: Potential of the concept, state of the evidence. Review of Educational Research, 74(1):59–109, 2004. doi: 10.3102/00346543074001059.
  • Okur et al. [2017] E. Okur, N. Alyuz, S. Aslan, U. Genc, C. Tanriover, and A. Arslan Esme. Behavioral engagement detection of students in the wild. In International Conference on Artificial Intelligence in Education (AIED 2017), volume 10331 of Lecture Notes in Computer Science, pages 250–261, Cham, June 2017. Springer International Publishing. ISBN 978-3-319-61425-0. doi: 10.1007/978-3-319-61425-0_21.
  • Pardos et al. [2014] Z. A. Pardos, R. S. Baker, M. San Pedro, S. M. Gowda, and S. M. Gowda. Affective states and state tests: Investigating how affect and engagement during the school year predict end-of-year learning outcomes. Journal of Learning Analytics, 1(1):107–128, 2014.
  • Pekrun and Linnenbrink-Garcia [2012] R. Pekrun and L. Linnenbrink-Garcia. Academic emotions and student engagement. In Handbook of Research on Student Engagement, pages 259–282. Springer, 2012.
  • Rodrigo et al. [2013] M. M. T. Rodrigo, R. Baker, L. Rossi, et al. Student off-task behavior in computer-based learning in the Philippines: Comparison to prior research in the USA. Teachers College Record, 115(10):1–27, 2013.