Imputing Knowledge Tracing Data with Subject-Based Training via LSTM Variational Autoencoders Frameworks

02/24/2023
by Jia Tracy Shen, et al.

Missing data poses a great challenge to boosting the performance and applicability of deep learning models for the Knowledge Tracing (KT) problem, yet the issue has been poorly understood in the literature. To address this challenge, we adopt a subject-based training method that splits and imputes data by student ID instead of by row number (which we call non-subject-based training). Subject-based training retains the complete interaction sequence for each student and hence enables efficient training. Further, we leverage two existing deep generative frameworks, Variational Autoencoders (VAE) and Longitudinal Variational Autoencoders (LVAE), and build LSTM kernels into them to form LSTM-VAE and LSTM-LVAE models (denoted VAE and LVAE for simplicity) that generate quality data. In LVAE, a Gaussian Process (GP) model is trained to disentangle the correlation between the subject (i.e., student) descriptor information (e.g., age, gender) and the latent space. Finally, we compare model performance between training on the original data and training on data imputed with generated data from the non-subject-based model VAE-NS and the subject-based models (i.e., VAE and LVAE). We demonstrate that the data generated by LSTM-VAE and LSTM-LVAE can boost the original model's performance by about 50%. Moreover, with our proposed frameworks the original model needs only 10% more data to surpass the original performance if the prediction model is small, and 50% more data if the prediction model is large.
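The subject-based split described above can be illustrated with a minimal sketch. The function and column names below (`subject_based_split`, `student_id`) are hypothetical illustrations, not the paper's actual code: the idea is simply to partition by student ID so that each student's full interaction sequence lands in exactly one partition, rather than slicing by row number.

```python
import numpy as np
import pandas as pd

def subject_based_split(df, subject_col="student_id", train_frac=0.8, seed=0):
    """Split by subject (student) ID so every student's complete
    interaction sequence stays in one partition, unlike row-number
    (non-subject-based) splitting, which can cut sequences in half."""
    rng = np.random.default_rng(seed)
    subjects = df[subject_col].unique()
    rng.shuffle(subjects)
    n_train = int(round(train_frac * len(subjects)))
    train_ids = set(subjects[:n_train])
    mask = df[subject_col].isin(train_ids)
    return df[mask], df[~mask]

# Toy KT log: one row per (student, question) interaction.
log = pd.DataFrame({
    "student_id":  [1, 1, 1, 2, 2, 3, 3, 3, 4, 4],
    "question_id": [10, 11, 12, 10, 13, 11, 12, 14, 10, 15],
    "correct":     [1, 0, 1, 1, 1, 0, 0, 1, 1, 0],
})
train, test = subject_based_split(log, train_frac=0.75)
# No student appears in both partitions, so sequences remain intact.
assert set(train["student_id"]).isdisjoint(test["student_id"])
```

A row-number split of the same log would typically place the head of one student's sequence in training and its tail in testing, which is exactly what subject-based training avoids.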

