Towards an Integrative Educational Recommender for Lifelong Learners

12/03/2019 ∙ by Sahan Bulathwela, et al. ∙ UCL 7

One of the most ambitious use cases of computer-assisted learning is to build a recommendation system for lifelong learning. Most recommender algorithms exploit similarities between content and users, overlooking the need to leverage sensible learning trajectories for the learner. Lifelong learning thus presents unique challenges, requiring scalable and transparent models that can account for learner knowledge and content novelty simultaneously, while also retaining accurate learner representations for long periods of time. We attempt to build a novel educational recommender that relies on an integrative approach combining multiple drivers of learner engagement. Our first step towards this goal is TrueLearn, which models content novelty and the background knowledge of learners and achieves promising performance while retaining a human-interpretable learner model.




Introduction

As the world population grows, more innovative approaches should be sought to provide high-quality lifelong learning opportunities to people of diverse cultures, languages, age groups and backgrounds. Machine learning now promises to provide the benefits of personalised teaching to anyone in the world cost-effectively.

Since learner engagement is a prerequisite for achieving impactful learning outcomes [11], we attempt to build a recommender system that models different drivers of engagement, assisting learners on their personal learning trajectory to achieve their learning goals. Our approach differs from previous work in that it (i) incorporates different drivers of engagement such as resource quality, novelty, learner knowledge and interests; (ii) matches learners to useful and engaging fragments of knowledge, as opposed to lengthy full resources; and (iii) supports a multi-lingual and multi-modal collection of learning resources.

Related Work

Conventional recommendation systems mainly focus on exploiting user interests. In contrast, a successful educational recommender must provide additional functionality that stems from attempting to bring learners closer to their goals effectively. Some additional features worth mentioning are accounting for the novelty of materials [8] and identifying sensible learning trajectories. Although handcrafting learning trajectories [1] is an option, such an approach is highly domain-specific and lacks scalability. Handcrafting the Knowledge Components (KCs) (or topics/concepts) present in a resource has the same drawbacks, which motivates the need for an automatic, domain-agnostic entity linking algorithm. Incorporating these additional features into the system entails i) detecting learners' interests and goals, as these can significantly affect their motivation [15]; ii) detecting the current knowledge state of learners, the topics covered in a resource and the prerequisites necessary for benefiting from a learning material [1]; iii) recommending novel and relevant educational resources; and iv) accounting for how different content features of a resource affect how engaging it is [5, 9].

The majority of work in adaptive educational systems builds on Item Response Theory (IRT) [14, 13] and Knowledge Tracing (KT) [16], which focus on estimating a learner's knowledge of a narrow set of skills based on test answers. Work that models a wide spectrum of skills over longer periods of time, which is our main focus, is surprisingly scarce.
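For reference, the simplest IRT model, the Rasch model [14], can be written as follows. The symbols are the conventional ones and are not taken from this paper: $\theta_i$ is the ability of learner $i$, $b_j$ is the difficulty of test item $j$, and $X_{ij} = 1$ denotes a correct answer.

```latex
P(X_{ij} = 1 \mid \theta_i, b_j) = \frac{\exp(\theta_i - b_j)}{1 + \exp(\theta_i - b_j)}
```

The probability of a correct answer grows with the gap between ability and difficulty, and equals 0.5 when the two are equal.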

Beyond personalisation, there are other features that are often overlooked when designing educational recommendation systems. We design our system with these features in mind: (i) Cross-modality (e.g. video, text, audio etc.) and (ii) cross-linguality are vital to identifying and recommending educational resources across different modalities and languages. In a lifelong learning setting, these two features allow learning resources to be matched to the most suitable learners, who come from various backgrounds. (iii) Transparency empowers learners by building trust while supporting the learner's metacognition processes, such as planning, monitoring and reflection [6]. (iv) Scalability and (v) data efficiency allow the states of large numbers of learners to be maintained over long periods of time while making the best use of available user signals, such as implicit engagement [15].

Figure 1: (i) Graphical model representing learner engagement (dashed arrows indicating the components tested) and (ii) TrueLearn factor graph (also, the part with dashed arrows in (i)), integrating resource topics, current knowledge and novelty to predict engagement (output factor). A dynamic learner factor indicates the engagement margin with respect to the amount of novelty. Plates represent top ranked Wikipedia topics.

Our Approach

We identify four factors that influence learners' engagement and develop a probabilistic graphical model that aims to recover those hidden variables using implicit engagement signals. Using a graphical model that learns from implicit engagement allows us to infer these hidden variables without compromising the learner experience through excessive explicit user interventions. The identified factors are: i) baseline resource quality, i.e. how engaging a resource is for the average learner; ii) background knowledge of the learner; iii) novelty of the learning material; and iv) curiosity or learning goals of the learner, as outlined in Figure 1. As a first step, we reformulate the IRT-like TrueSkill algorithm [10] to model learner knowledge and novelty as a function of engagement (dashed arrows in Figure 1 (i)).

TrueSkill has several features that make it an excellent starting point. It is a scalable, online algorithm that shares similarities with our problem and provides a good framework for embedding novelty and a dynamic learner factor (that accounts for knowledge changing over time). The TrueSkill algorithm and its successor, TrueSkill 2 [12], have been deployed and time-tested with millions of users playing multiplayer video games in the Microsoft Xbox Live system, giving substantial evidence of their scalability. The TrueSkill framework also provides a way to handle the dynamic factor, i.e. how the knowledge state of players changes over time [7]. The Gaussian skill parameter in TrueSkill, when used with a human-interpretable knowledge component space (e.g. the Wikipedia topics covered in a resource), provides an intuitive and transparent knowledge representation. We propose several reformulations of TrueSkill in [4], which we name TrueLearn. In [4], we also propose a reformulation of Knowledge Tracing for our problem; on a large dataset, however, the TrueSkill-inspired algorithms prove superior.
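To make the Gaussian skill representation concrete, the following is a minimal sketch of the standard two-player, no-draw TrueSkill update from [10]. It is not the TrueLearn model of [4]. Treating an engagement event as the learner "winning" against a resource topic, the `Gaussian` class, the function names and the value of `beta` are all illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian:
    """Gaussian belief over a skill (or knowledge) value."""
    mu: float
    sigma: float


def _pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def update_on_win(winner: Gaussian, loser: Gaussian, beta: float = 0.5):
    """One TrueSkill-style update for a two-player game without draws.

    beta is the performance noise; its default here is an arbitrary
    illustrative value, not one taken from the paper.
    """
    c2 = 2.0 * beta ** 2 + winner.sigma ** 2 + loser.sigma ** 2
    c = math.sqrt(c2)
    t = (winner.mu - loser.mu) / c
    v = _pdf(t) / _cdf(t)  # additive correction applied to the means
    w = v * (v + t)        # multiplicative correction applied to the variances
    new_winner = Gaussian(
        mu=winner.mu + winner.sigma ** 2 / c * v,
        sigma=winner.sigma * math.sqrt(1.0 - winner.sigma ** 2 / c2 * w),
    )
    new_loser = Gaussian(
        mu=loser.mu - loser.sigma ** 2 / c * v,
        sigma=loser.sigma * math.sqrt(1.0 - loser.sigma ** 2 / c2 * w),
    )
    return new_winner, new_loser


# Hypothetical usage: a learner engages with a fragment covering one topic.
learner_knowledge = Gaussian(mu=0.0, sigma=1.0)
topic_depth = Gaussian(mu=0.0, sigma=1.0)
learner_knowledge, topic_depth = update_on_win(learner_knowledge, topic_depth)
```

After the update, the learner's mean rises and both uncertainties shrink. Because each belief is a mean and a standard deviation attached to a named topic, the learner state stays readable by a human.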

Data:

We construct a dataset from the popular video lectures repository VideoLectures.Net (VLN). Since handcrafting the Knowledge Components (KCs) in a resource is not scalable, we use an automatic entity linking algorithm, known as Wikification [3]. The English transcription of the lecture (or the English translation) is used to annotate the lecture with the 5 most relevant knowledge components using a Wikipedia text ontology through Wikifier [3]. This allows us to work with multiple languages and modalities and automatise the extraction of KCs. We divide the lecture text into multiple fragments of approximately 5,000 characters (equivalent roughly to 5 minutes of lecture) before Wikification. The engagement label is computed by calculating the normalised watch time [9]. The final dataset consists of 18,933 unique learners.
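The two preprocessing steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline actually used: splitting on whitespace boundaries, defining normalised watch time as watched time divided by fragment duration, and clipping the result to [0, 1] are all assumptions, and both function names are hypothetical.

```python
from typing import List


def split_into_fragments(text: str, target_chars: int = 5000) -> List[str]:
    """Greedily pack whitespace-separated words into fragments of at most
    target_chars characters (a single longer word becomes its own fragment)."""
    fragments: List[str] = []
    current: List[str] = []
    length = 0
    for word in text.split():
        # Start a new fragment once adding this word would exceed the target.
        if current and length + 1 + len(word) > target_chars:
            fragments.append(" ".join(current))
            current, length = [], 0
        length += len(word) + (1 if current else 0)
        current.append(word)
    if current:
        fragments.append(" ".join(current))
    return fragments


def normalised_watch_time(watched_seconds: float, duration_seconds: float) -> float:
    """Fraction of a fragment that was watched, clipped to [0, 1] (assumed definition)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return min(max(watched_seconds / duration_seconds, 0.0), 1.0)
```

Each fragment would then be passed to the entity linker separately, so that the knowledge components describe a short stretch of the lecture rather than the whole resource.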

Models:

We implement four baseline models to compare TrueLearn against: i) Naïve persistence, which assumes a static behaviour for all users, i.e. if the learner is engaged, they will remain engaged and vice versa; ii) Naïve majority, which predicts future engagement based solely on mean past engagement of users; iii) KT model (Multi-Skill KT) according to [2]; and iv) Vanilla TrueSkill [10].
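The two naïve baselines are simple enough to sketch directly. The version below assumes binary engagement labels per learner; the default prediction for an empty history and the tie-breaking rule at a mean of exactly 0.5 are assumptions, and all function names are hypothetical.

```python
from typing import Callable, List


def naive_persistence(history: List[bool], default: bool = True) -> bool:
    """Predict that the learner repeats their most recent engagement label."""
    return history[-1] if history else default


def naive_majority(history: List[bool], default: bool = True) -> bool:
    """Predict engagement if the mean of past labels is at least 0.5."""
    if not history:
        return default
    return sum(history) / len(history) >= 0.5


def predict_online(labels: List[bool],
                   predictor: Callable[[List[bool]], bool]) -> List[bool]:
    """Predict every event except the first from the events that precede it."""
    return [predictor(labels[:i]) for i in range(1, len(labels))]


# Hypothetical engagement sequence for one learner.
labels = [True, True, False, True]
persistence_predictions = predict_online(labels, naive_persistence)
majority_predictions = predict_online(labels, naive_majority)
```

Both baselines ignore the content of the resource entirely, which is what makes them a useful reference point for models that use topics and knowledge.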

Algorithm            F1-Score
Naïve persistence    0.629
Naïve majority       0.640
Vanilla TrueSkill    0.400
Multi-Skill KT       0.259
TrueLearn            0.677
Table 1: Mean F1-Score with the full VLN dataset


The results in Table 1 show evidence that TrueLearn outperforms the baselines while retaining a transparent learner model. The model is run per learner and trained in an online fashion, which makes it scalable. The next step is to model content quality and learner curiosity within the same framework. Exploring future user interfaces for learning with lecture fragments, and ways of planning learning trajectories and recommending material, is also timely.
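As an illustration of the metric in Table 1, the sketch below computes an F1-score per learner and averages the scores. That the mean is taken across learners is an assumption, as is the convention of scoring 0 when a learner has no positive labels or predictions; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), with 0.0 when the denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def mean_f1(per_learner: List[Tuple[Sequence[bool], Sequence[bool]]]) -> float:
    """Average the per-learner F1-scores over (labels, predictions) pairs."""
    scores = [f1_score(y_true, y_pred) for y_true, y_pred in per_learner]
    return sum(scores) / len(scores)
```

Averaging per learner gives every learner equal weight regardless of how many events they contribute.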


This research is conducted as part of the X5GON project, funded from the EU's Horizon 2020 research and innovation programme under grant No 761758, and partially funded by the EPSRC Fellowship titled "Task Based Information Retrieval", under grant No EP/P024289/1.

References

  • [1] K. Bauman and A. Tuzhilin (2018) Recommending remedial learning materials to students by filling their knowledge gaps. MIS Quarterly 42 (1), pp. 313–332.
  • [2] C. Bishop, J. Winn, and T. Diethe (2015-05) Model-based machine learning. Accessed: 2019-05-23.
  • [3] J. Brank, G. Leban, and M. Grobelnik (2017) Annotating documents with relevant Wikipedia concepts. In Proc. of Slovenian KDD Conf. on SiKDD.
  • [4] S. Bulathwela, M. Perez-Ortiz, E. Yilmaz, and J. Shawe-Taylor (2020) TrueLearn: a family of Bayesian algorithms to match lifelong learners to open educational resources. In Proc. of the 2020 AAAI Conf. on Artificial Intelligence.
  • [5] S. Bulathwela and J. Shawe-Taylor (2019) Towards automatic, scalable quality assurance in open education.
  • [6] S. Bull and J. Kay (2016) SMILI☺: a framework for interfaces to learning data in open learner models, learning analytics and related fields. IJAIED 26 (1), pp. 293–331.
  • [7] P. Dangauthier, R. Herbrich, T. Minka, and T. Graepel (2008) TrueSkill through time: revisiting the history of chess. In Advances in NIPS 20, pp. 337–344.
  • [8] H. Drachsler, H. G. K. Hummel, and R. Koper (2008) Personal recommender systems for learners in lifelong learning networks: the requirements, techniques and model. Int. J. Learn. Technol. 3 (4), pp. 404–423.
  • [9] P. J. Guo, J. Kim, and R. Rubin (2014) How video production affects student engagement: an empirical study of MOOC videos. In Proc. of the First ACM Conf. on L@S.
  • [10] R. Herbrich, T. Minka, and T. Graepel (2007) TrueSkill™: a Bayesian skill rating system. In Advances in NIPS 19, pp. 569–576.
  • [11] A. S. Lan, C. G. Brinton, T. Yang, and M. Chiang (2017) Behavior-based latent variable model for learner engagement. In Proc. of Int. Conf. on EDM.
  • [12] T. Minka, R. Cleven, and Y. Zaykov (2018-03) TrueSkill 2: an improved Bayesian skill rating system. Technical report, Microsoft Research.
  • [13] R. Pelánek, J. Papoušek, J. Řihák, V. Stanislav, and J. Nižnan (2017-03-01) Elo-based learner modeling for the adaptive practice of facts. User Modeling and User-Adapted Interaction 27 (1), pp. 89–118.
  • [14] G.E. Rasch (1960) Probabilistic models for some intelligence and attainment tests. Vol. 1.
  • [15] M. Salehi, I. Nakhai Kamalabadi, and M. B. Ghaznavi Ghoushchi (2014) Personalized recommendation of learning material using sequential pattern mining and attribute based collaborative filtering. Education and Information Technologies 19 (4), pp. 713–735.
  • [16] M. V. Yudelson, K. R. Koedinger, and G. J. Gordon (2013) Individualized Bayesian knowledge tracing models. In IJAIED, pp. 171–180.