In recent years, the field of Artificial Intelligence (AI) has seen many impressive breakthroughs. With the development of large-scale, accurate, and reliable AI systems, society is moving towards AI-system integration and AI augmentation in many spheres of life. Of particular importance are the areas where AI can improve human living conditions and benefit society as a whole, i.e., where AI can be used for global good. Such applications include, among others, the use of AI to fight poverty and hunger, address inequality and climate change, facilitate economic growth and innovation, and contribute to good health and well-being. One urgent area to which AI could and should contribute is education, in particular, ensuring that high-quality education is accessible to people around the globe. In the educational domain, AI-enabled solutions can help ensure inclusive and equitable high-quality education and promote life-long learning opportunities for all in accordance with the United Nations’ Sustainable Development Goals (SDGs; see https://sdgs.un.org/goals/goal4).
Unfortunately, high-quality education is not accessible to the vast majority of people around the world. There are many areas in the world where even basic school infrastructure is missing. This creates problems for the in-person learning process, which have been further exacerbated by the ongoing COVID-19 pandemic [1, 5, 24]. Nevertheless, while in-person education may not be available to everyone, the growing availability of the internet and of devices such as laptops and smartphones creates a unique and critical opportunity to bridge the gap between those who can access high-quality in-person teaching and those who cannot.
One of the solutions proposed is online learning platforms that give large numbers of students access to learning material on various subjects, with Massive Open Online Courses (MOOCs) being a notable example [33, 31]. Such platforms have the capability to address inequalities in society caused by uneven access to in-person teaching. However, it is unclear whether these capabilities have been exploited to the full extent. There is evidence that MOOCs mostly benefit students who already have degrees and live in developed countries and, thus, may not yet be delivering on their promise to truly “democratize” education. Furthermore, student dropout rates in MOOCs often exceed , with poor interaction between the system and its users identified as a major cause. Personalization is key to successful and effective learning. Given that MOOCs lack personalization and adaptivity, this can be identified as one of the major reasons why MOOCs are not able to provide high-quality, effective education to everyone. Among computer-based learning environments (CBLEs), a more powerful solution that can provide students with a scalable, high-quality alternative is AI-powered Intelligent Tutoring Systems (ITS) [25, 10, 17].
In this paper, we evaluate online learning platforms and put the assumptions about their learning efficacy to the test. We conduct a comparative head-to-head study of learning outcomes on two popular online learning platforms. Of these, the MOOC platform follows a traditional model, delivering content through a series of lecture videos and multiple-choice quizzes, while the Korbit platform (available at www.korbit.ai) provides a highly personalized, active learning experience involving problem-solving exercises and personalized feedback. Learning outcomes are measured on the basis of pre- and post-assessment quizzes, with participants taking courses on an introductory data science topic on the two platforms. We observe a statistically significant increase in the learning outcomes, with students on the Korbit platform with full feedback achieving learning gains - times higher than both students on the MOOC platform and a control group of students who do not receive personalized feedback on the Korbit platform. In addition, we find that students learning on the highly personalized, active learning Korbit platform achieve higher course completion rates and are more motivated than all other groups of students. The major contribution of this paper is the demonstration that personalized, active learning AI-powered systems based on problem-solving exercises and personalized feedback of the type deployed on Korbit can have a tremendous impact on the learning experience and substantially improve learning efficacy. If such a personalized and active learning experience can be made available to millions of learners around the world, then this advance would represent a significant step towards the democratization of education.
2.1 Personalization in Computer-based Learning Environments
In his seminal work, Bloom [6] demonstrated that personalized one-on-one tutoring results in learning gains as high as 2 standard deviations ($2\sigma$), which means that the average tutored student scores better than 98% of the students who do not receive one-on-one tutoring. Clearly, one-on-one tutoring is too costly for most societies to provide to their citizens, so in practice this target is not achieved. A viable low-cost alternative to in-person one-on-one tutoring is Intelligent Tutoring Systems (ITS), which are “computer-based instructional systems with models of instructional content that specify what to teach, and teaching strategies that specify how to teach”. Such systems attempt to mimic personalized human tutoring in a computer-based environment [4, 23].
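The link between a 2 standard deviation gain and a percentile rank can be checked with a short calculation. The sketch below assumes that scores in the untutored class are normally distributed, which is an assumption made here for illustration; the function name `normal_cdf` is ours.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Share of a normally distributed untutored class that scores below a student
# who sits `effect_size` standard deviations above the class mean.
effect_size = 2.0
percentile = normal_cdf(effect_size)
print(f"{percentile:.3f}")  # prints 0.977
```

Under this assumption, a student two standard deviations above the mean outperforms roughly 97.7% of the comparison class.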
ITS have been shown to address such personalization factors as individual student characteristics and cognitive processes, to provide personalized feedback and interactions, and even to develop personalized curricula and generate personalized feedback [8, 19, 27, 28, 7, 21, 2, 22, 20, 14, 3]. Moreover, previous research suggests that in some domains ITS may have already surpassed non-expert human tutors and matched expert ones [32, 9].
2.2 Korbit Personalized Learning Platform
In this paper, we investigate the impact of personalization on the quality of education in the context of AI-powered learning. We experiment with the Korbit learning platform, which is an AI-powered, large-scale, dialogue-based ITS. Korbit has trained over
learners online with a highly personalized, active, and practical learning experience across a range of subjects and skills. The platform uses a fully automated system based on a suite of machine learning (ML), natural language processing (NLP), and reinforcement learning (RL) models aimed at providing deeply personalized and interactive learning online. The platform currently hosts a number of courses and skills around software development, data science, machine learning, and artificial intelligence. It is highly modular and scalable, and is capable of customizing the curriculum for each individual student and adapting the content in real time to the student’s level of proficiency and needs. Moreover, the ML, NLP, and RL models at the core of Korbit continue to learn online in real time based on live interactions with students, automatically adjusting themselves to new content and learning activities.
To achieve deep personalization and interactivity, Korbit alternates between teaching via video lectures, Socratic tutoring, interactive problem-solving exercises, coding exercises and project-based learning.
The focus of this study is on the problem-solving sessions. Here, students are presented with problem statements (e.g., questions), and can then attempt to solve the exercise, ask for help, or skip the exercise. If the student attempts to solve the exercise, their solution attempt is compared by an NLP-driven solution verification module against the expectation stored internally in the database (i.e., the reference solution, which typically consists of one or two sentences containing all relevant information that should be included in the correct answer to the question posed). Figure 1 presents an example of an interactive dialogue between Korbi, the AI tutor, and a student.
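To make the idea of checking an attempt against a reference solution concrete, the following is a deliberately simple stand-in. It is not Korbit's verification module, whose internals are not described here: it replaces the NLP model with plain content-word overlap, and the stopword list, the `coverage` measure, and the 0.6 threshold are all illustrative assumptions.

```python
import re

# Small illustrative stopword list (an assumption, not taken from the platform).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "that", "it", "by"}


def content_words(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def coverage(attempt, reference):
    """Fraction of the reference solution's content words found in the attempt."""
    ref = content_words(reference)
    if not ref:
        return 0.0
    return len(ref & content_words(attempt)) / len(ref)


def is_correct(attempt, reference, threshold=0.6):
    """Accept the attempt if it covers enough of the reference (hypothetical rule)."""
    return coverage(attempt, reference) >= threshold


reference = "A residual is the difference between the observed value and the predicted value."
good_attempt = "The residual is the observed value minus the predicted value"
poor_attempt = "It is the slope of the line"
```

Here `good_attempt` covers four of the six content words in the reference and is accepted, while `poor_attempt` covers none. A learned model would also handle paraphrases that share few words with the reference, which this overlap measure cannot.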
In the fully interactive, personalized mode, if a student’s solution is incorrect, the system responds with one of a dozen different pedagogical interventions to help the student arrive at the correct solution to the problem. Such pedagogical interventions on the Korbit platform include, among others, hints, explanations, elaborations, mathematical hints, concept tree diagrams, and multiple-choice quiz answers. The type and the level of difficulty of each pedagogical intervention are chosen by RL models based on the student’s learning profile and previous solution attempts. This helps ensure that the content and the learning experience are highly personalized and adapted to each particular student. In addition to questions, reference solutions, and pedagogical interventions created manually by curriculum designers for the Korbit platform, many questions and hints are automatically generated using ML and NLP models [12, 16, 18]. The Korbit platform is highly scalable and has the ability to parse vast amounts of open educational resources (OER), automatically generalize to new subjects, and improve in real time as it interacts with new students.
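As a rough illustration of how a learned policy might choose among interventions, the sketch below uses an epsilon-greedy bandit, one of the simplest RL-style selection rules. This is a swapped-in technique for illustration only, since the actual RL models are not specified here. The class name, the string-valued `profile` bucket, and the reward definition (1.0 if the student's next attempt is correct) are assumptions.

```python
import random
from collections import defaultdict

# Intervention types named in the text; the identifiers are illustrative.
INTERVENTIONS = ["hint", "explanation", "elaboration", "mathematical_hint",
                 "concept_tree", "multiple_choice"]


class EpsilonGreedySelector:
    """Chooses an intervention per student-profile bucket and learns from outcomes."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (profile, action) -> times tried
        self.values = defaultdict(float)  # (profile, action) -> mean observed reward

    def select(self, profile):
        """With probability epsilon explore; otherwise pick the best-known action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        # Ties are resolved in favour of the earliest action in the list.
        return max(self.actions, key=lambda a: self.values[(profile, a)])

    def update(self, profile, action, reward):
        """Update the running mean reward for this profile/action pair."""
        key = (profile, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


selector = EpsilonGreedySelector(INTERVENTIONS, epsilon=0.0)
selector.update("novice", "hint", 0.0)         # hint did not help
selector.update("novice", "explanation", 1.0)  # explanation led to a correct answer
choice = selector.select("novice")
```

With exploration switched off (`epsilon=0.0`), `choice` is `"explanation"`, the intervention with the highest observed reward for that profile. A production system would condition on much richer state, such as previous solution attempts and difficulty levels.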
The Korbit platform applies a number of personalization strategies: once a learner is registered on the platform, they are presented with a short series of knowledge and skills assessments aimed at identifying their level of expertise and providing them with the most suitable personalized learning path.
Once the level of the learner is identified, Korbi, the AI tutor, tailors the learning materials and the path based on the learner’s understanding of key concepts, time availability, and learning objectives. The level of difficulty and the learning pace are constantly adapted to the learner’s needs based on their performance throughout the learning path. Figure 2 provides an example of a personalized learning path.
In addition to the personalization strategies, Korbi promotes active and skill-based learning. Questions, exercises, and projects on the platform encourage learners to constantly apply their knowledge in practice. The platform offers coding exercises and programming projects that allow learners to translate their declarative knowledge into procedural knowledge (see Figure 3 for an example of a coding exercise). A recent user study demonstrated that such a personalized, active approach to learning leads to significantly improved learning gains as well as higher meta-cognition in learners studying with Korbit.
3 Experimental Setup
In this study, we aim to investigate the effect that personalization has on the quality of education provided by online learning systems. This section details our experimental setup.
This study was run in collaboration with an industrial partner: specifically, to investigate the impact of personalization provided by AI in online and distance learning, we partnered with an information technology company in Vietnam. Employees of this company had strong programming skills but needed upskilling with respect to their data science and machine learning knowledge and practical skills.
Over software developers from the company were originally offered an opportunity to participate in the study and were asked to fill in a short enrolment questionnaire, specifying their level of expertise in data science. As a result, employees with no background knowledge of data science were selected and given access to a 3-hour course on linear regression on one of the two online platforms. Prior to starting the course, the participants were asked to take a pre-assessment quiz. Then, upon completion of the course on the corresponding platform, participants were asked to take a post-assessment quiz, and their learning outcomes were measured based on the difference in their scores. The experiment was run completely online. Participants who finished either course were rewarded with a completion certificate awarded jointly by us and the employer regardless of their scores, which provided them with an incentive at the company level to participate in the study.
The analysis of the results was performed on the set of participants who qualified for the study, completed it and submitted answers to both pre- and post-assessment quizzes. As a result, we analyzed the submissions from study participants.
3.2 Learning Platforms
Participants who qualified for the study based on their (lack of) prior experience with data science were randomly split between two platforms: a MOOC platform and the Korbit learning platform. In addition, to highlight the impact of feedback personalization and remove any other factors stemming from differences between the two platforms that are not directly related to personalization, we ran experiments on Korbit under two settings: the full interactive personalization mode, and a more limited interactive mode with no personal feedback provided to the students by the AI tutor. Finally, we also set a threshold of skipped exercises, meaning that a participant who skipped more than of the exercises on the Korbit platform would not be considered in the analysis. As a result, we ended up with a sample size of participants in total, split between the following three groups:
MOOC is a widely popular learning platform with millions of learners that follows a traditional model for online and distance-learning courses: students on this platform learn by watching lecture videos, reading, and testing their knowledge with multiple-choice quizzes. As such, this platform does not provide any personalization to students under any settings. This group consisted of participants.
Korbit (full) is an AI-powered learning platform, which relies on machine learning models to adapt the learning process to students and their performance in real time and to provide them with personalized distance-learning education, as detailed in Section 2.2. Students allocated to this group got access to the fully functioning system with a pre-defined curriculum matching the MOOC course. This group consisted of participants.
Korbit (no feedback): this group got access to a special variant of Korbit with a pre-defined curriculum matching that of the MOOC course, but in this case, Korbit provided no feedback other than indicating whether students’ answers were “CORRECT” or “INCORRECT”. Although we realize that students in this group were more frustrated overall as a result of this limitation, this was an important comparison point which helped us evaluate the impact of the personalized feedback. This group consisted of participants.
Figure 4 visualizes the interaction on the platforms for the three groups of students.
3.2.1 Selection of the Material
One of the assumptions we made about the study process on the two platforms was that students benefit from personalized education. To test this assumption, we selected a set of participants who had little or no prior experience with the topics and invited them to take a course on an introductory topic in data science. Linear regression was selected as this introductory topic on both online platforms: besides being one of the most fundamental topics in data science, it is typically covered early on in any course on the subject. To make sure the experiment provides a fair comparison, we checked that the material covering this topic, its difficulty level, and the length of the courses on the two platforms are comparable and carefully aligned. Specifically, the courses on both platforms covered such sub-topics as numerical variables, correlation, residuals, least squares regression, and evaluation metrics, among others.
Both the original course on linear regression on MOOC and the adapted course on Korbit consisted of short lecture videos on the subject, followed by multiple-choice questions in the case of MOOC, and interactive problem-solving exercises in the case of Korbit, either with a full interactive dialogue for Korbit (full) or just the assessment of the answers for Korbit (no feedback). The course on each platform took approximately 3 hours to complete.
3.2.2 Study Flow
The study participants were recruited from the employees of the company in need of upskilling. Upon being selected for the study, they received an email with instructions about how to complete the study. They were asked to fill in the enrolment questionnaire to check their eligibility (only employees with little knowledge of data science qualified to participate in the study). Then, eligible study participants were asked to complete a pre-assessment quiz on linear regression on TypeForm (https://www.typeform.com). Once the participants were allocated to one of the study groups, they had one week to complete the 3-hour course on the respective online platform. Upon completion of the course, they were asked to complete a post-assessment quiz on TypeForm. At the end of the study, all participants who completed it received an award (a completion certificate) regardless of their assessment scores.
Using pre- and post-quiz scores, we measured learning gains to quantify how efficiently each participant learned. The pre- and post-quizzes both consisted of multiple-choice questions, which were equally adapted to both courses. This was done to make sure that any topic mentioned in the quizzes was covered on both platforms to a similar extent, so that participants using either platform could be expected to answer such questions successfully.
We also ran a number of checks, and the quizzes passed a series of independent reviews to make sure that the questions were not inherently biased towards either of the two platforms in any way. Finally, questions of the pre-quiz were isomorphically paired with questions in the post-quiz to make sure that the difficulty of the two quizzes was as similar as possible without any questions being identical. This allowed us to measure the learning gains in a fair and unbiased way.
3.2.3 Expected Outcomes
We set out to test the following two hypotheses:
Hypothesis 1: Korbit (full) results in higher learning outcomes than both MOOC and Korbit (no feedback), which demonstrates that personalized feedback provides a more effective online learning experience.
Hypothesis 2: Korbit (no feedback) results in higher learning outcomes than MOOC, which demonstrates that problem-solving exercises provide a more effective online learning experience.
3.2.4 Learning Gains
To evaluate which of the three learning setups teaches the participants more effectively, we compare MOOC, Korbit (full), and Korbit (no feedback) using average learning gains and normalized learning gains [13] of the participants under each setting. A student’s learning gain $g$ is calculated as the difference between their score on the post-quiz and on the pre-quiz using the following estimate:

$$g = \mathit{post} - \mathit{pre}$$

with $\mathit{post}$ being the student’s score on the post-quiz, and $\mathit{pre}$ their score on the pre-quiz. Both scores fall in the interval $[0\%, 100\%]$. A student’s individual normalized learning gain $g_{\mathit{norm}}$ is calculated by offsetting a particular student’s learning gain against the score range in the ideal scenario in which a student achieves a score of $100\%$ in the post-quiz:

$$g_{\mathit{norm}} = \frac{\mathit{post} - \mathit{pre}}{100\% - \mathit{pre}}$$
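The two measures described above can be sketched in a few lines. The example assumes quiz scores are expressed as percentages between 0 and 100 (the `max_score` default is an assumption about the scale), and the (pre, post) pairs are made up for illustration rather than taken from the study.

```python
from statistics import mean


def learning_gain(pre, post):
    """Raw learning gain: post-quiz score minus pre-quiz score."""
    return post - pre


def normalized_learning_gain(pre, post, max_score=100.0):
    """Raw gain divided by the largest gain that was still possible."""
    if pre >= max_score:
        raise ValueError("normalized gain is undefined when the pre-quiz score is already maximal")
    return (post - pre) / (max_score - pre)


# Made-up (pre, post) score pairs for three hypothetical participants.
scores = [(40.0, 70.0), (50.0, 75.0), (20.0, 60.0)]
avg_gain = mean(learning_gain(pre, post) for pre, post in scores)
avg_norm_gain = mean(normalized_learning_gain(pre, post) for pre, post in scores)
```

In this toy data the raw gains differ (30, 25, and 40 points), yet every participant closed exactly half of the gap to a perfect score, so each normalized gain is 0.5. This is why the normalized measure is useful when pre-quiz scores vary across participants.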
4 Results and Discussion
In this section, we summarize the results of our experiments, in particular by investigating the learning gains, study time, and completion rates for each platform. Higher learning gains show that the students benefit more from the learning process, while higher study time combined with a higher completion rate indicates higher levels of student engagement in the learning process.
4.1 Student engagement
Study time. We first observe that participants on Korbit (full) spend significantly more time studying and interacting with the platform. Specifically:
Study time Korbit (full): min.
Study time Korbit (no feedback): min.
This result is statistically significant at confidence level ().
We note that the participants in these two groups use the same platform with respect to the study material (e.g., videos) and exercises. The core difference between these two versions of the platform is the amount of feedback that the participants get. In the case of Korbit (no feedback), the participants are only notified whether their answer to the question is correct or not, but are not provided with any further explanation. Note that we do not have access to this metric for MOOC and, therefore, cannot compare these results directly. However, since MOOC does not provide students with active learning and personalized education, we would expect the results for MOOC to be on a par with Korbit (no feedback).
4.1.1 Completion rates
Next, we estimate completion rates for the material on each platform. This is defined as the proportion of the number of participants who completed the full course with a skip rate below the threshold of to the number of participants who completed the pre-quiz and signed up on a platform for their group. We observe the following course completion rates:
Completion rate Korbit (full):
Completion rate Korbit (no feedback):
Completion rate MOOC:
These results are significant with for Korbit (full) () vs Korbit (no feedback) (), and for Korbit (full) () vs MOOC (). Combined with the results above, this demonstrates that the difference in the amount of feedback is the main reason for the higher completion rate. This is an important piece of evidence demonstrating the impact of feedback on the online study process.
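The completion-rate definition used in this subsection can be written down directly. In the sketch below, the `Participant` record and its field names are hypothetical, and the skip threshold is left as a parameter because its value is not fixed here. A participant who skips exactly the threshold share is counted as completing, which is one reading of "skipped more than" the threshold being excluded.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    completed_pre_quiz: bool
    signed_up: bool
    completed_course: bool
    skip_rate: float  # fraction of exercises skipped, in [0, 1]


def completion_rate(participants, skip_threshold):
    """Completers within the skip threshold, relative to everyone who
    completed the pre-quiz and signed up on the platform."""
    enrolled = [p for p in participants if p.completed_pre_quiz and p.signed_up]
    if not enrolled:
        return 0.0
    completed = [p for p in enrolled
                 if p.completed_course and p.skip_rate <= skip_threshold]
    return len(completed) / len(enrolled)


# Made-up records; 0.5 is an arbitrary threshold chosen for the example.
participants = [
    Participant(True, True, True, 0.10),    # counts as completed
    Participant(True, True, True, 0.80),    # finished, but skipped too much
    Participant(True, True, False, 0.00),   # dropped out
    Participant(True, False, False, 0.00),  # never signed up: not in the denominator
    Participant(True, True, True, 0.50),    # exactly at the threshold: counts
]
rate = completion_rate(participants, skip_threshold=0.5)
```

For these records, four participants enter the denominator and two of them count as completers, giving a rate of 0.5.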
4.2 Learning Outcomes
Raw Learning Gains
We report average learning gains in Figure 5 for the three study groups on the two learning platforms. (Footnote 4: confidence intervals (C.I.) are estimated as: .) With respect to the raw learning gains, we observe that both MOOC and Korbit (no feedback) show very similar learning gains of –. Both groups of participants who used these platforms were presented with exercises but not given feedback on their performance. At the same time, we observe that the learning gains on Korbit (full), which provided participants with personalized feedback and explanations relevant to their solutions, are around higher than on MOOC with confidence (=), and around higher than on Korbit (no feedback) with confidence (=).
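One common way to estimate a confidence interval for a group's mean learning gain is the normal approximation, mean ± z·s/√n. The sketch below assumes that form with z = 1.96 (a 95% interval). The exact estimator used in the study may differ, and the gains shown are made up.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Normal-approximation confidence interval for the mean of `values`."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    m = statistics.mean(values)
    # Sample standard deviation (n - 1 in the denominator) over sqrt(n).
    half_width = z * statistics.stdev(values) / math.sqrt(n)
    return m - half_width, m + half_width


# Made-up learning gains (percentage points) for one hypothetical group.
gains = [10.0, 20.0, 30.0, 40.0]
low, high = mean_confidence_interval(gains)
```

For small groups, a Student's t critical value in place of 1.96 would give a wider and more conservative interval.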
4.2.1 Normalized Learning Gains
We observe a similar trend with respect to the normalized learning gains: the average normalized learning gains for Korbit (full) participants are times higher than the average normalized gains for MOOC participants, with the difference being statistically significant at the confidence level (=). As with the raw learning gains, Korbit (full) also yields higher normalized learning gains than Korbit (no feedback), although the difference in this case is smaller: the improvement is with =.
The results presented confirm both of our hypotheses. They demonstrate that learning outcomes are better for participants on Korbit (full) than for participants on either of the platforms that do not provide personalized feedback. This finding is confirmed by both the average learning gains and the normalized learning gains, with these scores being substantially higher for Korbit (full) than for the other two platforms, and with the difference in most cases being statistically significant at confidence level. Further, the results demonstrate that learning outcomes are better for participants on Korbit (no feedback) than for participants on MOOC, which shows the impact of active learning based on problem-solving exercises. These results, combined with our earlier observations on increased study time, suggest that the personalization and active learning elements on Korbit (full) contribute significantly to learners’ experience with online platforms.
Artificial intelligence (AI) has contributed to the improvement of our society in many domains. One domain which could and should benefit from the application of AI algorithms is education. Unfortunately, high-quality in-person teaching is not available to the majority of people around the world. Even in developed countries, it is currently not available to an increasingly large number of students due to the ongoing pandemic. AI-powered online learning platforms can help bridge this gap. This implies that AI researchers should focus their efforts on the application of AI to improve the quality of education, in particular in the context of online learning systems. Intelligent tutoring systems (ITS) are a viable, scalable solution that can provide high-quality education, substantially surpassing the capabilities of platforms that do not allow for personalization and active learning.
In this paper, we have argued for the effectiveness of personalization and active learning strategies in ITS. Based on experimental evidence with nearly 200 students, we have shown that personalization and active learning in ITS can have a large impact on both student learning outcomes and student motivation. We have observed that learning gains are to times higher in an ITS compared to online learning platforms without personalization and active learning. We have also observed that students are considerably more engaged in the learning process on the platform that provides a highly personalized, active learning experience. The latter is evidenced both by longer study time on the platform providing such a personalized experience and by higher completion rates as compared to the alternatives that do not provide interactive, personalized education.
These results together provide strong evidence of the tremendous impact that can be achieved with a personalized, active learning AI-powered system based on problem-solving exercises and personalized feedback. If such a personalized and active learning experience can be made accessible to millions of learners around the world, then this advance would represent a huge leap forward towards the democratization of education in accordance with the United Nations’ Sustainable Development Goals.
-  (2020) Covid-19 pandemic and online learning: the challenges and opportunities. Interactive learning environments, pp. 1–13. Cited by: §1.
-  (2019) The Impact of Student Model Updates on Contingent Scaffolding in a Natural-Language Tutoring System. In International Conference on Artificial Intelligence in Education, pp. 37–47. Cited by: §2.1.
-  (2021) AI and machine learning techniques in the development of intelligent tutoring system: a review. In 2021 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT), pp. 403–410. Cited by: §2.1.
-  (1985) Intelligent tutoring systems. Science 228 (4698), pp. 456–462. Cited by: §2.1.
-  (2020) Transition to online education in schools during a SARS-CoV-2 coronavirus (COVID-19) pandemic in Georgia.. Pedagogical Research 5 (4). Cited by: §1.
-  (1984) The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational researcher 13 (6), pp. 4–16. Cited by: §1, §2.1.
-  (2018) Automatic Curriculum Generation Applied to Teaching Novices a Short Bach Piano Segment. In NeurIPS Demonstrations, Cited by: §2.1.
-  (2011-01) Instructional Factors Analysis: A Cognitive Model For Multiple Instructional Interventions. In EDM 2011 - Proceedings of the 4th International Conference on Educational Data Mining, pp. 61–70. Cited by: §2.1.
-  (2012) Intelligent tutoring systems.. APA educational psychology handbook, Vol 3: Application to learning and teaching, pp. 451–473. Cited by: §2.1.
-  (2001) Intelligent tutoring systems with conversational dialogue. AI magazine 22 (4), pp. 39–39. Cited by: §1.
-  (2017) Assessment with computer agents that engage in conversational dialogues and trialogues with learners. Computers in Human Behavior 76, pp. 607–616. Cited by: §2.1.
-  (2021) Deep discourse analysis for generating personalized feedback in intelligent tutor systems. In The 11th Symposium on Educational Advances in Artificial Intelligence, Cited by: §2.2.
-  (1998) Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses. American journal of Physics 66 (1), pp. 64–74. Cited by: §3.2.4.
-  (2022) An intelligent tutoring system for improving adult literacy skills in digital environments. Technical report EasyChair. Cited by: §2.1.
-  (2016) Exploring the factors affecting MOOC retention: A survey study. Computers & Education 98, pp. 157–168. Cited by: §1.
-  (2021) Automated data-driven generation of personalized pedagogical interventions in intelligent tutoring systems. International Journal of Artificial Intelligence in Education, pp. 1–27. Cited by: §2.2.
-  (2006) Cognitive tutors: Technology bringing learning sciences to the classroom. Cited by: §1.
-  (2021) Back-training excels self-training at unsupervised domain adaptation of question generation and passage retrieval. arXiv preprint arXiv:2104.08801. Cited by: §2.2.
-  Data mining for providing a personalized learning path in creativity: An application of decision trees. Computers & Education 68, pp. 199–210. Cited by: §2.1.
-  (2016) STI-dico: a web-based its for fostering dictionary skills and knowledge. In European Conference on Technology Enhanced Learning, pp. 416–421. Cited by: §2.1.
-  (2018) Combining adaptivity with progression ordering for intelligent tutoring systems. In Proceedings of the 5th annual ACM conference on Learning@ scale, pp. 1–4. Cited by: §2.1.
-  (2019) Personalization in OELEs: Developing a Data-Driven Framework to Model and Scaffold SRL Processes. In International Conference on Artificial Intelligence in Education, pp. 354–358. Cited by: §2.1.
-  (2014) AutoTutor and family: A review of 17 years of natural language tutoring. International Journal of Artificial Intelligence in Education 24 (4), pp. 427–469. Cited by: §2.1.
-  (2020) Impact of Coronavirus pandemic on education. Journal of Education and Practice 11 (13), pp. 108–121. Cited by: §1.
-  (1988) Intelligent tutoring systems: Lessons learned. Psychology Press. Cited by: §1.
-  (2017) Participation patterns in a massive open online course (MOOC) about statistics. British Journal of Educational Technology 48 (6), pp. 1295–1304. Cited by: §1.
-  (2014) Macro-adaptation in conversational intelligent tutoring matters. In International Conference on Intelligent Tutoring Systems, pp. 242–247. Cited by: §2.1.
-  (2014) DeepTutor: towards macro-and micro-adaptive conversational intelligent tutoring at scale. In Proceedings of the 1st ACM conference on Learning@ scale, pp. 209–210. Cited by: §2.1.
-  (2020) A large-scale, open-domain, mixed-interface dialogue-based its for stem. In International Conference on Artificial Intelligence in Education, pp. 387–392. Cited by: §2.2.
-  (2021) A comparative study of learning outcomes for online learning platforms. In International Conference on Artificial Intelligence in Education, pp. 331–337. Cited by: §2.2.
-  (2016) Predicting Post-Test Performance from Online Student Behavior: A High School MOOC Case Study.. International Educational Data Mining Society. Cited by: §1.
-  (2011) The relative effectiveness of human tutoring, intelligent tutoring systems, and other tutoring systems. Educational psychologist 46 (4), pp. 197–221. Cited by: §2.1.
-  A longitudinal study on learner career advancement in MOOCs. Journal of Learning Analytics 1 (3), pp. 203–206. Cited by: §1.
-  (1987) Artificial Intelligence and Tutoring Systems. Los Altos: Morgan Kaufmann. Cited by: §2.1.
-  (2015) MOOCs in the developing world: Hope or hype?. International Higher Education (80), pp. 23–25. Cited by: §1.
-  (2010) Agent Prompts: Scaffolding Students for Productive Reflection in an Intelligent Learning Environment. In Intelligent Tutoring Systems, V. Aleven, J. Kay, and J. Mostow (Eds.), pp. 426–428. Cited by: §2.1.