Open Educational Resources (OERs) play a key role in informal education these days. Many OER repositories (e.g., MIT OpenCourseWare (https://ocw.mit.edu/), edX (https://www.edx.org/), Khan Academy (https://www.khanacademy.org/)) publish millions of OERs under Creative Commons licenses (https://creativecommons.org/). However, the lack of high-quality services such as OER search and recommendation systems limits the use of OERs. To provide such services, high-quality metadata that describe OERs thoroughly are needed. Although most OER repositories use standardized metadata definitions (e.g., the IEEE Standard for Learning Object Metadata (LOM), the Learning Resource Metadata Initiative (LRMI)) to improve open educational services, the unavailability and low quality of metadata still limit the performance of these services.
Furthermore, large numbers of OERs are provided by content creators around the world every day. These OERs vary in terms of educational or vocational level, and come in many different formats and languages. Therefore, it has become inevitable to put more emphasis on controlling the quality of OERs. We believe that OER metadata play a crucial role here: if OER metadata were created more in line with OER quality control processes, automatic metadata analysis could significantly improve the evaluation of OERs. This is not the case currently, as very often only manual methods are used to validate the quality of both OER content and metadata, which is a time-consuming and unscalable approach. Although there have been attempts to automate the quality assessment of metadata, these focus only on criteria definitions and metrics to evaluate already existing OER metadata [1, 11, 15], without building intelligent models to predict the quality of OERs based on their metadata.
In this work, we discuss the details of our exploratory data analysis on the metadata of 8,887 OERs from SkillsCommons (http://skillscommons.org) in order to 1) provide insights about the quality of metadata in existing OERs, and 2) examine the effect of quality control on metadata quality. Based on our assumption that the quality of OER metadata is tightly related to the quality of OER content, 3) we build metadata-based scoring and prediction models to anticipate the quality of OERs. Finally, 4) we evaluate our proposed models on the metadata of 841 OERs from YouTube (https://www.youtube.com/) to demonstrate the generality of our approach when it is applied to different types of educational resources and repositories.
The article is organized as follows: Section II discusses the state of the art in assessing the quality of OER metadata, as well as OER content quality assessment using metadata. Section III explains the data collection and analysis steps, and the proposed approach of metadata scoring and prediction of OER quality based on metadata. Section IV shares the results of applying our model to YouTube educational videos in order to validate our proposed approach. Finally, Section V draws the conclusion and outlines our future work on this topic.
II. Related Work
OER metadata are important not only to aid learners in finding relevant content among large amounts of OERs, but also to indicate OER quality. In the literature, the quality of OER metadata has been assessed along the following dimensions: completeness, accuracy, provenance, conformance to expectations, logical consistency and coherence, timeliness, and accessibility. Ochoa and Duval converted those dimensions into a set of calculated metrics, which have been reused by most researchers addressing the quality of OER metadata. They partially evaluated their metrics (i.e., completeness, accuracy) on a list of 425 OERs from the ARIADNE Learning Object Repository.
Much research on OER metadata quality has focused on the completeness of metadata, i.e., the availability of metadata elements, the presence of their values, and the evaluation of those values. Pelaez and Alarcon evaluated the completeness and consistency of OERs, building their calculation on Ochoa and Duval's metrics. They evaluated the consistency of metadata element values with respect to standardized domain values (e.g., Language should conform to the ISO 639-1 language standard). However, most of these studies are either conceptual or focus on only one or a few dimensions. Therefore, there is a need for automatic and intelligent metadata quality assessment in order to improve the discoverability, usability, and reusability of OERs.
Based on the state of the art, it is clear that: 1) it is worthwhile and timely to analyze OER metadata in order to improve OER-based services; and 2) there is a lack of intelligent prediction models that evaluate the quality of OERs based on their metadata to facilitate quality control. For these reasons, the main research questions and objectives of this article are:
Conducting an exploratory data analysis on a large amount of OER metadata.
Building a data-driven scoring model that helps OER repositories and authors evaluate and improve the quality of their OER metadata.
Predicting the quality of OERs based on their metadata. This should guide automatic quality control processes and ultimately result in higher OER quality.
III. Research Method and Data Collection
In this section, we explain the steps towards our proposed model. We organized our work into four steps: first, we collected and maintained a large dataset of OER metadata; second, we performed an exploratory data analysis and deduced results; third, we built a scoring model accordingly; and finally, we proposed a prediction model to anticipate the quality of OERs.
III-A. Data Collection
We built an OER metadata dataset by retrieving all search results for the terms "Information Technology" and "Health Care" via the SkillsCommons platform API (http://support.skillscommons.org/home/discover-reuse/skillscommons-apis/). The dataset contains the metadata of 8,887 OERs and can be downloaded from https://github.com/rezatavakoli/ICALT2020_metadata. The OER metadata in our sample include the following fields: url, title, description, educational type, date of availability, date of issuing, subject list, target audience-level, time required to finish, accessibilities, language list, and quality control (i.e., a categorical value that shows whether a particular OER went through quality control or not). It should be mentioned that the quality control field refers to manual quality control: it is set to "with control" if an OER had at least one inspection regarding the Quality of Subject Matter and at least one inspection regarding the Quality of Online/Hybrid Course Design, and to "without control" otherwise.
For the YouTube dataset, 841 videos were collected using the Pafy Python library (https://pypi.org/project/pafy/). For each of the 28 topics in the areas of "Information Technology" and "Health Care", at least 10 videos were selected from the top YouTube search results; this dataset can also be downloaded from https://github.com/rezatavakoli/ICALT2020_metadata. The video metadata include the following fields: url, title, description, number of dislikes, length, number of likes, rating, subject list, and number of views.
III-B. Exploratory Analysis of OER Metadata
As a starting point, we used our SkillsCommons dataset to explore the availability of the metadata elements (i.e., level, language, time required, accessibilities) in OERs based on their quality control categories ("with control" or "without control"). Subsequently, we selected the OERs "with control" and analysed their remaining metadata elements (i.e., title, description, subject) to build our scoring model.
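This kind of availability exploration can be sketched in a few lines of Python; the record layout and field names below are simplified, hypothetical stand-ins for the actual dataset schema.

```python
from collections import defaultdict

def availability_rates(records, fields):
    """Share of records with a non-empty value for each metadata field,
    split by quality-control category ("with control" / "without control")."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for rec in records:
        group = rec.get("quality_control", "without control")
        totals[group] += 1
        for field in fields:
            if rec.get(field):  # counts only present, non-empty values
                counts[group][field] += 1
    return {g: {f: counts[g][f] / totals[g] for f in fields} for g in totals}

# Toy records (hypothetical, not the SkillsCommons data):
records = [
    {"quality_control": "with control", "level": "beginner", "language": ["en"]},
    {"quality_control": "with control", "level": "advanced", "language": []},
    {"quality_control": "without control", "level": "", "language": ["en"]},
]
rates = availability_rates(records, ["level", "language"])
```

Comparing `rates["with control"]` against `rates["without control"]` field by field reproduces the kind of per-group comparison shown in the availability plots.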
The results of the analysis are:
Target Audience-Level refers to the learners' expertise/educational level in relation to a specific OER. Figure 2(a) illustrates how quality control increases the availability of level metadata.
Language List refers to the available language versions of an OER. Figure 2(b) illustrates the effect of quality control in increasing the availability of language metadata.
Time Required refers to the expected duration needed to complete an OER. Figure 2(c) shows that OERs with quality control are more likely to have this type of metadata.
Accessibilities defines the accessibility guidelines supported by an OER. Figure 2(d) illustrates how quality control increases the availability of accessibility metadata.
The plots in Figure 2 show a clear increase in OER metadata quality (i.e., availability) for the quality-controlled OERs, which can be interpreted as a result of OER quality control. However, our analysis shows that the proportion of manually quality-controlled OERs in our dataset has been decreasing over recent years. This development is illustrated in Figure 1. We think that the growing number of OER providers and the growing amount of content are among the main reasons for this negative trend in manual OER quality control. From this analysis we conclude that 1) we can use existing quality-controlled OERs to define quality benchmarks for metadata elements, and 2) a method is needed that facilitates the automatic assessment of OER metadata quality, and consequently the quality control of OERs. Therefore, as a final step of our analysis, we focused on OERs with quality control and screened their remaining metadata elements (i.e., title, description, and subjects):
Title refers to the title given to an OER. Figure 3(a) shows the distribution of title length (in number of words).
Description refers to the content summary of an OER. Figure 3(b) illustrates the distribution of description length (in number of words).
Subjects refers to the subjects (topics) which an OER addresses. Figure 3(c) shows the distribution of the number of subjects.
The plots in Figure 3 show that these features have approximately normal distributions. Therefore, it is possible to fit a normal distribution to them and build a scoring model based on the distribution parameters.
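Since the scoring model only needs the distribution parameters, fitting reduces to estimating a mean and standard deviation per field. A minimal sketch, using hypothetical word counts rather than the actual dataset:

```python
import statistics

def fit_length_benchmark(lengths):
    """Estimate the normal-distribution parameters (mean, standard deviation)
    of a field's length over the quality-controlled OERs."""
    return statistics.mean(lengths), statistics.stdev(lengths)

# Hypothetical title lengths (in words) of quality-controlled OERs:
title_lengths = [4, 6, 5, 7, 5, 6, 4, 8, 5, 6]
mu, sigma = fit_length_benchmark(title_lengths)
```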
III-C. OER Metadata Scoring Model
As the first step in building our scoring model, we defined the importance of each metadata field and a rating function, based on the OERs that went through quality control.
For this purpose, we set the importance rate of each metadata field according to its availability rate (between 0 and 1) among quality-controlled OERs. For instance, all quality-controlled OERs have a title, so we set the importance rate of title to 1; for language, we set it to 0.92, since 92% of the controlled OERs have language metadata. We then normalised the calculated importance rates so that they sum to 1 across all fields, yielding the normalized importance rate.
Afterwards, for each field, we created a rating function based on the OERs with quality control, in order to rate metadata values. The rating function of the text fields (title, description, subjects) was devised by fitting a normal distribution to their value lengths, as these have approximately normal distributions (see Figure 3). We used the reverse of the Z-score concept, i.e., a function of z = (x - μ)/σ (where μ and σ are the mean and standard deviation of the field's length in the dataset) that equals 1 at the mean and decreases as |z| grows, to rate metadata values against the properties of the controlled OERs. Thus, the closer an OER's title/description/subjects length is to the mean of the distribution, the higher the rate (when a field value length equals the mean, the rate is 1; when the field is empty, the rate is 0). Moreover, we used a boolean rating function for the four fields level, time required, language, and accessibilities, which assigns 1 when they have a value and 0 otherwise. Table I shows the metadata fields, importance rates, normalized importance rates, and rating functions.
Type | Importance Rate [0-1] | Normalized Importance Rate [0-1] | Rating Function [0-1]
Level | 0.98 | 0.165 | If available: 1; else: 0
Language | 0.92 | 0.155 | If available: 1; else: 0
Time Required | 0.58 | 0.098 | If available: 1; else: 0
Accessibilities | 0.59 | 0.099 | If available: 1; else: 0
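The two rating functions can be sketched as follows. The text above does not spell out the exact form of the reversed Z-score, so `exp(-|z|)` here is an assumption chosen to satisfy the stated properties (rate 1 at the benchmark mean, rate 0 for an empty field, decreasing with distance from the mean):

```python
import math

def rate_length(length, mu, sigma):
    """Reversed-Z-score rating for title/description/subjects length.
    exp(-|z|) is an assumed functional form; it equals 1 at the mean and
    decays as the length moves away from the benchmark distribution."""
    if length == 0:  # empty field -> rate 0, per the stated convention
        return 0.0
    z = (length - mu) / sigma
    return math.exp(-abs(z))

def rate_boolean(value):
    """Rating for level, language, time required, and accessibilities:
    1 when the field has a value, 0 otherwise."""
    return 1.0 if value else 0.0
```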
Finally, the following two scoring models were defined to cover both the availability of metadata and its adherence to the defined benchmarks:
Availability Model. We calculate the availability score of an OER o as Equation (1): availability(o) = Σ_k w_k · avail(o, k), where w_k is the normalized importance rate of metadata field k and avail(o, k) is 1 if field k has a value in o and 0 otherwise. This score shows how complete the metadata is, as a weighted sum in which the normalized importance rates are the weights. Therefore, the more important fields an OER contains, the higher its availability score. For instance, an OER which has title, description, and level metadata (i.e., fields with high importance rates) achieves a higher availability score than one which only has metadata for subjects, language, time required, and accessibilities.
Normal Model. We calculate the normal score of an OER o as Equation (2): normal(o) = Σ_k w_k · rating(o, k), where w_k is the normalized importance rate of metadata field k and rating(o, k) is the rating assigned to OER o by the rating function of field k. This score shows how close the metadata is to the defined benchmark (based on the metadata of quality-controlled OERs). With this scoring model, an OER whose metadata properties are most similar to those of the quality-controlled OERs achieves the highest normal score.
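Both scores are weighted sums over the metadata fields and can be sketched as below. The weights are the normalized importance rates from Table I; since the rates for title, description, and subjects are not listed there, this sketch restricts itself to the four boolean fields.

```python
# Normalized importance rates w_k from Table I (boolean fields only).
WEIGHTS = {
    "level": 0.165,
    "language": 0.155,
    "time_required": 0.098,
    "accessibilities": 0.099,
}

def availability_score(oer, weights):
    """Equation (1): sum of w_k over the fields k that have a value in the OER."""
    return sum(w for field, w in weights.items() if oer.get(field))

def normal_score(ratings, weights):
    """Equation (2): sum of w_k * rating(o, k), with each rating in [0, 1]."""
    return sum(w * ratings.get(field, 0.0) for field, w in weights.items())

oer = {"level": "beginner", "language": ["en"]}   # hypothetical OER metadata
ratings = {"level": 1.0, "language": 0.5}         # hypothetical field ratings
```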
III-D. Predicting the Quality of OERs Based on Their Metadata
We trained a machine learning model to predict the quality of OERs based on their metadata and our scoring model. To this end, we took the OERs "with control" as the higher-quality class (4,651 OERs) and the remaining OERs as the lower-quality class (4,236 OERs). As a classifier, a Random Forest model was trained on the SkillsCommons dataset to make a binary decision: high quality / low quality.
We used 80% of the data as training set and the remaining 20% as test set. The classifier achieved an accuracy of 94.6%, with an F1-score of 0.95 for the "with control" class and 0.94 for the "without control" class (the implementation steps and results in Python are available at https://github.com/rezatavakoli/ICALT2020_metadata). Moreover, we extracted the importance value of each feature for the classification task. Table II shows the features of our model and their importance scores [0-1].
Feature | Importance Score [0-1]
Level Metadata Availability | 0.23
The importance values reveal the effect of each feature on our prediction model. The model assigns the highest values to the Availability Score and Normal Score features, which are the two indicators we proposed. Thus, we can infer that these two indicators capture the quality of OER metadata well.
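The classification step can be reproduced in outline with scikit-learn. The features below are random stand-ins for the actual features (the two proposed scores plus per-field availability flags), so the numbers are illustrative only, not the paper's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
X = rng.random((n, 6))                     # synthetic stand-in feature matrix
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)  # toy label: "with control" when the
                                           # two score-like features are high

# 80/20 split, as in the paper, then a Random Forest binary classifier.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
importances = clf.feature_importances_     # per-feature importances, sum to 1
```

On this synthetic data, the informative first two features dominate the importance ranking, mirroring how the Availability and Normal scores dominate in Table II.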
IV. Model Validation
In this section, we report the results of applying our scoring and prediction models to our YouTube dataset, comprising the metadata of 841 videos on 28 subjects in the areas of "Information Technology" and "Health Care".
First, we applied our scoring and prediction models to the dataset to classify the videos into two groups: "with control" (higher quality) and "without control" (lower quality). In order to apply our model, we set the required fields based on the video properties; for instance, we derived level availability from the video title, and set length availability to "available", as all videos have length metadata. After classification, we obtained 446 videos "with control" and 395 videos "without control". We then needed a metric in the metadata by which to compare the two groups, in order to check whether our model detects the group of videos with higher quality. We decided to use the video rating as a quality indicator from the users' perspective; it is calculated from likes and dislikes, and is one of the most commonly used metrics for quality assessment of videos.
Finally, for each of the 28 subjects, we calculated the average video rating for each of the predicted groups ("with control" as higher quality and "without control" as lower quality). Table III shows the subjects, the difference in average rating between the groups, and the difference sign, which specifies whether our model predicted correctly, i.e., whether the "with control" group has the higher rating (+) or not (-).
Subject | Rating Difference | Difference Sign
women and nutrition | -0.08 | -
smoking health risks | -0.01 | -
Overall, the group detected by our prediction model as higher quality has an average video rating 0.05 higher than the lower-quality group. This is considerable given that the standard deviation of ratings in the dataset is 0.25: the maximum difference among around 80% of the ratings is 0.25, so separating the videos into two groups with an average rating difference of 0.05 indicates that our classifier works well in this context.
Additionally, for 23 out of 28 subjects (82.1%), the group our model detected as higher quality indeed had the higher rating, which again supports the generalizability of our model across different topics.
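The per-subject comparison can be sketched as follows, using hypothetical video records with the model's predicted group attached; a positive difference corresponds to a "+" sign in Table III.

```python
def rating_differences(videos):
    """Per subject: mean rating of the predicted 'with control' group minus
    mean rating of the predicted 'without control' group."""
    groups = {}
    for v in videos:
        bucket = groups.setdefault(v["subject"], {"with": [], "without": []})
        key = "with" if v["predicted"] == "with control" else "without"
        bucket[key].append(v["rating"])
    diffs = {}
    for subject, g in groups.items():
        if g["with"] and g["without"]:  # need both groups to compare
            diffs[subject] = (sum(g["with"]) / len(g["with"])
                              - sum(g["without"]) / len(g["without"]))
    return diffs

videos = [  # hypothetical records, not the actual YouTube dataset
    {"subject": "first aid", "predicted": "with control", "rating": 4.8},
    {"subject": "first aid", "predicted": "with control", "rating": 4.6},
    {"subject": "first aid", "predicted": "without control", "rating": 4.5},
]
diffs = rating_differences(videos)
```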
V. Conclusion and Future Work
In this study, we collected and analysed the metadata of a large OER dataset to provide deeper insights into OER metadata quality, and proposed a scoring model and a prediction model to evaluate the quality of OER metadata and, as a consequence, OER content quality. We deem that our proposed models not only help OER providers (e.g., repositories and authors) revisit and reflect on the importance of the quality of their metadata, but also facilitate the quality control of OERs in general, which is essential in light of the rapidly growing number of OERs and OER providers. Applying our model to the SkillsCommons dataset showed that it can detect OERs with quality control with an accuracy of 94.6%. We also validated our approach in another context by applying our scoring and prediction models to open educational videos on YouTube. The results show that our approach successfully detects videos with higher user rating values. This validation step indicates that our approach can be used on different OER repositories.
We consider this study one of the first important steps towards intelligent models that improve OER metadata quality and, consequently, OER content quality. In the future, we plan to further improve and validate our models by collecting more data from other repositories and considering more metadata features (e.g., text-based analysis of title and description).
References
[1] (2004) The continuum of metadata quality: defining, expressing, exploiting. In Metadata in Practice.
[2] (2017) Recommendation of open educational resources: an approach based on linked open data. In Global Engineering Education Conference, pp. 1316–1321.
[3] (2015) Measuring quality in metadata repositories. In International Conference on Theory and Practice of Digital Libraries, pp. 56–67.
[4] (2002) IEEE standard for learning object metadata. IEEE, New York.
[5] (2018) Measuring completeness as metadata quality metric in Europeana. In 2018 IEEE International Conference on Big Data (Big Data), pp. 2711–2720.
[6] Learning Resource Metadata Initiative. https://www.dublincore.org/specifications/lrmi/
[7] (2012) Quantifying and measuring metadata completeness. Journal of the American Society for Information Science and Technology 63(4), pp. 724–737.
[8] (2008) A conceptual framework for metadata quality assessment. In Dublin Core Conference, pp. 104–113.
[9] (2016) VQAMap: a novel mechanism for mapping objective video quality metrics to subjective MOS scale. IEEE Transactions on Broadcasting 62(3), pp. 610–627.
[10] (2008) A lightweight metadata quality tool. In Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 385–388.
[11] (2006) Quality metrics for learning object metadata. In World Conference on Educational Multimedia, Hypermedia and Telecommunications.
[12] (2009) Automatic evaluation of metadata quality in digital repositories. International Journal on Digital Libraries 10(2-3), pp. 67–91.
[13] (2017) Metadata quality assessment metrics into OCW repositories. In Proceedings of the 2017 9th International Conference on Education Technology and Computers, pp. 253–257.
[14] (2018) Exploring the provenance and accuracy as metadata quality metrics in assessment resources of OCW repositories. In Proceedings of the 10th International Conference on Education Technology and Computers, pp. 292–296.
[15] (2019) A proposal of quality assessment of OER based on emergent technology. In 2019 IEEE Global Engineering Education Conference (EDUCON), pp. 1114–1119.
[16] (2005) Complete metadata records in learning object repositories: some evidence and requirements. International Journal of Learning Technology 1(4), pp. 411–424.
[17] (2013) Dealing with metadata quality: the legacy of digital library efforts. Information Processing and Management 49(6), pp. 1194–1205.
[18] (2014) Towards automatic quality assessment of component metadata. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), pp. 3851–3856.
[19] (2015) Usability of metadata standards for open educational resources.