Abstract
In recent years, MOOCs have gradually become an important way for people to acquire knowledge. However, learners differ considerably in their knowledge backgrounds, and the precedence relations between lecture videos in a MOOC are often not clearly explained. As a result, some learners may encounter obstacles due to a lack of background knowledge when learning a MOOC. In this paper, we propose an approach for automatically mining precedence relations between lecture videos in a MOOC. First, we extract main concepts from video captions automatically; then, an LSTM-based neural network model measures prerequisite relations among the main concepts; finally, the precedence relations between lecture videos are identified based on the concept prerequisite relations. Experiments showed that our concept prerequisite learning method outperforms existing methods and helps accurately identify the precedence relations between lecture videos in a MOOC.
1. Introduction
Nowadays, more and more people around the world are learning through online education platforms, and the numbers of learners and courses on the top MOOC providers have been rising rapidly. As of December 2020, 76 million learners around the world had registered on Coursera, and the provider had launched over 4600 courses (https://www.classcentral.com/report/mooc-stats-2020/).
As is well known, a MOOC consists of a series of lecture videos, and each lecture video is an independent learning resource. The sequence of the learning resources in a MOOC is fixed, but every online learner has a different background, and not everyone wants to learn from the first lecture to the last. Some learners prefer to choose videos of interest, but they are not sure whether they have the appropriate background knowledge. The required background knowledge may be addressed in earlier lectures, yet the precedence relations between lecture videos in a course are usually ambiguous, so learners do not know where to start.
A precedence relation between a pair of lecture videos establishes which one addresses simpler concepts and thus should be delivered first. In other words, the precedence relation between two resources is determined by the prerequisite relations among the concepts those resources address. Here, a prerequisite is a concept that must be understood before one can proceed to another concept. The prerequisite relation exists as a natural precedence among concepts in cognitive processes when people learn, organize, apply, and generate knowledge [1]. Concept prerequisite learning plays an important role in many fields, including precedence relation identification [2–4], concept graph extraction [5, 6], reading list generation [7, 8], curriculum planning [9], course sequencing [10], course design [11], knowledge tracing [12], and automatic assessment [13].
In this paper, we present a new approach for identifying precedence relations between lecture videos in a MOOC based on concept prerequisite learning. First, we extract main concepts from the captions of every lecture video; we then measure prerequisite relations among the concepts with an LSTM-based neural network model. After that, the precedence relations between lecture videos are inferred from the concept prerequisite relations.
What we do can help learners not only know which main concepts are addressed in each lecture video but also understand the dependencies between different lecture videos. The dependencies between these videos can let learners know what background knowledge is needed before learning a video and let them choose the previous related videos to learn according to their personal circumstances.
Our main contributions are as follows:
(i) A new approach to measuring concept prerequisite relations that outperforms the baseline methods.
(ii) A novel metric for identifying precedence relations between lecture videos in a MOOC based on concept prerequisite relations.
(iii) A new dataset containing 1873 video pairs from "Food and Health," "Algorithms," "Calculus," and "Global History."
The rest of the paper is organized as follows. Section 2 discusses related work; Section 3 gives a brief introduction of our approach; Section 4 introduces the concept prerequisite learning algorithms; Section 5 describes the method for identifying precedence relations between lecture videos in a MOOC; Section 6 describes the new datasets we collected for this paper and reports our experimental results; in Section 7, there is the conclusion and a discussion of future work.
2. Related Work
Mining precedence relations between learning resources is a task that has gained much attention in recent years. Gasparetti et al. [2] presented an approach to predict a prerequisite relation between two learning resources: given a pair of resources, a feature vector is built from the Wikipedia pages of the concepts found in the resource text. Chen et al. [3] divided each lecture video into 20-second-long segments and inferred precedence relations between the segments. Manrique et al. [4] also identified precedence relations between MOOC videos via their main concepts; however, their main concepts were extracted manually by experts, which is time-consuming and infeasible when the number of videos is large.
The problem of discovering prerequisite relations between concepts has also been addressed by several researchers. Talukdar and Cohen [14] proposed an approach to model the prerequisite structure between Wikipedia concepts. They assumed that hyperlinks between Wikipedia pages indicate a prerequisite relation and designed several features over the hyperlink network, such as the PageRank score and the Random Walk with Restart (RWR) score; finally, they used a Maximum Entropy (MaxEnt) classifier to predict prerequisite relations between concepts. Liang et al. [15] also used hyperlinks to measure prerequisite relations between Wikipedia concepts: each concept was represented by a list of related concepts, and the authors calculated the reference distance (RefD) between two concepts to infer whether one is a prerequisite of the other. Sayyadiharikandeh et al. [16] introduced an approach that infers prerequisite relations between concepts from Wikipedia clickstream data, but a limitation of this approach is that clickstream data usually cannot cover all concept pairs. Miaschi et al. [17] utilized linguistic features extracted from Wikipedia articles to measure concept prerequisite relations; to the best of our knowledge, theirs is the first work to use LSTM-based neural networks for concept prerequisite learning, and it is a good inspiration for our work. Liang et al. [18] created 15 graph-based features and 17 text-based features for Wikipedia concept pairs and then employed four widely used binary classifiers (Naïve Bayes, Logistic Regression (LR), Support Vector Machine (SVM), and Random Forest (RF)) to discover prerequisite relations between concepts. The UNIGE_SE team [19] developed a neural network classifier that exploits features extracted both from raw text and from the structure of Wikipedia pages to achieve automatic prerequisite learning for Italian concepts.
Most of the above methods will be used as the baselines in our experiments.
In addition to Wikipedia, other types of learning resources have also been studied to measure prerequisite relations between concepts, such as university courses [20], textbooks [6], and MOOCs [21]. These works provide the foundation for a wide range of education fields, including learning resources sequencing, learning resource recommendation, and curriculum planning.
3. Our Approach
To identify precedence relations between videos in a MOOC, we need to extract the main concepts from the videos and then measure prerequisite relations between those concepts. Based on the concept prerequisite relations, we can identify the precedence relations between videos. The detailed steps are as follows:
(1) Extracting captions from lecture videos in a MOOC
(2) Selecting main concepts from the captions of every lecture video
(3) Inferring concept prerequisite relations via an LSTM-based neural network model
(4) Mining precedence relations between lecture videos based on the concept prerequisite relations
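The four steps above can be sketched in code. This is a minimal Python illustration: the helper names and the toy data layout are our assumptions rather than the authors' implementation, and the LSTM classifier of step (3) is replaced by a simple set lookup.

```python
# Illustrative pipeline skeleton; helper names and data layout are assumptions.

def extract_captions(video):
    # Step 1: in practice, captions come from the MOOC platform (e.g. coursera-dl).
    return video["captions"]

def select_main_concepts(captions, k=6):
    # Step 2: keep the k concepts judged most relevant to the caption text.
    ranked = sorted(captions["concepts"], key=lambda c: c["score"], reverse=True)
    return [c["name"] for c in ranked[:k]]

def is_prerequisite(a, b, prereq_pairs):
    # Step 3: stand-in for the LSTM-based classifier; here a simple lookup.
    return (a, b) in prereq_pairs

def video_precedes(concepts_i, concepts_j, prereq_pairs):
    # Step 4: video i precedes video j if some concept of i is a
    # prerequisite of some concept of j.
    return any(is_prerequisite(a, b, prereq_pairs)
               for a in concepts_i for b in concepts_j)
```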
It should be noted that only existing Wikipedia concepts were considered as the main concepts of a video in our work. A Wikipedia concept is the title of a Wikipedia article. To the best of our knowledge, Wikipedia is the largest multilingual online encyclopedia on the Internet, and its content can provide services for MOOCs in different disciplines.
4. Concept Prerequisite Extraction
In this section, we will introduce how to measure prerequisite relations among concepts. Before introducing the details of concept prerequisite learning, we first define some relevant elements. The details are shown in Table 1.
In this paper, concept prerequisite learning is a binary classification task. Given a pair of concepts (A, B), the classification result (equation (1)) is whether A is a prerequisite of B or not.
Many concept prerequisite learning algorithms rely on hand-crafted features, such as the method proposed by Liang et al. [18]. In recent years, deep learning has been widely used in image recognition, speech processing, natural language processing, and other fields. Deep learning is essentially a feature learning method that transforms the raw data into a higher-level feature representation through a nonlinear model, obtaining a more abstract expression. In this paper, we employ an LSTM model to extract higher-level features of concept pairs from the raw input.
According to [15], a concept can be represented by its related concepts in Wikipedia's concept space. Therefore, when measuring the prerequisite relation of a pair (A, B), we should analyze not only the two concepts themselves but also their related concept sets, denoted R_A and R_B.
The specific classification process is as follows. First, the two concepts A and B, as well as their related concept sets R_A and R_B, were respectively converted into four 128-dimensional vectors V_A, V_B, V_RA, and V_RB. Secondly, these vectors were input into four identical LSTM-based subnetworks with 32 units, and the four LSTM outputs were then concatenated and passed to a final dense layer with a Sigmoid activation function. Finally, we obtained the classification result of concept prerequisite learning. The architecture of the classifier is shown in Figure 1.
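The shape of this four-branch architecture can be illustrated with a toy forward pass. This is only a structural sketch: the mean-pooling `encode` stands in for the 32-unit LSTM branches, and the weights and bias are placeholders, so it mirrors the wiring of Figure 1 rather than any learned behavior.

```python
import math

def encode(seq):
    # Stand-in for one branch: reduce a sequence of vectors to a fixed-size
    # summary (here a simple mean; the paper uses a 32-unit LSTM).
    dim = len(seq[0])
    return [sum(v[d] for v in seq) / len(seq) for d in range(dim)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def classify(v_a, v_b, v_ra, v_rb, weights, bias):
    # Four branches encoded independently, concatenated, then a single
    # dense layer with Sigmoid activation, as in Figure 1.
    features = encode(v_a) + encode(v_b) + encode(v_ra) + encode(v_rb)
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(score)
```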

For a concept A, there must be a corresponding Wikipedia article. To obtain the vector V_A, each of the first 400 words in the article was converted into a 128-dimensional vector; these word vectors were built with word2vec [22] trained on the ukWaC corpus [23], and stop words were removed from the article in advance. Finally, V_A was the mean of the 400 word vectors.
Similarly, for each concept in the related set R_A, we calculated its vector by the same method; the vector V_RA was then the mean of all these vectors.
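The averaging described above might look like the following sketch, assuming a hypothetical `embeddings` lookup from word to word2vec vector; the function and parameter names are illustrative, not from the original implementation.

```python
def concept_vector(article_words, embeddings, stop_words, max_words=400, dim=128):
    # Drop stop words, keep the first max_words remaining words, and
    # average their word2vec vectors ('embeddings' is a word -> vector map).
    words = [w for w in article_words if w not in stop_words][:max_words]
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return [0.0] * dim  # no known words: fall back to a zero vector
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]
```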
5. Precedence Relations Identification
In this section, we use concept prerequisites to help identify precedence relations between lecture videos in a MOOC. First, the lecture videos we used were all from the MOOC provider Coursera. For every lecture video, we got video captions with the toolkit coursera-dl (https://github.com/coursera-dl/coursera-dl). After that, we utilized another toolkit TextRazor to extract the main concepts from video captions. As mentioned above, the main concepts are all Wikipedia concepts.
However, to reduce the amount of computation, we extracted only a few concepts from the captions of each video. TextRazor associates each entity with, among other attributes, a relevance_score and a wikipedia_link (https://www.textrazor.com/docs/python#Entity). Here, the input text is the lecture video captions, and relevance_score estimates the relevance of the entity to that text. For the extracted concepts, we sort them in descending order of relevance_score and select the top k. We also set different values for the parameter k to study its impact on the prediction of video dependencies.
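A plausible sketch of this top-k selection is shown below. The dictionary keys mimic the `relevance_score` and `wikipedia_link` attributes described in the TextRazor documentation; deduplicating entities by their Wikipedia link is our assumption.

```python
def top_k_concepts(entities, k=6):
    # 'entities' mimics TextRazor output: dicts carrying a relevance score
    # and a Wikipedia link. Keep only linked entities, deduplicate by link,
    # and return the k most relevant ones.
    seen, ranked = set(), []
    for e in sorted(entities, key=lambda e: e["relevance_score"], reverse=True):
        link = e["wikipedia_link"]
        if link and link not in seen:
            seen.add(link)
            ranked.append(e)
    return ranked[:k]
```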
A MOOC often consists of a series of lecture videos. Given two lecture videos v_i and v_j, if v_j depends on v_i, then v_i must normally be ranked before v_j. There are three possibilities for the relation between the two videos: (1) v_j depends on v_i, meaning v_i contains some background knowledge for learning v_j; (2) the two lecture videos are related but have no precedence relation; and (3) they are unrelated.
The precedence relation between two lecture videos can be explained in terms of prerequisites among the concepts they address. If v_i contains some prerequisite concepts that are needed to learn v_j, then v_j depends on v_i; in fact, those prerequisite concepts are the background knowledge needed to learn v_j. Conversely, if there is no prerequisite relation among the concepts of the two videos, there is no precedence between them. Note that if two videos contain too many duplicate concepts, it is not easy to judge the precedence between them, so we defined a threshold: if the number of duplicate concepts in a video pair is greater than or equal to this threshold, the pair is not considered in our work.
Given a concept-based representation for each lecture video, we calculate the precedence score between two videos v_i and v_j as shown in equation (2). This score represents the degree to which video v_j depends on v_i. Ultimately, whether v_j depends on v_i is determined by equation (3).
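Because the scoring formula itself appears in the paper as an equation, the sketch below gives only one plausible reading of it: counting cross-video prerequisite pairs, normalizing, thresholding, and excluding pairs that share too many concepts. The function names and default parameter values are illustrative, not the paper's exact definitions.

```python
def precedence_score(concepts_i, concepts_j, is_prereq):
    # One plausible reading: the fraction of cross-video concept pairs
    # (a in v_i, b in v_j) for which a is a prerequisite of b.
    pairs = [(a, b) for a in concepts_i for b in concepts_j]
    hits = sum(1 for a, b in pairs if is_prereq(a, b))
    return hits / len(pairs) if pairs else 0.0

def depends_on(concepts_i, concepts_j, is_prereq, threshold=0.0, max_dup=3):
    # v_j depends on v_i when the score exceeds a threshold; pairs sharing
    # too many concepts are excluded, as described above.
    if len(set(concepts_i) & set(concepts_j)) >= max_dup:
        return False
    return precedence_score(concepts_i, concepts_j, is_prereq) > threshold
```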
6. Experiments
In this section, we evaluate our approach in two respects: the concept prerequisite learning method and the precedence relation mining method.
6.1. Concept Prerequisite Learning Experiments
First of all, we evaluate the performance of our concept prerequisite learning method on the AL-CPL dataset [24], which consists of binary-labeled concept pairs from four domains: data mining, geometry, physics, and precalculus. To perform concept prerequisite relation prediction experiments, we trained and tested the classifiers on concept pairs belonging to the same domain. Evaluation was performed using 5-fold cross-validation, and performance was quantified with Accuracy (A), Precision (P), Recall (R), F1-score (F1), and Area under the ROC Curve (AUC).
Besides, we compare our model with five state-of-the-art models:
(1) M1. This model was proposed by Miaschi et al. [17]; the authors tested a neural network (M1) that learns to classify the binary labels using two LSTM-based subnetworks. The embedding vectors of concepts A and B, i.e., V_A and V_B, were input into the two subnetworks, respectively. The two LSTM outputs were then concatenated and passed to a final dense layer (with Sigmoid activation).
(2) NN. To verify whether the LSTM part is useful, we also omitted it, directly concatenating A's and B's embedding vectors and passing them to a final dense layer (with Sigmoid activation).
(3) RefD. This baseline was proposed by Liang et al. [15]. For the weight of a referred article, we chose the TFIDF weight to calculate the RefD of two concepts, which performs better than the EQUAL weight.
(4) BERT. This model, developed by Angel [25], is a single-layer neural network over concept pairs and their Wikipedia descriptions, with embeddings extracted from the Italian-BERT model. We obtain two 768-dimensional embeddings for a concept pair, and the single-layer network maps them to a two-dimensional output space indicating whether the prerequisite relation holds.
(5) GE. Concept prerequisite relations can be viewed as a graph structure, and many models in the deep graph learning area can use node, edge, and neighbor information together to generate feature embeddings. In our experiments, we used a graph embedding (GE) method, PyTorch-BigGraph (PBG) [26], as a baseline to compare our model with graph learning methods.
Table 2 shows that in terms of average F1 scores, our method generally performs better than the baseline methods, gaining +4.8% over M1, +14.8% over NN, +25.3% over RefD, +1% over BERT, and +3.1% over GE. We also see that the performance of BERT is very close to ours; more specifically, BERT has a higher F1 on the precalculus domain, which indicates that fine-tuned BERT is sufficient to capture the prerequisites between concepts in this domain.
The proposed method is superior to the M1 method because M1 only considers the text content of Wikipedia concepts and ignores the fact that related concepts can also provide rich background knowledge for understanding a concept. The proposed method is also better than the NN method; we return to the reason below. The RefD method performs poorly because it uses only one feature to predict prerequisite relations. Furthermore, thanks to pre-training on large-scale corpora, both BERT and GE perform well in concept prerequisite learning; but compared with our method, these models lack the word embeddings of the main concepts, so our method performs better in several domains, such as data mining, geometry, and physics.
On the other hand, because we use an LSTM model in our proposed method, we also investigate two questions about it: Q1. Does the LSTM model help predict the prerequisite relations between concepts? Q2. Is the four-input LSTM model better than the two-input LSTM model?
We use two baseline methods, NN and M1, to investigate these questions. The NN method has no LSTM: given a pair of concepts (A, B), the embedding vectors of A and B are directly concatenated and then passed to a final dense layer to predict whether there is a prerequisite relation between the two concepts. In the M1 method [17], the embedding vectors of A and B are input into two LSTM-based subnetworks, whose outputs are concatenated and passed to a final dense layer.
Table 2 suggests that both the proposed method and the M1 method perform better than the NN method with respect to F1-score across all four datasets, showing that the LSTM model does provide useful features for predicting prerequisite relations between concepts. Furthermore, the proposed method also outperforms the M1 method on all four datasets with respect to F1-score, which means that the two additional LSTM subnetworks for the related-set embedding vectors V_RA and V_RB are also useful for the concept prerequisite learning task.
Besides, we also calculated the AUPRC for the three methods. AUPRC, the area under the precision-recall curve, is a useful performance metric for settings where finding positive examples matters most (https://glassboxmedicine.com/2019/03/02/measuring-performance-auprc/). Figure 2 shows the AUPRC of the three methods on the four datasets. Again, the proposed method performs best on all four datasets; for example, its AUPRC on Data Mining outperforms M1 and NN by 4% and 12%, respectively. We also observe that our method and the M1 method perform very similarly on both the Geometry and Precalculus datasets. Looking closely at these two datasets, we find that the related concept sets of the two concepts in a pair, i.e., R_A and R_B, have a high degree of overlap; as a result, V_RA and V_RB contribute little to generating higher-level features. This is why the four-input and two-input LSTM models perform relatively similarly on the two datasets. Table 3 gives two examples of overlapping related concepts.
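For reference, AUPRC can be estimated as average precision: walking the ranking from the highest score down and summing precision at each positive example weighted by the recall increment. The sketch below is a minimal illustration of that estimator, not the evaluation code used in the paper.

```python
def auprc(labels, scores):
    # Average-precision estimate of the area under the precision-recall
    # curve: accumulate precision * (recall increment) at each positive.
    ranked = sorted(zip(scores, labels), reverse=True)
    total_pos = sum(labels)
    tp = fp = 0
    area = 0.0
    for _, label in ranked:
        if label:
            tp += 1
            area += (tp / (tp + fp)) * (1 / total_pos)
        else:
            fp += 1
    return area
```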

6.2. Precedence Relations Mining Experiments
We selected four MOOCs from Coursera: "Food and Health" (https://www.coursera.org/learn/food-and-health), "Algorithms" (https://www.coursera.org/learn/algorithms-part1), "Calculus" (https://www.coursera.org/learn/introduction-to-calculus), and "Global History" (https://www.coursera.org/learn/modern-world-2). The number of lecture videos in these MOOCs is 29, 59, 63, and 51, respectively. Table 4 shows information about some lecture videos of the MOOC Algorithms: the first column gives the catalog number of the lecture video in the MOOC, the second column gives the video title, and the last column lists some of the main concepts of each video. Moreover, we set the parameter k to 6; in other words, we select the top 6 main concepts from every lecture video.
We generated video pairs (v_i, v_j) for every MOOC. Then, to obtain the ground truth of precedence relations for all video pairs in the MOOCs, we asked students of the corresponding majors to vote on the final labels. Given a pair of videos (v_i, v_j), students had three options: (1) v_j depends on v_i ("1"); (2) v_j does not depend on v_i ("0"); (3) do not know. If no option received a majority vote, or if option (3) did, the pair was discarded. The datasets have been published on GitHub (https://github.com/Morganbyh/MOOC).
In Table 5, we report the Accuracy (A), Precision (P), Recall (R), and F1-score (F1) obtained for the MOOCs. From Table 5, we can see that the number of precedence relations in "Food and Health" is the smallest, with positive samples accounting for less than 10%; in "Calculus," by contrast, the proportion of positive samples is close to 50%, and in the other two MOOCs it is about 20%.
Further observation revealed that "Food and Health" presents only simple common-sense knowledge, so there are few precedence relations between its lecture videos. This limits our prediction of positive samples: the course has the lowest recall, 71.24%. Its accuracy, however, is generally better than that of the other three domains, which we attribute to the large number of negative samples and the class imbalance. The highest F1 score, 81.82%, appears in "Calculus"; for the other two MOOCs, the F1 scores are all over 78%. Overall, our method successfully identifies the precedence relations of MOOC lecture videos in four domains.
On the other hand, we also want to know whether the value of parameter k has any effect on the prediction of video precedence relations. Figure 3 shows the effect of different values of k (1, 3, 6, 10, and 15) on the prediction tasks. For every lecture video, we extracted k main concepts from their captions and used the concept prerequisite learning results to identify the lecture video precedence relations.

We extracted 279, 414, 408, and 611 concepts from the four MOOCs, respectively (k = 15); when k is set to 1, 3, 6, or 10, we use only some of these concepts. Across the four domains, we can see that when k is too large or too small, the performance of precedence relation prediction suffers. With k set to 6, our prediction performance is best in three MOOCs: "Food and Health," "Algorithms," and "Calculus." For "Global History," prediction is best with k set to 3, but performance with k set to 6 is also good.
In fact, when k is set to 1, we may miss some of the main concepts of a lecture video; when k is set to 15, we may treat unimportant words, whose relevance scores may be very low, as main concepts of the video. Take "Algorithms" as an example: with k = 15, general concepts such as "Data structures," "Algorithms," "Classes," "Sorting," and "Arrays" appear in every lecture video, and these general concepts are not helpful for precedence relation prediction.
7. Conclusions and Future Work
We studied the problem of automatically mining precedence relations between lecture videos in a MOOC. We extracted the main concepts from video captions and analyzed the prerequisite relations between the concepts. After that, the concept prerequisite relations were used to infer the precedence relations between lecture videos. Experiments showed that our concept prerequisite learning method outperforms the existing methods and helps accurately identify the precedence relations between lecture videos in a MOOC.
A promising future direction is to use Wikipedia-independent methods to measure prerequisite relations among concepts, so that more accurate main concepts can be selected from video captions for precedence relation identification. Besides, learner behaviour could also be used to identify precedence relations between lecture videos: when users perform personalized learning on a MOOC platform, the order in which they watch videos often implies the precedence relations between them.
Data Availability
The CSV data used to support the findings of this study have been deposited in the GitHub repository (https://github.com/Morganbyh/MOOC). These data are the results of user voting for the video precedence relations. There are no restrictions on access to these data.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61977021), the Technology Innovation Special Program of Hubei Province (Nos. 2018ACA133 and 2019ACA144), and the Teaching Research Project of Hubei University (No. 202008).