Abstract

In the traditional English teaching mode, the teacher imparts textbook knowledge and students are expected to understand and absorb it. This method has obvious problems: it is difficult to ensure that students quickly understand the content taught, and fast feedback is not available. The video feedback method combines English teaching with audiovisual technology. Teachers record key English knowledge on video and then use the video to give feedback on and explain the learning. An intelligent video feedback system can also greatly improve the teaching quality of the English classroom and make students’ learning more fun. This paper designs an English teaching system based on the video feedback method and implements an intelligent feedback scheme for it. A neural network method, a classification method, and video shooting technology are used to extract and predict the characteristics of students’ classroom expressions, speech, and so on, which are then analyzed through the video feedback system of the English classroom. The results show that the classification method proposed in this study classifies well the student expressions, speech, and body movements collected by the English teaching system, and that the neural network method predicts and feeds back the teaching content accurately: the largest error is only 2.98%, and the linear correlation exceeds 0.98. The minimum error of the video feedback information is only 0.95%, and the prediction errors of the other two kinds of English classroom information are also within 2%. The classification and prediction of intelligent video feedback information have achieved good results.

1. Introduction

In the traditional English teaching method, teachers teach from textbooks and teaching plans by writing on the blackboard, and students learn by listening and taking notes. Because English is a difficult subject to teach relative to other subjects, students easily lose interest in it [1]. This method also struggles to ensure high efficiency, and it cannot reflect the effectiveness of teachers’ teaching methods in a timely way. In recent years, with the development of computer hardware technology [2], computer-aided systems have changed the way English is taught, and teaching efficiency has improved to a certain extent [3]. Compared with the traditional English teaching mode, however, this mode only changes the medium of teaching, and it has its own defects. It cannot provide timely feedback according to the actual situation in the English classroom [4], nor can it record the actions that students show and give feedback on them. Both methods limit further improvement of English teaching efficiency and of students’ interest in learning English. The video feedback method is a new English teaching mode that addresses these disadvantages of the traditional teaching mode [5].

The video feedback method has been widely used in other fields, such as tennis teaching and traffic teaching classrooms, where it has shown feasibility and advantages [6]. It is a teaching feedback method based on audiovisual technology: it records the teaching content to form material for classroom use and supports functions such as playback, zooming, and slow playback [7]. Applied to the English classroom, the method records a lesson, and the teacher then analyzes and slows down the English video with computer assistance. For difficult content and knowledge points, teachers can teach by pausing, slowing down, and looping the video, which can greatly improve students’ memory and interest in learning [8]. At the same time, machine learning methods have developed rapidly in recent years, and many intelligent classification and prediction methods have emerged. The English teaching system can combine video feedback with intelligent classification and prediction to realize intelligent video feedback teaching [9]. In this way, the responses of students and teachers in the English classroom can be recorded in real time, and these responses can be predicted and analyzed through the terminal of the intelligent computer-aided system.

In recent years, the English teaching mode has undergone great changes, driven by the search for a teaching mode that is more efficient and better suited to students, and a lot of research has been carried out on English teaching methods [10]. Xu and Tsai [11] took the difficulties of college students’ English vocabulary learning as their starting point; they analyzed the relationship between college students and electronic media and the impact of an interactive teaching mode on college students’ vocabulary memory. They also studied the theory of a multimedia-based interactive English teaching mode and an intervention model for English learning adaptation. Zhao [12] argues that oral English teaching is the weakest part of multimedia teaching. In response, he proposed a data mining teaching model that first uses a DBN network to send information to a DBN-DELM network, which significantly improves the efficiency of the multimedia English teaching mode and also increases interest in learning pronunciation and accent in oral English teaching. In view of the problems in computer English teaching, Liu [13] used the SPOC method to analyze the problems in IT English teaching from the perspectives of collecting student data, uploading relevant English resources, and improving teaching design. He concluded that the proposed SPOC flipped English teaching method improves the effect of daily English teaching as well as students’ satisfaction with the English classroom and their daily English learning time. Li [14] holds that the level of oral English teaching represents the level of English teaching and that good oral ability is a measure for evaluating English teaching. He used multi-interactive multimedia technology to take the environment, language sense, emotion, and other factors into consideration in English teaching. The two interactive modes of classroom interaction and vitality improve the interactivity of English teaching compared with the traditional oral English teaching mode. Zhang [15] used image recognition technology to design a new English teaching mode. He used a GAN-based image superresolution model to reconstruct superresolution images from low-resolution images in English teaching. He also established a mixed-sample spine regression model to estimate the behavioral characteristics of learning in English teaching, and a validation test showed the feasibility and accuracy of this method. Han [16] studied the feasibility of an online English teaching mode by combining deep learning methods with remote supervision, and he simulated and analyzed the application of supervised learning algorithms in English teaching. The feasibility of the model was verified from two perspectives, student evaluation and teacher evaluation, and the conclusion is that the model is suitable for online English teaching. Xie and Ma [17] improved the traditional MOOC English teaching model based on cloud computing and artificial intelligence technology, and they corrected and verified the MOOC model according to the situation in English cross-cultural communication and the needs of the online teaching model. The results show that the improved MOOC model can improve the efficiency of teaching English cross-cultural communication. Zhao [18] found that the demand for English in all walks of life is relatively high. In response, he proposed a SPOC English teaching model and verified it experimentally; the model has good applicability in vocational English teaching. Gao [19] introduced the data mining software Clementine into English teaching, extracted the information hidden in the English teaching evaluation system, and determined the minimum support and the minimum confidence. This method shows high performance and applicability in English teaching evaluation management and can better reflect the actual value of teachers.

The above review shows that current research on English teaching mainly focuses on computer-aided systems, and only a small amount of it involves the video feedback method. An English teaching mode based on the intelligent video feedback method addresses the shortcoming of the traditional one-way teaching model and forms a feedback learning system. Through the video feedback system, students can discover their defects and problems in learning English in a timely manner and correct them promptly [20]. Few English teaching modes of this kind exist. In view of this research status and the shortcomings of current English teaching, this paper designs an intelligent video feedback English teaching mode [21]. The English teaching system based on the video feedback method uses video recording technology to record the speech and body movements of the students and teachers in a course, together with the teaching courseware content, to form a video [22]. These videos are then classified, intelligently predicted, and analyzed through computer-aided systems and intelligent Internet technology, and the recorded videos are shown to students [23]. Teachers can slow down and play back the important and difficult parts of the videos to achieve English teaching feedback, and intelligent Internet technology can provide students with intelligent algorithms and knowledge from the Internet [24].

This paper designs an English teaching system based on the video feedback method and intelligent algorithms and evaluates its predictions. The first part introduces the defects of existing English teaching modes and the advantages of the video feedback method for English teaching [25]. The second part introduces the significance of intelligent video feedback technology for English teaching. The third part explains the method and process of the English teaching mode based on the video feedback method. The fourth part analyzes the feasibility of intelligent video feedback technology in English teaching from the perspective of accuracy and error, and the last part summarizes the article.

2. The Significance and Necessity of Intelligent Video Feedback Method for English Teaching and Data Sources

2.1. The Necessity of Intelligent Video Feedback Method to Improve English Teaching

Video feedback is a new technology that integrates audiovisual technology. Its development benefits from the rapid development of computer hardware and streaming media technology [26]. The video feedback method has been applied in many fields with good results. English is a relatively cumbersome subject, and it is difficult to stimulate students’ interest in it if we rely only on traditional teaching methods. As a language subject, it involves a lot of cumbersome grammatical information and expression habits, and it requires long-term persistence [27]. English teaching also contains many boring grammar points, sentences, and so on, which students must memorize and review constantly. A new English teaching mode that not only imparts English knowledge but also stimulates students’ interest in learning would therefore be well suited to English teaching. Computer-aided teaching is a relatively new teaching mode [28]. It can transmit English knowledge [29] to students through video, but it has certain defects: it cannot deliver targeted teaching according to the difficult points of each student or of most students, and it cannot show students’ performance in real time [30]. The video feedback method records students’ performance and classroom content as video, which is then analyzed and shown to students in a targeted manner. The feedback mechanism of the video feedback method also integrates well with intelligent prediction and classification algorithms, which makes it easier to find solutions to similar problems according to students’ responses [31]. This method can not only analyze students’ performance effectively through video but also match students’ difficult points in a similar context, which greatly stimulates students’ interest in learning English, deepens their understanding of English, and improves their memory of trivial knowledge [32]. The traditional English teaching mode has been in use for many years, but today’s English teaching differs from the earlier teaching of grammar, sentence patterns, and so on: spoken English is now in greater demand. The expression of spoken English is closely related to students’ emotions and mouth shapes. The video feedback method can show students their own performance for error correction, which traditional teaching models cannot achieve.

2.2. Data Sources of English Video Feedback Teaching Analysis

This paper uses video technology to collect audiovisual information about students and teachers in the English classroom, together with the courseware content, and processes and saves it as video. This video information is then classified and predicted with intelligent algorithms, and the results are finally displayed to students and teachers through the computer-aided system terminal so that they can learn from and summarize the feedback. Video technology can collect students’ language and image information, such as speech and body movements. The intelligent algorithms need to extract features from this image and language information and perform intelligent prediction and classification. For the English classroom, the courseware content and the students’ verbal responses and expressions are the most critical. Therefore, this paper uses four types of information saved by the video technology as the data sources of the intelligent classification and prediction algorithm: the courseware content, the students’ expressions, the students’ physical movements, and the teachers’ words. The intelligent algorithm first classifies these four types of information with the decision tree method, then predicts on the classified information in combination with Internet technology, then analyzes the difficulties that students face in English learning, and finally feeds the results back through video technology so that students can study these difficult points effectively. This method can not only extract effective information from the students’ behavior and the courseware content but also analyze difficult points in combination with the Internet. Combined with the computer-aided system, it realizes a more targeted English teaching mode based on the video feedback method.

3. Algorithms and Technologies Used in Intelligent Video Feedback English Teaching Method

3.1. Introduction to the Video Feedback Method

The feedback teaching method combines system statistics theory, information theory, and cybernetics. It guides teachers and students toward a relatively harmonious teaching environment rather than the traditional teaching-listening mode. In this teaching mode, students spend time to obtain certain learning outcomes, these outcomes are fed back to the learners, and the learners improve their learning based on the results. The video feedback method is a special case of the feedback teaching method. It uses audiovisual technology to record the behavior and posture of students during the learning process. The teacher shows these videos to the students through the computer-aided system and guides and corrects them, and students adjust their learning styles based on this feedback. After students make adjustments, teachers record their learning behaviors again, which makes it possible to monitor students’ bad habits and poor learning outcomes and correct them in time. The video feedback English teaching method adopted in this study not only uses audiovisual technology to record students’ learning process and feed it back but also provides intelligent analysis of the feedback. The model automatically classifies the images captured by video technology and makes intelligent predictions on the classification results. Figure 1 shows the working mode of intelligent video feedback English teaching. First, teachers use video to record students’ speech, body movements, courseware content, and so on during learning as the data source of the intelligent analysis algorithm. The decision tree and intelligent prediction methods then give feedback on the learning effect, and the feedback results are displayed to students for viewing and analysis through the computer-aided system. This method realizes both feedback in the English teaching process and intelligent interaction in English teaching.

3.2. Classification Method Decision Tree for Intelligent Video Feedback

The video feedback method records students’ learning behaviors on video, and the recordings contain many types of data, such as students’ expressions, speech, and courseware. If these data are fed directly into the intelligent prediction model without effective classification, the results will be poor, which harms intelligent feedback in English teaching. It is therefore necessary to classify the student behavior data collected by video technology. Many classification methods perform well, such as clustering, decision trees, and support vector machines. Because the data in this study were collected through video technology and the feature types to be classified are relatively distinct (student speech, courseware teaching content, student expressions, and other behavioral information), this study adopts the decision tree to classify the video information in English teaching. The English classroom information collected by video is classified with the decision tree, the resulting classification features are input into the neural network for prediction, and the final aim is intelligently assisted video English teaching rather than the plain video feedback method. Through the intelligent video feedback method, relevant information can also be collected from the Internet, which makes students more interested in receiving information related to English teaching. Figure 2 shows a schematic diagram of classifying the information collected by video technology with the decision tree method. The English video information is divided into student speech, student expression, courseware content, and other information. At present, the most common classification algorithms in the field of English teaching are decision trees and clustering methods. Clustering methods are mainly distance-based or density-based. However, the student behavior information collected by the intelligent video feedback system in this paper differs clearly in its characteristics from one type to another, so the decision tree method is more suitable.

Entropy is a measure of uncertainty. The smaller the entropy, the better the classification effect of the decision tree, which makes entropy an important evaluation index of the decision tree. Equation (1) gives the expression of entropy.

Equation (2) gives the expression for conditional entropy, which measures the uncertainty of D given condition A. Conditional entropy is also an important evaluation index in decision trees.

Information gain describes the degree to which feature A reduces the uncertainty of the classification dataset D; it is the difference between the entropy of D and the conditional entropy of D given A. It is also an important indicator for the classification of English video information, especially for video feedback. Equation (3) gives the expression for the information gain.
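The three quantities just described can be illustrated with a minimal Python sketch. It uses the conventional definitions of entropy, conditional entropy, and information gain, which may differ in notation from the authors' equations. The toy labels and the feature names `has_audio` and `frame_parity` are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(D), in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def conditional_entropy(feature_values, labels):
    """H(D|A): entropy of the labels inside each value of feature A,
    weighted by the share of samples that take that value."""
    total = len(labels)
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    return sum(len(subset) / total * entropy(subset)
               for subset in subsets.values())


def information_gain(feature_values, labels):
    """g(D, A) = H(D) - H(D|A)."""
    return entropy(labels) - conditional_entropy(feature_values, labels)


# Hypothetical toy data: four video segments with two candidate features.
labels = ["speech", "speech", "expression", "expression"]
has_audio = [True, True, False, False]         # separates the classes fully
frame_parity = ["odd", "even", "odd", "even"]  # carries no class information

print(information_gain(has_audio, labels))
print(information_gain(frame_parity, labels))
```

A decision tree built on information gain would split on `has_audio` first, because that feature removes all uncertainty about the class in this toy example while `frame_parity` removes none.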

For the classification task of the English video feedback problem, the Gini index is defined over the probabilities of a sample point belonging to each class, and its expression is given in equation (4).

The samples of video information in English teaching are usually given as sets, and equation (5) gives the Gini index of a sample set.

Equation (6) gives the Gini index under a conditional distribution, which is the form commonly used for classification tasks with conditions.
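For reference, the six quantities described in this subsection are conventionally written as shown below. These are the standard textbook forms and not necessarily the authors' exact notation. The symbols are assumptions: p_k is the probability of class k, K is the number of classes, D_i is the subset of D in which feature A takes its i-th value, C_k is the set of samples in D that belong to class k, and D_1 and D_2 are the two parts of D produced by a binary split on A. The numbering (1) to (6) follows the order in which the quantities are described in the text.

```latex
\begin{align}
H(D) &= -\sum_{k=1}^{K} p_k \log_2 p_k \tag{1}\\
H(D \mid A) &= \sum_{i=1}^{n} \frac{|D_i|}{|D|}\, H(D_i) \tag{2}\\
g(D, A) &= H(D) - H(D \mid A) \tag{3}\\
\operatorname{Gini}(p) &= \sum_{k=1}^{K} p_k (1 - p_k) = 1 - \sum_{k=1}^{K} p_k^{2} \tag{4}\\
\operatorname{Gini}(D) &= 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^{2} \tag{5}\\
\operatorname{Gini}(D, A) &= \frac{|D_1|}{|D|} \operatorname{Gini}(D_1) + \frac{|D_2|}{|D|} \operatorname{Gini}(D_2) \tag{6}
\end{align}
```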

3.3. The Prediction and Extraction Method of English Teaching Video Feedback Information

This research aims not only to realize an English teaching mode based on the video feedback method but also to make the video feedback intelligent by combining it with intelligent algorithms. The student behavior information and courseware content collected with audiovisual technology in the English classroom are classified, the behavior features are extracted by intelligent algorithms, and the information is then fed back to students. The intelligent prediction method adopted in this paper is the convolutional neural network. Convolutional neural networks have been shown to have clear advantages in feature extraction, and deeper networks can extract video features more accurately. The input of the convolutional neural network is the classified English classroom video information produced by the decision tree method in the previous section, and the output is an analysis of the students’ effective behavioral characteristics, which serves as the final result of the video feedback method for students and teachers to use as a reference. Figure 3 shows the prediction process of video feedback information for English teaching through a convolutional neural network.

A convolutional neural network can be seen as a special form of perceptron: it also applies a nonlinear operation to its inputs using weights and biases. Equation (7) shows the operation between layers of the convolutional neural network: the products of the input data and the weights are summed, the bias is added, and the result is passed through an activation function. The terms of equation (7) are the input data, the weights of the model, and the mapping f.
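The layer operation described above (weighted sum of inputs, plus a bias, passed through an activation function) can be sketched for a one-dimensional convolution as follows. This is an illustration under stated assumptions and not the authors' network: the function names `relu` and `conv1d`, the choice of ReLU as the activation, and the example signal and kernel are all hypothetical.

```python
def relu(z):
    """Rectified linear unit, used here as an example activation f."""
    return max(0.0, z)


def conv1d(signal, kernel, bias=0.0, activation=relu):
    """Slide the kernel over the signal without padding. Each output is
    activation(sum_j kernel[j] * signal[i + j] + bias). As in most deep
    learning libraries, the kernel is not flipped (cross-correlation)."""
    k = len(kernel)
    return [activation(sum(kernel[j] * signal[i + j] for j in range(k)) + bias)
            for i in range(len(signal) - k + 1)]


# Hypothetical example: a difference kernel that responds to rising values.
features = conv1d([1, 3, 2, 5], kernel=[-1, 1])
print(features)  # [2.0, 0.0, 3.0]
```

The same weights are reused at every position of the input, which is what distinguishes a convolutional layer from a fully connected perceptron layer.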

A convolutional neural network contains a pooling layer, whose purpose is to keep strongly correlated features so as to reduce the amount of computation and the risk of overfitting. The pooling layer generally works in one of two ways, upsampling or downsampling. Equations (8) and (9) illustrate the process of upsampling and downsampling, respectively.
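As a small illustration of the two sampling directions, the sketch below uses max pooling for downsampling and nearest-neighbour repetition for upsampling. Both are common choices but are assumptions here, since the text does not say which operators the authors use; the function names are hypothetical.

```python
def max_pool1d(values, size=2):
    """Downsample by keeping the maximum of each non-overlapping window.
    Trailing values that do not fill a whole window are dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def upsample_nearest1d(values, factor=2):
    """Upsample by repeating every value `factor` times."""
    return [v for v in values for _ in range(factor)]


pooled = max_pool1d([1, 4, 2, 3, 5, 0])
print(pooled)                      # [4, 3, 5]
print(upsample_nearest1d(pooled))  # [4, 4, 3, 3, 5, 5]
```

Downsampling halves the length of the feature sequence in this example, which is where the reduction in computation comes from.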

The following equation illustrates the automatic derivative operation function during the convolution operation:

Student speech, a common information feature in English classrooms, has obvious temporal characteristics. Equations (11) and (12) illustrate the processing flow for the temporal feature of student speech, which can memorize historical information. W denotes the weights of the model, and the remaining terms are the state data at the current moment and the historical state information data.
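One simple way to carry historical information forward through a sequence is a recurrent state update, sketched below for scalar inputs. This is a generic Elman-style recurrence chosen for illustration; the text does not state which recurrent form the authors use, so the update rule, the function names, and the weight values `w_x`, `w_h`, and `b` are all assumptions.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent update: the new state mixes the current input x_t
    with the previous state h_prev, which holds the history so far."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0, h0=0.0):
    """Apply rnn_step along the sequence and return every state."""
    states = []
    h = h0
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


# Hypothetical speech feature: one burst of activity followed by silence.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

The states stay positive after the input has returned to zero and decay step by step, which is the sense in which the recurrence remembers earlier inputs.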

3.4. Data Processing of English Teaching Video Feedback System

As the preceding three sections show, the video feedback English teaching method designed in this study is not a plain video feedback scheme but an intelligent video feedback system. Two intelligent algorithms are involved, so the data collected by the video feedback system must be normalized, including feature normalization. The system collects information such as the words, expressions, and courseware content in the English classroom, and these features clearly do not share the same order of magnitude. To feed back students’ English classroom behavior more accurately, the video data must be processed during intelligent classification and prediction so that the features follow the same distribution and lie in the same magnitude range. In this study, the standard normal distribution method is used to preprocess the data of the English video feedback system so that the weights are better distributed.
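If the standard normal distribution method is read as z-score standardization (an assumption, since the exact procedure is not spelled out), each feature is shifted to zero mean and scaled to unit standard deviation. A minimal sketch follows; the function name `standardize` and the sample values are hypothetical.

```python
import statistics


def standardize(values):
    """Z-score standardization: subtract the mean and divide by the
    population standard deviation. A constant feature maps to zeros."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical raw features on very different scales.
speech_energy = [1200.0, 1500.0, 900.0, 1800.0]
expression_score = [0.2, 0.5, 0.1, 0.8]

print(standardize(speech_energy))
print(standardize(expression_score))
```

After this step both features have mean 0 and standard deviation 1, so neither dominates the other merely because of its units.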

4. The Feasibility and Accuracy Analysis of Intelligent Video Feedback English Teaching Method

In the English teaching mode based on the video feedback method, the decision tree method classifies the image features and language features, and the students’ expressions, speech, and courseware content are classified well. This is the first step in realizing the intelligent video feedback system. Figure 4 shows a schematic diagram of the classification of the four features collected by the video feedback system. The classification errors of all English classroom information features are within an acceptable range, and the maximum error is only 2.98%. This error comes from student speech: because students’ speech characteristics change constantly with time, courseware content, and the students’ emotional state, this feature is harder to classify. The classification errors for the other three types of features, mainly courseware content and student expressions and actions, are all within 2%. The correlation between these features is relatively large and their variability over time is relatively small, so their classification errors are relatively small, and the smallest error is only 0.95%. Compared with the behavioral information of students, the content of the courseware used in teaching changes little over time, and its error is 1.79%; the error of the remaining feature is only 1.48%. To reduce the classification error of students’ speech features, the weight of this part can be increased appropriately, or a larger proportion of speech data can be collected to increase the number of sample features. Figure 5 shows the distribution of hotspots after classification of the four different English classroom features. The distribution of hotspots at different locations and time points is relatively consistent. In general, the feature information of the English video feedback system can be classified well by the decision tree.

After the feature information of the English teaching video feedback system is classified by the decision tree method, it is input into the intelligent prediction system to be matched with information on the Internet. This part presents the prediction accuracy for the English teaching video feedback features. Figure 6 shows the trend of the predicted and actual values of the students’ speech features in the English teaching video feedback system. Student speech was chosen for prediction because it is the feature that is classified least accurately and because it changes with time. Figure 6 shows that the predicted student speech agrees well with the actual student speech feature, both in the overall trend and in the peaks and troughs. Overall, the intelligent prediction algorithm matches the characteristics of the students’ speech information collected by the system well. Figure 7 shows the predicted curve and the actual curve of the student speech feature. It shows more intuitively that the error between the two is relatively small; that is, the red area is relatively small. The error in students’ speech is concentrated mainly in the latter part of the video information, which is mainly due to error accumulating over time. In general, however, this error is acceptable for an intelligent video feedback system for English teaching. Figure 7 also shows that the error is relatively large at the inflection points of the curve, which may be caused by variables that change greatly over time in the English classroom, such as students’ behavior information.

Linear correlation is an important indicator of the prediction accuracy of the video feedback system. It reflects intuitively how far the predicted student speech deviates from the actual student speech, which helps ensure that the video feedback system reflects students’ real English learning situation. Figure 8 plots the linear correlation between the predicted and actual speech of the students. The two sets of data maintain a good linear correlation, and the data points are well distributed on both sides of the linear function, which shows that this intelligent algorithm is suitable for intelligent video feedback systems. Figure 8 also shows that the correlation between the predicted video feedback information of the students’ speech and the actual data exceeds 0.98, which shows that the model has good accuracy in the English teaching video intelligent feedback system. Figure 9 shows the normal distribution of the feature information prediction of the English teaching video feedback system. The predicted values are well distributed within the normal interval, and the confidence level reaches 95%. This further supports the feasibility of this intelligent prediction algorithm in the English teaching video feedback system.
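The two accuracy measures used in this section, a percentage error and a linear correlation coefficient, can be computed as in the sketch below. It assumes the Pearson correlation coefficient and a maximum relative error; the text does not state the exact error definition, so both choices, the function names, and the sample values are assumptions and do not reproduce the reported results.

```python
import math


def pearson_r(predicted, actual):
    """Pearson linear correlation coefficient between two sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)


def max_relative_error(predicted, actual):
    """Largest |predicted - actual| / |actual| over all samples.
    Requires every actual value to be nonzero."""
    return max(abs(p - a) / abs(a) for p, a in zip(predicted, actual))


# Hypothetical values, not the study's data.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 4.4]

print(pearson_r(predicted, actual))
print(max_relative_error(predicted, actual))
```

A correlation close to 1 means the predicted and actual values rise and fall together, while the relative error measures how far individual predictions are from the actual values, so the two measures complement each other.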

5. Summary of Intelligent Video Feedback English Teaching Plan

In the traditional English teaching model, knowledge passes directly from teacher to student, which makes it difficult to ensure that every student learns English efficiently, especially in a subject as tedious as English. Computer-aided systems only change the mode of teaching and cannot reflect the actual learning situation of students. The video feedback method is a learning feedback method built on audiovisual technology. Students and teachers can analyze their own learning status and achievements, give timely feedback, and then correct the shortcomings in the English learning process.

Based on the characteristics of the video feedback method and of English as a subject, this research designs an intelligent English teaching video feedback scheme and classifies and predicts the characteristics of the video information. Teachers use cameras to record students’ speech, students’ facial expressions, and courseware content in English class; the information obtained from the recording is then matched to the actual situation of students through the decision tree and the intelligent prediction method; and finally the computer-aided system outputs the feedback information so that students can conduct targeted analysis and corrections. Both the decision tree and the intelligent prediction method are well suited to the classification and prediction of English teaching video feedback system information. The maximum error of decision tree classification is only 2.98%, which appears in the behavioral feature of students’ speech. For the prediction of video information, the intelligent prediction algorithm matches both the trend and the peak and valley values of the actual students’ speech characteristics, and the correlation coefficient also reaches a good level.

Data Availability

The dataset can be accessed upon request to the corresponding author.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This study was supported by the 2021 Undergraduate Education Reform Project of Guangxi, Research and Practice of Ideological and Political Education of College English Courses in Application-Oriented Colleges and Universities in Guangxi (no. 2021JGA346).