Abstract

Current sports video image tracking methods cannot obtain the average gray information of pixel neighborhoods, which leads to low segmentation accuracy of the target trajectory. Therefore, this paper proposes a method for enhancing the motion trajectory of sports video targets based on the OTSU algorithm. The OTSU algorithm is introduced to segment images automatically and without supervision based on threshold calculation. Combined with a three-dimensional histogram, the OTSU algorithm is optimized to overcome the influence of noise and nonuniform illumination and to improve segmentation performance. Using the optimized OTSU algorithm, sports video images are preprocessed by gray level transformation, median filtering, and binarization. On this basis, the target motion model of the sports video image is constructed and the edge information of the sports video image is extracted. Using binocular stereo vision technology, the motion trajectory is calculated according to the image target motion model and the current space coordinates, and the target trajectory parameters of the sports video image are obtained. Experimental results show that the proposed method can segment the target accurately in a shorter time, the enhancement effect on the target trajectory is obvious, and the error is small.

1. Introduction

With the development of sports in China, there are more and more large-scale sports events and a large number of sports videos. Analyzing sports video to find the weaknesses of opponents or of one's own team is of great practical significance for formulating training plans and competition schemes. Sports target tracking is an important aspect of sports video analysis and has attracted great attention. In athlete training videos, coaches can obtain human motion parameters by using processing methods such as segmentation and tracking. In competition videos, a task that attracts the attention of professionals is target detection, tracking, and trajectory analysis [1]. For example, for offside events in a football game, we can track the ball and players, obtain their positions, and then show the passing route and process in a reconstructed panorama so as to detect the offside event. The distance between the ball and the goal can also be used to judge possible shooting events [2].

In order to further study the motion trajectory of sports video targets, scholars in related fields have obtained some good research results. Reference [3] uses the ability of Faster R-CNN to quickly locate and extract features in order to quickly determine the target area. On this basis, a distance-plus-threshold-limit method is proposed for target association. This method also exploits the ability of convolutional neural networks to extract image features efficiently. While ensuring trajectory extraction accuracy, it reduces the time and amount of data needed to determine the target area and further reduces target association time by uniformly extracting reference points and associating targets with them. However, when recognizing sports images, this method cannot obtain the average gray information of pixel neighborhoods, so the enhancement effect on the tracked trajectory image is not obvious. Reference [4] first estimates the speed from the data collected by an RFID reader and a laser sensor, smooths the speed estimated by the laser sensor using a weighting factor, and then matches the moving target speeds obtained by the laser sensor and the RFID reader through a similarity comparison. A particle filter is used to fuse the moving target velocities obtained by the laser sensor and the RF sensor, and accurate trajectory tracking is realized through the three particle filter stages of prediction, update, and resampling. However, this method cannot accurately identify targets in competition videos and has low practicability for assisting coaches in formulating tactics. Reference [5] proposed a contour feature extraction method for multiple moving images based on parallax information. Based on the traditional snake model, the proposed method takes the parallax between different control points and centers as the contraction and expansion forces of the active contour model, which effectively helps the initial edge contour curve approach the real edge contour gradually. Extracting the initial edge contour a second time and taking it as the initial contour of the model realizes effective convergence of the image contour in concave regions of the target. However, the segmentation accuracy of the target motion trajectory is low.

In order to help coaches design team formations, attack routes, passing routes, and other tactics, and to improve the tracking of sports video image target trajectories, a sports video image target trajectory enhancement method based on the OTSU algorithm is proposed.

2. OTSU Algorithm Optimization

The OTSU algorithm [6] is an automatic, unsupervised image segmentation algorithm based on threshold calculation, which is simple to compute and has a clear physical meaning. The basic idea of the algorithm is to determine the image segmentation threshold dynamically: using the different gray histograms of the target and the background, the corresponding binary image is obtained [7].

Assume that the gray levels of the digital image are $0, 1, \ldots, L-1$ and the number of pixels at gray level $i$ is expressed as $n_i$, so that the total number of pixels $N$ of the image can be obtained by the following calculation formula:

$$N = \sum_{i=0}^{L-1} n_i. \tag{1}$$

Set $p_i = n_i / N$ to represent the occurrence probability of pixels with gray level $i$ in the image. Normalizing the one-dimensional histogram yields its statistical characteristics:

$$\sum_{i=0}^{L-1} p_i = 1, \quad p_i \ge 0. \tag{2}$$

According to the statistical characteristics of the one-dimensional histogram of the image, a one-dimensional OTSU algorithm is obtained: the gray value at which the difference between target and background is largest is the best segmentation threshold of the image [8, 9]. Although the one-dimensional OTSU algorithm processes images quickly, it is difficult to achieve a satisfactory segmentation effect under the influence of noise and uneven illumination. To prevent this, this paper adopts a three-dimensional histogram OTSU algorithm, which combines the average gray value of each pixel's neighborhood with the original one-dimensional gray value. Introducing the average gray information of the pixel neighborhood reduces the influence of image interference and improves the quality of image segmentation. The specific implementation process of the three-dimensional histogram OTSU algorithm is as follows:

The gray levels of the image are $0, 1, \ldots, L-1$ and its size is $M \times N$, so the corresponding gray value $f(x, y)$ can be obtained for any pixel point $(x, y)$ in the image.

The preprocessed image is scanned correspondingly, and a relatively smooth image is obtained from the neighborhood values of all pixels in the image; the smoothed value at each image point is set as $g(x, y)$. Suppose all pixels are divided into two parts, namely, the target part and the background area, marked as $C_0$ and $C_1$, in which $(s, t, q)$ is expressed as the 3D threshold vector to be segmented; then, the gray threshold and the neighborhood gray thresholds can be obtained by calculation [10]. The 3D histogram is divided into four parts, and the expressions for calculating the probabilities of occurrence of the target and background areas are

$$w_0(s, t, q) = \sum_{(i, j, k) \in C_0} p_{ijk}, \quad w_1(s, t, q) = \sum_{(i, j, k) \in C_1} p_{ijk}. \tag{3}$$

The optimal threshold vector $(s^*, t^*, q^*)$ satisfies

$$(s^*, t^*, q^*) = \arg\max_{0 \le s,\, t,\, q < L} \operatorname{tr} S_B(s, t, q), \tag{4}$$

where $S_B$ is the between-class scatter matrix.

Compared with OTSU algorithm based on one-dimensional histogram, OTSU algorithm based on three-dimensional histogram can effectively overcome the influence of noise and nonuniform illumination.
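As an illustrative sketch of this idea, the fragment below builds the joint histogram of each pixel's gray value and its 3×3 neighborhood average, i.e., a two-dimensional slice of the three-dimensional histogram described above (the paper does not specify the third histogram component, so it is omitted here). The OTSU search would then run over threshold pairs on this histogram. Function names and the 256-level assumption are illustrative.

```python
import numpy as np

def neighborhood_mean(img):
    """3x3 neighborhood average gray value g(x, y), with edge replication."""
    p = np.pad(img.astype(float), 1, mode="edge")
    acc = sum(p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
              for dy in range(3) for dx in range(3))
    return acc / 9.0

def joint_histogram(img):
    """Normalized joint histogram of (pixel gray f, neighborhood mean g),
    256 x 256 bins, for an 8-bit grayscale image."""
    f = img.ravel()
    g = neighborhood_mean(img).round().astype(int).clip(0, 255).ravel()
    h = np.zeros((256, 256))
    np.add.at(h, (f, g), 1)   # accumulate joint counts
    return h / h.sum()
```

On a clean image, mass concentrates near the diagonal f = g; noise pixels fall off the diagonal, which is what makes thresholding on this joint distribution more robust than on the 1D histogram alone.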

3. Image Preprocessing Based on OTSU Threshold

Because of the complexity of the edge information in sports video images, it is necessary to preprocess the image to remove redundant information and noise.

3.1. Gray Level Transformation

Gray level transformation is a very important means of image enhancement. It changes the gray value of each pixel in the original image point by point according to a certain target condition, enhancing the dynamic range of the image so that it has higher contrast and its main content is clear and prominent [11, 12]. Gray level transformation methods can be divided into linear and nonlinear transformations. This research adopts a piecewise linear transformation; the concrete transformation process is shown in Figure 1.

In Figure 1, the gray range of the original image is set to $[0, 255]$, and the gray range of interest is set to $[a, b]$. If the range $[a, b]$ is stretched to $[c, d]$ after the transformation, then the gray levels within $[a, b]$ are stretched and the gray levels in the ranges at both ends are compressed. The corresponding piecewise linear transformation formula is as follows:

$$f'(x) = \begin{cases} \dfrac{c}{a}\, x, & 0 \le x < a, \\[4pt] \dfrac{d - c}{b - a}\,(x - a) + c, & a \le x \le b, \\[4pt] \dfrac{255 - d}{255 - b}\,(x - b) + d, & b < x \le 255. \end{cases} \tag{5}$$

The three segments of the linear transformation in formula (5) correspond, respectively, to the stages $[0, a)$, $[a, b]$, and $(b, 255]$ of the original image in Figure 1, and the gray level transformation is realized according to the above process.
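A minimal sketch of this piecewise linear transformation, assuming an 8-bit image and user-chosen parameters with $0 < a < b < 255$ and $0 < c < d < 255$:

```python
import numpy as np

def piecewise_linear(img, a, b, c, d):
    """Stretch the gray range of interest [a, b] onto [c, d] and
    compress the two end ranges [0, a) and (b, 255]."""
    x = img.astype(float)
    out = np.empty_like(x)
    lo = x < a
    hi = x > b
    mid = ~lo & ~hi
    out[lo] = x[lo] * c / a                            # compress [0, a) -> [0, c)
    out[mid] = c + (x[mid] - a) * (d - c) / (b - a)    # stretch [a, b] -> [c, d]
    out[hi] = d + (x[hi] - b) * (255 - d) / (255 - b)  # compress (b, 255] -> (d, 255]
    return out.round().clip(0, 255).astype(np.uint8)
```

For example, with $a=50$, $b=150$, $c=30$, $d=220$, the mid-range contrast is roughly doubled while the dark and bright tails are compressed.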

3.2. Median Filtering

Median filtering is a nonlinear smoothing technique that sets the gray value of each pixel in a digital image to the median of all pixel values in a neighborhood window. An $n \times n$ median filter can remove noise clusters of no more than $(n^2 - 1)/2$ pixels in the neighborhood while preserving the edge information of the image relatively completely, so median filtering is widely used in the image processing field [13, 14].

A two-dimensional median filter usually uses a $3 \times 3$ or $5 \times 5$ sliding window and moves it in the order of the image data matrix. As the window moves, the gray values of the pixels inside it are sorted, and the gray value of the pixel at the center of the window is replaced by the median of these values.

Based on the sliding window, the flow of 2D median filtering is shown in Figure 2.

The implementation process of median filtering shown in Figure 2 is as follows: replace the value of a point in the image with the median value of the points in a neighborhood of that point (Figure 2(a)), so that the surrounding pixel values are close to the true values and isolated noise points are eliminated; that is, use a two-dimensional sliding template to sort the pixels in the template according to their values (Figure 2(b)), generating a monotonically rising or falling two-dimensional data sequence (Figure 2(c)).
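The sliding-window procedure of Figure 2 can be sketched as follows (a straightforward, unoptimized implementation with replicate padding at the image borders):

```python
import numpy as np

def median_filter(img, k=3):
    """Slide a k x k window over a 2D grayscale image and replace each
    pixel with the median of the window (replicate padding at borders)."""
    r = k // 2
    p = np.pad(img, r, mode="edge")
    h, w = img.shape
    out = np.empty_like(img)
    for y in range(h):
        for x in range(w):
            out[y, x] = np.median(p[y:y + k, x:x + k])
    return out
```

A single salt-noise pixel in an otherwise flat region is completely removed, since it can never be the median of its 3×3 window.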

3.3. Binary Processing

Binarization mainly divides the image into two parts, the target object and the background area, and is the basis for edge extraction. In view of the redundant information and noise in the image, it is necessary to select an appropriate threshold $T$ to divide the image accurately: pixels greater than $T$ form the target pixel set, and pixels less than $T$ form the background pixel set. The selection of the threshold is particularly important for binarization, which requires accurate segmentation and fast operation [15, 16].

The OTSU threshold algorithm is used to binarize the image. As an adaptive threshold determination method, the maximum between-class variance method segments the image into target object and background region according to the gray characteristics of the image. The optimal threshold is obtained by maximizing the between-class variance.

Set the gray levels of the image as $0, 1, \ldots, L-1$, let gray level $i$ contain $n_i$ pixel blocks, and let $N$ be the total number of pixels, so that the probability of gray level $i$ appearing is $p_i = n_i / N$. When $T$ is used as the gray threshold, the image pixels can be divided according to gray level into the background region $C_0 = \{0, 1, \ldots, T\}$ and the target object $C_1 = \{T+1, \ldots, L-1\}$. Thus, the probabilities of $C_0$ and $C_1$ appearing are as follows:

$$w_0 = \sum_{i=0}^{T} p_i, \tag{6}$$

$$w_1 = \sum_{i=T+1}^{L-1} p_i = 1 - w_0. \tag{7}$$

In formulas (6) and (7), $p_i$ represents the occurrence probability of gray level $i$.

Varying $T$ over $0 \le T < L$, the value that maximizes the between-class variance $\sigma_B^2(T) = w_0 w_1 (\mu_0 - \mu_1)^2$, where $\mu_0$ and $\mu_1$ are the mean gray levels of $C_0$ and $C_1$, becomes the optimal threshold, and expression (8) is as follows:

$$T^* = \arg\max_{0 \le T < L} \sigma_B^2(T). \tag{8}$$

According to the calculation of the best threshold in formula (8), judge whether the sports video image meets the preprocessing target. If it does, the image preprocessing is complete and the next analysis is carried out. If it does not, gray transformation, filtering, and the other processing steps are carried out again until an image meeting the expected target is output.
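The OTSU thresholding and binarization steps described above can be sketched as follows, assuming an 8-bit grayscale image; the exhaustive search over thresholds maximizes the between-class variance of expression (8):

```python
import numpy as np

def otsu_threshold(image):
    """Exhaustively search the threshold T* that maximizes the
    between-class variance w0 * w1 * (mu0 - mu1)^2."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()        # p_i: occurrence probability of level i
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0 = p[:t].sum()         # background probability
        w1 = 1.0 - w0            # target probability
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * p[t:]).sum() / w1
        var_b = w0 * w1 * (mu0 - mu1) ** 2
        if var_b > best_var:
            best_var, best_t = var_b, t
    return best_t

def binarize(image):
    """Split the image into target (255) and background (0) pixel sets."""
    t = otsu_threshold(image)
    return np.where(image > t, 255, 0).astype(np.uint8)
```

On a strongly bimodal image the chosen threshold falls between the two modes, separating target from background cleanly.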

4. OTSU Algorithm-Based Sports Video Image Target Motion Trajectory Enhancement Method

4.1. Build Sports Video Image Target Movement Model

In the process of sports video image target motion trajectory enhancement, it is necessary to analyze the principle of image target motion and build an image target motion model [17]. Because the trajectory enhancement results of the CV (constant velocity) model and the CA (constant acceleration) model are more accurate, these two models are used as the basis for the design of the motion trajectory enhancement method. In establishing the motion model, it should be noted that once the acceleration direction or velocity of the image target changes greatly, a strong maneuver model should be established to keep the model's motion correct [18, 19].

In this paper, the object of trajectory enhancement is the sports video image target. Since the image target moves at an approximately uniform speed during tracking, the motion state will not change even if the speed varies substantially for a short time due to special circumstances [20]. To sum up, the motion state of the sports video image target at moment $k$ is set as $X_k$, and the motion model can be expressed as

$$X_k = [\, s_k, \ v_k \,]^T. \tag{9}$$

The calculation result of the motion state in the above formula depends on the running speed of the image target at that time and the position of the sports video image target. According to the calculation method of uniform linear motion, the position and speed of the image target are converted to obtain the relationship between them:

$$s_k = s_{k-1} + v_{k-1}\, \Delta t + w_{k-1}. \tag{10}$$

In formula (10), $\Delta t$ represents the time change, $s_{k-1}$ and $v_{k-1}$ represent the position and velocity at time $k-1$, and $w_{k-1}$ represents Gaussian white noise. Based on the above formula, further study yields the following CV model:

$$X_k = F\, X_{k-1} + W_{k-1}, \qquad F = \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix}. \tag{11}$$

In formula (11), $F$ represents the image target motion state transition matrix, and $X_{k-1}$ represents the motion state estimate at the previous moment [21]. In one-dimensional space, the motion mode of the target in the sports video image may be uniform acceleration. Relying on the uniformly accelerated linear motion formula, the position and velocity conversion formula of the target in the uniform acceleration mode is obtained:

$$\begin{cases} s_k = s_{k-1} + v_{k-1}\, \Delta t + \dfrac{1}{2}\, a\, \Delta t^2, \\[4pt] v_k = v_{k-1} + a\, \Delta t. \end{cases} \tag{12}$$

In formula (12), $a$ represents the mean uniform acceleration, and $X_k = [\, s_k, \ v_k, \ a \,]^T$ defines the moving state of the target in the sports video at time $k$. Formulas (11) and (12), respectively, represent the most commonly used CV and CA motion models. Combining the two models according to the actual situation, the sports video object can achieve higher trajectory enhancement accuracy under both uniform motion and uniform acceleration conditions.
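The CV and CA models of formulas (11) and (12) can be written as state transition matrices. The sketch below assumes a one-dimensional position state, as in the text:

```python
import numpy as np

def cv_model(dt):
    """Constant-velocity (CV) transition matrix for state [position, velocity]:
    s' = s + v*dt, v' = v."""
    return np.array([[1.0, dt],
                     [0.0, 1.0]])

def ca_model(dt):
    """Constant-acceleration (CA) transition matrix for
    [position, velocity, acceleration]:
    s' = s + v*dt + 0.5*a*dt^2, v' = v + a*dt, a' = a."""
    return np.array([[1.0, dt, 0.5 * dt ** 2],
                     [0.0, 1.0, dt],
                     [0.0, 0.0, 1.0]])
```

Multiplying the state vector by the appropriate matrix propagates it one time step; a tracker can switch between the two matrices when the target's maneuvering level changes.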

4.2. Extracting Edge Information of Sports Video Images in the Process of Motion

The accurate tracking of sports video images is the precondition of precise control. For moving objects, the quality of the captured motion image is lower than that of static objects. Therefore, some means are needed to make the image clearer, which is of great significance for precise control of sports video images [22, 23]. In the process of trajectory tracking, the accuracy of edge information extraction is directly related to the tracking accuracy. Generally, interpolation algorithms are used to process edge pixels. Traditional linear interpolation algorithms use only a single kernel function to convolve the edge pixels during image interpolation, which causes the edge contour of the processed sports video image to be lost and makes the contour appear jagged or even blurred in a staircase pattern. Therefore, this paper uses several interpolation algorithms to extract the edge information during sports video image motion to make up for the shortcomings of the traditional algorithms [24].

In this paper, a first-order differential gradient operator with orientation is used to detect the edges of the original image, and edges are located from the gray differences between neighboring pixels [25]. To simplify the calculation, the pixels in the motion trajectory image of a sports video image are abstracted, and the operators in two of the eight directions are combined to form templates in four directions, as shown in Figure 3.

In Figure 3, the labeled white points are pixels to be interpolated and replaced, and the black points are useful information pixels; marking follows the principle of determining one information point. The traditional interpolation algorithm interpolates using the change rate of the points directly adjacent to the pixel. This paper focuses on an interpolation algorithm that follows the edge direction and information direction: interpolation of lost pixels must be carried out according to the direction of the edges [26]. According to the characteristics of cubic convolution and bicubic interpolation, first mark all edge pixels and judge their edge directions. The combined strong edge directions are shown as dotted lines in the figure above: dotted line D1 represents 90°, D2 45°, D3 135°, and D4 180°. Convolution is performed in these four directions. Four points in the neighborhood were selected to obtain pixel information along the cubic convolution direction, and the points to be replaced were interpolated. For nonedge points, the 16 points of a $4 \times 4$ neighborhood were selected for bicubic interpolation to obtain the pixel information of the points to be replaced. Considering the maneuverability of trajectory enhancement in practice, this paper reduced the resolution of the original trajectory map, converted the RGB color space into the YCbCr color space, and carried out edge detection on the Y, Cb, and Cr components, respectively, to extract the edge feature information of the image and mark the edge points. Then, the pixels to be interpolated and replaced are divided into textured direction pixels and weakly textured pixels, that is, points in the edge region and points in the inner region of the contour [27]. The lost pixels estimated by the first, second, and third scan-and-interpolation passes were recorded separately.
For a point in the edge region, determine whether it matches an edge; if it matches, judge the edge direction, and if a single direction exists, perform cubic convolution interpolation along that edge direction. If it does not match, or no edge direction exists, perform bicubic interpolation using the information within the neighborhood. For points inside the contour, interpolate using the information of the neighborhood pixels. Finally, fill the outermost edge of the image according to the bilinear interpolation method, and convert the color space from YCbCr back to RGB.

In order to select image edges more effectively, the original image is first smoothed with Gaussian filtering before edge extraction, which roughly preserves the contours of the image while removing image noise. The four directional gradient templates are then matched against the image: the template coefficients are multiplied by the corresponding pixel values and the products are summed. After all convolution operations over the image are finished, the maximum convolution value is selected, and a threshold is set according to the image. When the maximum convolution response at a point is greater than the threshold, the point is an edge point; this criterion is applied throughout, and all edge points are marked after edge extraction is complete. Through the above steps, the extraction of edge information in the process of sports video image movement is completed [28].
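The directional template matching described above can be sketched as follows. The four 3×3 first-order difference templates are a common choice; the paper does not give the exact operator coefficients, so these are illustrative:

```python
import numpy as np

# Four 3x3 first-order difference templates (roughly 90, 45, 0/180,
# and 135 degree edge orientations); coefficients are illustrative.
TEMPLATES = [
    np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]]),   # horizontal edge
    np.array([[-1, -1, 0], [-1, 0, 1], [0, 1, 1]]),   # 45-degree edge
    np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]),   # vertical edge
    np.array([[0, 1, 1], [-1, 0, 1], [-1, -1, 0]]),   # 135-degree edge
]

def edge_map(img, thresh):
    """Mark a pixel as an edge point when the maximum absolute response
    over the four directional templates exceeds the threshold."""
    h, w = img.shape
    x = img.astype(float)
    out = np.zeros((h, w), dtype=bool)
    for y in range(1, h - 1):
        for c in range(1, w - 1):
            patch = x[y - 1:y + 2, c - 1:c + 2]
            resp = max(abs((t * patch).sum()) for t in TEMPLATES)
            out[y, c] = resp > thresh
    return out
```

Taking the maximum over the four templates makes the detector respond to an edge regardless of its orientation, while flat regions stay below the threshold.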

4.3. Trajectory Enhancement Algorithm

In this paper, binocular stereo vision technology is taken as the core: according to the image target motion model and the current space coordinates, the motion trajectory is calculated, and the target motion trajectory parameters of the sports video image are obtained. According to the motion trajectory parameters, the motion trajectory enhancement algorithm is generated. Using the track position $s_0$ of the sports video image target and the running speed parameter $v_0$ of the image target, the above two parameters are taken as initial state parameters and substituted into the Kalman filter [29, 30]; the calculation formula of the initial state of the sports video image target is obtained as follows:

$$X_0 = [\, s_0, \ v_0 \,]^T. \tag{13}$$

In formula (13), $s_0$ represents the initial position of the image target, and $v_0$ represents the running speed of the image target. According to the calculation results of the initial state of the image target, the Kalman filter algorithm is used to obtain the real-time trajectory enhancement results of the image target [31].

The initial motion state of the sports video image target calculated from formula (13) is analyzed to predict the motion state of the image target and thereby obtain the target's running track. However, due to the noise in the prediction process, the current noise situation needs to be represented by the predicted covariance matrix $P_k^-$:

$$P_k^- = F\, P_{k-1}\, F^T + Q. \tag{14}$$

In formula (14), $Q$ represents the covariance of the process noise, and the measured value of the movement track of the image target at the current moment needs to take observation noise into account.

The relationship between the observed value $z_k$ and the predicted value of the trajectory of the image target is expressed through the observation matrix $H$, and the calculation formula is

$$z_k = H\, X_k + v_k. \tag{16}$$

In formula (16), $H X_k$ extracts the position of the image target, and $v_k$ is the observation noise with variance $R$. Based on the relationship between the predicted results and the actual trajectory, the measured values of the movement trajectory of the image target are updated:

$$X_k = X_k^- + K_k\,(z_k - H X_k^-), \qquad K_k = P_k^- H^T \left( H P_k^- H^T + R \right)^{-1}. \tag{17}$$

The covariance update formula is

$$P_k = (I - K_k H)\, P_k^-. \tag{18}$$

The measurement and update steps of the Kalman filter are comprehensively considered to correct the target motion trajectory of the sports video image until the optimal convergence state is reached [32–34]. In view of the low accuracy of the Kalman filter algorithm in the process of trajectory enhancement, this paper applies the strong tracking method to improve the accuracy of image target trajectory enhancement [35, 36]. In the strong tracking method, an attenuation factor $\lambda_k$ and a suboptimal scale factor matrix are used to represent the improved covariance:

$$P_k^- = \lambda_k\, F\, P_{k-1}\, F^T + Q. \tag{19}$$

In formula (19), the attenuation factor $\lambda_k$ is recorded as

$$\lambda_k = \max\!\left\{ 1,\ \frac{\operatorname{tr}(N_k)}{\operatorname{tr}(M_k)} \right\}, \tag{20}$$

where $N_k$ and $M_k$ are computed from the innovation sequence, following the standard strong tracking formulation.

The attenuation factor is introduced into the above formula, and the Kalman filter algorithm is improved by the strong tracking method to enhance the convergence of the trajectory enhancement algorithm and improve the accuracy of image target trajectory enhancement.
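One predict/update cycle of the Kalman filter with the attenuation factor of formula (19) can be sketched as follows. This is a simplified strong tracking step: the factor `lam` is supplied by the caller rather than computed from the innovation sequence, and `lam = 1` recovers the standard Kalman filter.

```python
import numpy as np

def strong_tracking_step(x, P, z, F, H, Q, R, lam=1.0):
    """One predict/update cycle of the Kalman filter, with an attenuation
    (fading) factor lam >= 1 inflating the predicted covariance."""
    # Predict
    x_pred = F @ x
    P_pred = lam * (F @ P @ F.T) + Q
    # Update
    S = H @ P_pred @ H.T + R                     # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)          # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred    # covariance update
    return x_new, P_new
```

Inflating the predicted covariance keeps the gain large when the target maneuvers, so the filter weights fresh measurements more heavily instead of trusting an outdated model.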

5. Experimental Results and Analysis

In order to verify the efficiency of the proposed method, MATLAB 2020b is adopted as the experimental platform; the hardware environment is a computer with 8 GB of memory and an Intel Core™ i7-6700 CPU with a main frequency of 3.4 GHz. The experimental samples are video images of football games. The proposed method is compared with the target trajectory extraction method based on Faster R-CNN proposed in reference [3] and the motion trajectory tracking method based on laser information and radio frequency identification proposed in reference [4] in terms of edge information extraction time, moving target recognition accuracy, and target motion trajectory extraction accuracy.

5.1. Detection Time Comparison

In order to further verify the effectiveness of the proposed method, the trajectory detection time is first tested, and the specific experimental results are shown in Figure 4.

As can be seen from Figure 4, the maximum running time of the target track extraction method based on Faster R-CNN reaches 9 ms, and the maximum running time of the motion track tracking method based on laser information and radio frequency identification reaches 5 ms. As the number of images increases, the running time of the proposed method never exceeds 2.5 ms. Comparative analysis proves that the proposed method can save a large amount of trajectory detection time and has obvious advantages in practical application.

5.2. Edge Extraction Effect of Moving Target

In order to verify the effect of the proposed sports video image moving target trajectory enhancement method, three methods are used to extract the target edge of the image samples. The results are shown in Figure 5.

It can be seen from Figure 5 that the target trajectory extraction method based on Faster R-CNN and the motion trajectory tracking method based on laser information and radio frequency identification do not extract the object edges in the image completely and miss parts of the image edges. The proposed method can not only extract the edge information accurately but also effectively remove the noise in the image, showing that its edge extraction effect is good.

5.3. Comparison of Trajectory Enhancement Results of Different Methods

In order to verify the stability of the sports video image motion trajectory enhancement method designed in this paper, the target trajectory extraction method based on Faster R-CNN, the motion trajectory tracking method based on laser information and radio frequency identification, and the method designed in this paper are used, respectively, to enhance the motion trajectory of the sports video image; the results obtained by the different methods are shown in Figure 6.

As can be seen from Figure 6, the results of the target trajectory extraction method based on Faster R-CNN and the motion trajectory tracking method based on laser information and radio frequency identification deviate from the desired trajectory to different degrees: the deviation of the laser-and-RFID tracking method is smaller, while that of the Faster R-CNN extraction method is the largest. The sports video image motion trajectory enhancement method designed in this paper produces results that almost coincide with the desired trajectory. Therefore, the method designed in this paper can accurately track the desired trajectory with better accuracy.

5.4. Target Motion Tracking Error

The target trajectory extraction method based on Faster R-CNN, the motion trajectory tracking method based on laser information and RFID, and the sports video image motion trajectory enhancement method designed in this paper are used to enhance the trajectories of two moving target test points in the sports video image, and the trajectory tracking curves of moving target test points 1 and 2 are obtained, as shown in Figures 7 and 8.

According to Figure 7, during the trajectory tracking test of moving target test point 1, the deviations of the Faster R-CNN based target trajectory extraction method and the laser-and-RFID based tracking method are small, but the motion trajectory enhancement method designed in this paper is still closer to the desired trajectory. The main reason is that the method designed in this paper adopts gray level transformation, median filtering, and binarization to preprocess the sports video image, which improves the trajectory tracking accuracy.

According to Figure 8, the target trajectory extraction method based on Faster R-CNN and the motion trajectory tracking method based on laser information and radio frequency identification deviate greatly from the expected trajectory. The main reason is that the movement of target test point 2 is highly random. With the trajectory enhancement method designed in this paper, the trajectory tracking curve of moving target test point 2 is closer to the expected trajectory.

Based on the above analysis, the tracking result of the sports video image motion trajectory enhancement method designed in this paper is more reliable and stable.

5.5. Video Target Trajectory Enhancement Effects of Different Methods

In order to verify the target trajectory enhancement effect of the different methods, a football match video image is randomly selected, and its football trajectory is tracked and enhanced. The validity of each method is judged by whether the marked dots lie on the trajectory.

According to Figure 9, the proposed method can track and enhance the target trajectory of the sports video image with high precision, and the football target points lie accurately on the trajectory, whereas the target trajectory extraction method based on Faster R-CNN and the motion trajectory tracking method based on laser information and radio frequency identification deviate to different degrees.

Based on the above experimental results, the sports video image trajectory enhancement method designed in this paper can accurately track the desired trajectory, with small tracking error, good stability, and an obvious enhancement effect.

6. Conclusion

In order to solve the problems of low segmentation precision and insignificant target trajectory enhancement in traditional sports video image tracking methods, a new method based on the OTSU algorithm is proposed. Through a 3D histogram, the OTSU algorithm is optimized to overcome the influence of noise and nonuniform illumination and to improve the segmentation accuracy of moving objects. Sports video images are preprocessed through gray level transformation, median filtering, and binarization. The target motion model of the sports video image is constructed, the edge information during sports video image motion is extracted, and the target trajectory is finally calculated. Experimental results show that the proposed method is more efficient, its segmentation accuracy for sports video is higher, the target trajectory is significantly enhanced, and the trajectory tracking accuracy is better. This study has certain value and can provide a reliable theoretical basis for scholars in related fields.

Data Availability

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Conflicts of Interest

The author declares that there are no conflicts of interest regarding this work.