Abstract
Traditional methods of collecting fitness running data suffer from low data accuracy, slow collection speed, and modeling algorithms with insufficient search ability. These problems make it considerably harder to build an accurate fitness running data model. This study therefore proposes a set of methods for fitness running data collection and evolution modeling based on digital information technology. First, a data collection method built on a digital information platform, relying on multiple sensors and a median filter algorithm, filters out the impulse noise that accidental state changes of the device introduce into the three-dimensional acceleration signal during transmission. Second, the gravitational component of the signal is filtered out based on the orientation sensor of the platform. Then, building on these results, an algorithm is constructed for modeling the evolution of fitness running data. Its purpose is to expand the search range so that a fitness running data model with better performance can be obtained more effectively. The experimental results show that the method based on digital information technology designed in this study has a clear advantage over the traditional acquisition method: the constructed model is more accurate, and the efficiency of data acquisition is increased by 88%. Therefore, with the aid of digital information technology, the design of the data model can reduce the probability of mold repair and the time consumption of mold trials and reduce the design cost.
1. Introduction
With the rising tide of “national sports,” people’s awareness of their own health has gradually increased, and more and more of them have taken up daily exercise. Fitness running stands out among all sports. Because it is not limited by time or place, it has been warmly welcomed by the public. The exercise is simple, suitable for all ages, and highly flexible, and its fitness effect appears relatively quickly. Some companies saw this commercial opportunity and produced fitness running systems with the goal of making fitness running more fun and entertaining. These systems use functions built into the device, such as GPS positioning and direction indication, and output signals describing the user’s current state of movement through sensors. After a series of processing steps, users can see their step length, step frequency, speed, and other information about the run in the system. This information is usually presented as data, with statistics aggregated over a certain period, such as weeks, months, or years. Users can also save it as a picture and share it freely on any social media platform.
Although a considerable number of enterprises and researchers have achieved certain results in the data collection and modeling of fitness running, some shortcomings remain. When the model space is very large, the fitness running data modeling algorithm proposed in this study can search that space and obtain fitness running models with better performance. Digital information technology has quietly entered people’s lives. Countless products have been developed with this technology, but very little research addresses its application to fitness running. Therefore, this study integrates digital information technology into the construction of a fitness running data model, which is one of its innovations.
2. Related Work
There are a series of studies on digital information technology and data collection, and some scholars have made corresponding analyses.
Waldfogel pointed out that digitization was disrupting many copyrighted media industries, including books, music, radio, television, and film. Once the information was in digital form, it could be copied and distributed at almost zero marginal cost. This change fostered piracy in some industries, which in turn made it difficult for commercial sellers to continue bringing products to market in the traditional way that generated the same level of revenue. In general, digitalization increased the number of new products created available to consumers. Furthermore, due to the unpredictability of product quality, the growth of new products has led to a substantial increase in the quality of the best products [1].
Kang et al. studied four local rules related to the lower bound, upper bound, Khalimsky, and Marcus Wyse (respectively, referred to as L-, U-, K-, and M-) topology. They were used to digitize L-, U-, K-, and M-subspaces of Euclidean Nd spaces into digital topological spaces. While L-, U-, and K-digitizations had been shown to preserve the connectivity of objects, M-digitization had certain limitations in preserving connectivity (CP- for short). This method can be used to study applied topology and computer science [2].
Royakkers et al. discussed the social and ethical issues brought by digitization based on six mainstream technologies: IoT, robotics, biometrics, persuasive technology, virtual and augmented reality, and digital platforms. Royakkers et al. also highlighted many developments in a digital society. These developments appeared to be at odds with six recurring themes revealed by analysis of mainstream techno-scientific literature: privacy, autonomy, security, human dignity, justice, and the balance of power. This study showed that the new wave of digitalization was putting pressure on these public values [3].
Kulhav and Lepsik presented a method for 3D scanning and subsequent data reconstruction that focused on the digitization of highly structured composite parts in the creation of fully functional CAD and FEM models. The digitized parts are structured carbon composites replicated from steel templates. The purpose of their study was to evaluate appropriate methods for creating geometry and to compensate for inaccuracies in numerical simulation, especially in the meshing and convergence of the solution model. Surprisingly, they found that some local small-step changes in geometry may be easier to solve than complex and highly approximated regions [4].
Chen et al. presented a laser-based noncontact stylus for fiducial registration and digitization in surgical navigation. Adding a laser pointer with the space measuring device and using the laser beam as a means of locating the fiducials in three-dimensional space, Chen et al. developed a method for aligning the orientation of a laser beam relative to the tracking target to which it was attached. Digitization of fiducials was described as a line intersection problem, while registration was described as a point-to-line registration problem. It was found that contact-based rigid registration outperformed noncontact registration in terms of TRE under near-ideal conditions [5].
Maiti and Kayal discussed the impact of digitization on India, finding that India’s economic growth rate had risen significantly, mainly due to digitization, which automated products and processes to improve quality and yield. Digitization improved MSME performance and provided MSMEs with additional financing options. This helped reduce financial barriers and increased access to alternative financing, leading to significant improvements in MSMEs’ operating performance, profitability, and productivity. This study found that digitalization had a large impact on the inclusive growth of India’s overall economy and trade [6].
Kang studied how the government managed its skilled immigration in the information age and investigated the rationale, technology, and imagery of the government’s use of the Internet. For this research, Kang used interviews, participant observations, and literary analysis to explore London’s professionals and their links to the government. The findings suggested that these skilled migrants were governed primarily by two patterns of Internet models. Digitized diaspora associations played a key role in mediating government influence on the diaspora [7].
3. Digital Information Technology and Fitness Running Data Collection and Modeling
3.1. Digital Information Technology
3.1.1. Digitization Introduction
The current era is the information age, and the digitization of information is increasingly valued by researchers. As early as the 1940s, Shannon proved the sampling theorem, that under certain conditions, a discrete sequence can completely represent a continuous function. In essence, the sampling theorem lays an important foundation for digital technology. Digitization is the product of the continuous development of information technology and another technological revolution in the new era [8]. It refers to the technology that uses CAD/CAM as the main tool to carry out a series of operations through network technology and software and hardware facilities. CAD stands for computer-aided design and CAM stands for computer-aided manufacturing [9]. The process of enterprise transformation based on digital applications is shown in Figure 1.

In the basic process of digitization, the first step is to convert a large amount of complex and ever-changing information into numbers or data that can be measured by some tool or method. These numbers or data are then used to build a corresponding digital model, in which the information is represented as a series of binary codes. Finally, these codes are put into the computer for centralized and unified processing [10].
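The three steps above (measure, encode as binary codes, process in the computer) can be illustrated with a minimal sketch. The function names, the 8 Hz sampling rate, and the 8-bit code length are illustrative assumptions, not values taken from this study.

```python
import math

def sample(signal, duration_s, rate_hz):
    """Measure a continuous function of time at a fixed sampling rate."""
    n = int(duration_s * rate_hz)
    return [signal(i / rate_hz) for i in range(n)]

def quantize(samples, lo, hi, bits=8):
    """Map each sample in [lo, hi] to an integer code with `bits` bits."""
    levels = (1 << bits) - 1
    codes = []
    for s in samples:
        clipped = min(max(s, lo), hi)  # keep the value inside the coding range
        codes.append(round((clipped - lo) / (hi - lo) * levels))
    return codes

def to_binary(codes, bits=8):
    """Render each integer code as a fixed-width binary string."""
    return [format(c, f"0{bits}b") for c in codes]

# A 1 Hz sine wave sampled at 8 Hz, well above twice the signal frequency.
samples = sample(lambda t: math.sin(2 * math.pi * t), 1.0, 8)
codes = quantize(samples, -1.0, 1.0)
binary = to_binary(codes)
```

Sampling above twice the highest signal frequency is what the sampling theorem mentioned earlier requires for the discrete sequence to represent the continuous signal.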
3.1.2. Information Technology
With the rapid development of big data technology, information is updated more and more quickly [11]. Digital information technology is an emerging technology in modern society; it mainly refers to transmitting information into a computer and storing it by electronic means. Informatization and digitization have different meanings. The former mainly refers to the automatic processing of information, while the latter is the basis and manifestation of the former. This technology has very important practical value and has been widely used in many fields [12]. It has two main characteristics, namely, automation and digitization, of which automation is the more important. Automation refers to the automatic detection, information processing, analysis and judgment, and manipulation and control of machinery, equipment, systems, or processes to achieve expected goals according to human requirements, with little or no direct human participation. Figure 2 shows the application of information technology and digital security.

3.2. Fitness Running and Data Collection
3.2.1. Characteristics and Development of Fitness Running
With the rapid development of urbanization, the urban activity space is constantly being compressed, and the outdoor exercise environment is deteriorating day by day. Therefore, fitness running on a treadmill has become a mainstream fitness method. Fitness running is a sport that can improve a person’s willpower, endurance, and physical fitness. It is an aerobic exercise. Studies have shown that it can effectively improve the function of the human respiratory system, and as long as the exercisers can persist for a long time, they will definitely be able to achieve the purpose of physical fitness [13].
Today, this sport has adherents all over the world, and many factors influence its effect on the human body, such as age and exercise intensity. In general, exercise intensity is measured mainly by heart rate. The defining feature of fitness running is that it consumes a lot of oxygen and must be carried out rhythmically and without interruption. It can be summed up in three words, namely, “long, slow, and far.” Its specific techniques are as follows: the pace should be brisk and elastic, the forefoot should land softly on the ground, and the stride should be small [14]. Figure 3 shows the approximate pattern of fitness running.

Although its general exercise intensity is not very large, once the exercise time increases, the energy consumption of the body will increase, and the practice process will be relatively boring, so it is necessary to optimize the effect of fitness running. Table 1 shows the course duration structure of fitness running according to the physiological characteristics of people.
3.2.2. Data Collection of Fitness Running
In general, the common method for data collection of fitness running mainly includes three steps, namely, signal acquisition, double-pass filtering, and signal output. The main process is shown in Figure 4.
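The filtering step of this conventional pipeline can be sketched as follows, assuming that the double-pass filter is a first-order low-pass/high-pass pair: the low-pass output tracks the slowly varying gravity component, and subtracting it from the raw signal acts as the high-pass step. The smoothing factor `alpha` and the function name are hypothetical; the source does not specify the filter design.

```python
def split_gravity(samples, alpha=0.8):
    """Remove a slowly varying gravity estimate from 3-axis acceleration.

    samples: list of (x, y, z) tuples in m/s^2.
    alpha:   assumed smoothing factor of the low-pass filter (0 < alpha < 1).
    Returns the list of gravity-free (x, y, z) tuples.
    """
    if not samples:
        return []
    gravity = list(samples[0])  # initialise the estimate with the first sample
    linear = []
    for s in samples:
        # Low-pass step: the gravity estimate follows the signal slowly.
        gravity = [alpha * g + (1 - alpha) * a for g, a in zip(gravity, s)]
        # High-pass step: what remains is the motion-induced acceleration.
        linear.append(tuple(a - g for a, g in zip(s, gravity)))
    return linear

# A device at rest only measures gravity, so the filtered output is near zero.
at_rest = [(0.0, 0.0, 9.81)] * 5
residual = split_gravity(at_rest)
```

A filter of this kind does not suppress impulse noise, which is the limitation the median filter described later in this paper is meant to address.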

First of all, the sensor that comes with the device outputs the data signal in a three-dimensional coordinate system; this signal is an acceleration signal that includes a certain gravitational component. The coordinate system is defined relative to the screen of the device as follows: the center of the screen is the origin; the axis parallel to the width of the screen is the positive x-axis; the axis parallel to the length of the screen is the positive y-axis; and the axis perpendicular to the screen, pointing out of its front, is the positive z-axis [15]. Next, the three-dimensional acceleration signal containing the gravitational component is passed to the double-pass filter, which screens out the gravitational component. Then, the Kalman filter is used to transform the acceleration into velocity, and finally the output of the data signal is obtained [16]. The HLF shown in Figure 4 plays a very critical role, and it can effectively enhance the accuracy of the data signal. On the basis of the parametric model proposed by some scholars, the algorithm proposed in this study can be used to search for the optimal solution of the fitness running data model. The definition formula of the final constructed model is
Formula (1) describes the relationship between the heart rate, exercise intensity, and running speed of fitness runners. This relationship has nonlinear characteristics. , , and , respectively, represent the increase in heart rate and exercise intensity when a person increases from 0 to time r in an extremely stable state, and the average acceleration at time r. , , , , and are the main five real-number parameters. These five parameters need to be determined according to the specific situation, and generally do not take negative numbers. According to these two formulas, we can get the statistics of the heart rate and speed of fitness runners in a specific period of time. Therefore, the algorithm is used according to the criterion of minimizing the squared deviation, which is based on the increase in heart rate, so as to obtain the values of the last five parameters to be estimated [17].
3.3. Fitness Running Data Collection Algorithm
In the era of big data, obtaining information first requires collecting and processing data. Data processing refers to treating the collected data by technical means such as streamlining, noise reduction, and splicing, so as to minimize the amount of data and improve the efficiency of subsequent work without affecting the accuracy of the data. However, much of the data generated in real life and online is not directly visible. These data are often disguised or encrypted to prevent them from being misused. Therefore, in order to make life more convenient and realize intelligence and automation, it is necessary to collect a large amount of data and then analyze the underlying logic behind it to obtain its most essential characteristics [18].
3.3.1. MMFA
MMFA stands for multisensor and median filter for data acquisition in fitness running, that is, a data acquisition method based on multiple sensors and median filtering. The overall process of this method is introduced next. As the name suggests, this method mainly uses sensors that indicate direction. These sensors are used to obtain the signals of the state changes of the athlete’s device during fitness running, and their output signal is used to filter out the gravitational component in the three-dimensional acceleration sensor signal of the device. The three-dimensional acceleration signal is then denoised and smoothed in each direction using the median filter [19]. The specific flow of the method is shown in Figure 5. The figure shows two steps, namely, median filtering and gravitational acceleration filtering, which are the main difference between MMFA and HLF.

3.3.2. Median Filter
The median filter is widely used in the field of image processing, and it can also be used for one-dimensional signals. Its basic principle is that if a signal varies smoothly, the output value at a point can be replaced by the statistical median of all values within a neighborhood of a certain size around that point. In the device, the median filter is mainly used to reduce the impulse noise that accidental state changes of the equipment introduce into the transmitted signal, thereby providing a comfortable exercise environment for exercisers. It should be noted that during fitness running, the equipment is mainly fixed on the waist of the athlete [20].
These two formulas give the acceleration of the transmitted signal at time r after passing through the median filter. In formula (4), k and , respectively, define the size of the median filter window and the set of acceleration signal samples in the window. Med takes the middle value of the set of samples: the acceleration signal samples in the set are sorted from small to large, and the middle value, that is, the value of the (n + 1)th sample, is taken (this is the middle sample when the window holds 2n + 1 samples).
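A minimal sketch of this operation on one acceleration axis is given below. It assumes an odd window of k = 2n + 1 samples centred on each point and pads the ends of the signal by repeating the edge values; the padding rule and the function name are assumptions, since the source does not state how boundaries are handled.

```python
def median_filter(signal, k=5):
    """Apply a sliding median of odd window size k to a 1-D signal."""
    if k < 1 or k % 2 == 0:
        raise ValueError("window size k must be a positive odd number")
    if not signal:
        return []
    n = k // 2
    # Repeat the edge values so that every point has a full window.
    padded = [signal[0]] * n + list(signal) + [signal[-1]] * n
    filtered = []
    for r in range(len(signal)):
        window = sorted(padded[r:r + k])  # sort from small to large
        filtered.append(window[n])        # take the (n + 1)th sample
    return filtered

# A single impulse in an otherwise constant signal is removed.
noisy = [1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0]
clean = median_filter(noisy, k=3)
```

Because the median ignores isolated outliers, a spike shorter than n + 1 samples does not reach the output, whereas a smoothly varying signal passes through almost unchanged.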
3.3.3. Gravitational Acceleration Filtering Combined with Direction Signal
The signal value output by the three-dimensional acceleration sensor of a device is expressed in the three-dimensional coordinate system mentioned above, and it also includes the gravity component. During fitness running, state changes of the equipment cause small deflections of this coordinate system, which in turn change the influence of gravity on the output signal of the three-dimensional acceleration sensor [21]. The direction sensor that comes with the device uses the relative position of the device coordinate system and the inertial coordinate system to output a signal whenever the state of the device coordinate system changes [22]. Figure 6 shows a schematic diagram of the coordinate systems used to describe the motion state of the device.

The latter two are the device coordinate system and the inertial coordinate system, respectively. The origins of the two coordinate systems coincide. In the inertial coordinate system, the pitch angle coordinate axis is tangent to the ground and points due west; the azimuth (navigation) angle coordinate axis is perpendicular to the ground; and the roll angle coordinate axis is also tangent to the ground and points due north. In a simple 3D system, Euler angles are generally used to represent orientation, and they are divided into three types: pitch, azimuth, and roll [23].
As can be seen from the figure, the directions of the first coordinate system and the third coordinate system are parallel. The first coordinate system is called the world coordinate system and it is absolute. It mainly describes the position information of each point on the earth according to the two parameters of latitude and longitude and absolute elevation (that is, the distance from a point along the direction of the plumb line to the absolute horizontal plane). It is generally regarded as a constant on the azimuth coordinate axis, in order to reflect the influence of the gravitational acceleration G on the acceleration signal in the device coordinate system [24].
According to the principle of space geometry, it can be known that the state change of the device will cause the coordinate system to change. Therefore, the state changes of the side angle, pitch angle, and roll angle in the equipment coordinate system can be described by the inertial coordinate system. The matrix formula of this change is
These three formulas mainly illustrate the following: the direction sensor that comes with the device connects the two tools, the acceleration sensor and the geomagnetic sensor, so as to transmit the direction signals , , and to the outside world. According to the direction angle signal, the gravitational acceleration G in the inertial coordinate system can be transmitted to the device coordinate system through a series of operations. Thus, the component signal of the gravitational acceleration G on the x-axis, the component signal on the y-axis, and the component signal on the z-axis can be obtained, so the coordinate transformation formula can be obtained:
Finally, the gravity component obtained in this way is filtered out of the signal output by the three-dimensional acceleration sensor, so that the acceleration of the athlete can be calculated.
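The idea of projecting gravity into the device coordinate system and subtracting it can be sketched as follows. The rotation order (azimuth about z, then pitch about x, then roll about y), the sign conventions, and the inertial gravity reading of (0, 0, G) are assumptions chosen for illustration; the transformation matrices of this study are not reproduced here, and all function names are hypothetical.

```python
import math

G = 9.81  # assumed magnitude of gravitational acceleration in m/s^2

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def gravity_in_device(azimuth, pitch, roll):
    """Express the inertial gravity reading (0, 0, G) in device coordinates."""
    # Assumed device-to-inertial rotation built from the three angles (radians).
    r = matmul(matmul(rot_z(azimuth), rot_x(pitch)), rot_y(roll))
    g_inertial = (0.0, 0.0, G)
    # The transpose of r maps inertial coordinates to device coordinates.
    return tuple(sum(r[j][i] * g_inertial[j] for j in range(3)) for i in range(3))

def remove_gravity(acc, azimuth, pitch, roll):
    """Subtract the projected gravity from a measured (x, y, z) acceleration."""
    g = gravity_in_device(azimuth, pitch, roll)
    return tuple(a - gi for a, gi in zip(acc, g))

# A device lying flat and at rest measures only gravity on its z-axis.
flat = remove_gravity((0.0, 0.0, G), 0.0, 0.0, 0.0)
```

Under these conventions the azimuth angle does not change the projected gravity, because rotating about the vertical axis leaves a vertical vector unchanged; only pitch and roll redistribute gravity across the device axes.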
4. Comparison of Fitness Running Data Sets
This section verifies the effectiveness of the fitness running data collection and evolutionary modeling algorithms proposed in this study. We first give a brief introduction to the data set of this experiment, then explain the questions and measurement parameters that need to be verified, and finally describe the statistical methods applied to obtain the final results. A dataset is a collection of data, usually in tabular form, where each column represents a specific variable and each row corresponds to one member of the dataset. We use a common fitness running guidance system on the market, referred to as i Fitness, which is mainly based on Android smartphones and heart rate belts. During the athlete’s fitness run, the data acquisition module of i Fitness stores a series of data on the cloud server every minute. This cloud storage runs in real time. These data mainly include three-dimensional acceleration data collected from smartphones, data from orientation sensors, and heart rate data collected from heart rate belts.
Next, in order to verify the algorithm of this study, we selected ten volunteers from the college student population and asked them to perform fitness running. The specific conditions of these ten volunteers, five males and five females, are shown in Table 2. To protect personal privacy, we use P1∼P10 in place of their real names. Each volunteer wore the heart rate belt provided for the experiment and carried an Android phone with the i Fitness system installed. A professional treadmill was used to obtain reference data from the volunteers during the fitness running. After the volunteers had been trained in the use of the equipment, they started the experiment, which lasted for 30 minutes and was carried out in a sports experiment center. After the 30-minute experiment, the three-dimensional acceleration, orientation, and heart rate data of the volunteers stored in the cloud were extracted and compared with the data displayed on the treadmill to form the corresponding data set of the ten volunteers.
In the process of fitness running, the average absolute error of each volunteer’s running speed of is used as a measure of data accuracy. Its definition formula is
Here, M is the number of cycles observed in the experiment, refers to the actual speed of the volunteer in the ith minute displayed on the treadmill, and refers to the speed value of the i-th minute calculated by the MMFA algorithm proposed in this study for the i-th row of data. This formula shows that the smaller is, the better the experimental effect of the algorithm proposed in this study is.
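As a minimal illustration of this measure, the sketch below computes the mean absolute error between per-minute reference values and estimated values. The function name and the sample numbers are made up for the example and are not data from this experiment.

```python
def mean_absolute_error(reference, estimate):
    """Average of |reference_i - estimate_i| over M paired observations."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("both sequences must be non-empty and of equal length")
    total = sum(abs(r - e) for r, e in zip(reference, estimate))
    return total / len(reference)

# Hypothetical per-minute speeds (km/h): treadmill display vs. estimated values.
treadmill_speed = [8.0, 8.5, 9.0, 9.0]
estimated_speed = [7.5, 8.5, 9.5, 9.0]
speed_error = mean_absolute_error(treadmill_speed, estimated_speed)
```

The same function applies unchanged to heart rate values when the accuracy of the exercise model is evaluated.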
In order to ensure the fairness of this experiment, the speed and heart rate data displayed on the treadmill are used as the data basis for constructing the exercise model of fitness running. Based on the data set obtained above, we can run the algorithm on the cloud server to construct a fitness running exercise model for each volunteer. In this way, when a subject’s speed in the i-th minute is known, the model can be used to estimate the subject’s heart rate. In the process of fitness running, the average absolute error of each volunteer’s heart rate is used as a measure of the accuracy of the model. Its definition formula is
From the formula, people can see that the smaller is, the better the model based on this algorithm is. In the formula, M is the period of experimental observation, refers to the heart rate value measured at the ith minute, refers to the heart rate value in the ith minute estimated according to the constructed model.
Based on the above formula, and according to the optimality condition, the above data are normalized and we can obtain:
Assuming that there is a certain classification set that can make the two sample data sets be correctly classified, this problem is solved by searching for the set that maximizes. The expression of the classification set is
In the previous formula, refers to the weight vector and a refers to the deviation of the measured value from the actual value.
To find the optimal classification set, it is necessary to satisfy the condition that the modulus of the normal vector q of the classification set is the smallest, so the problem becomes the search space of the optimal model:
For the optimal solution of the above formula, the optimization method can be used to convert the above problem into a dual problem, and the factor matching the single sample data can be substituted into the above formula, we can obtain
To find the minimum value of q and a, we need to differentiate them once, so that their result is 0, so we can obtain
In this case, the problem of solving the optimal model search space is transformed into a quadratic programming problem of solving the dual, namely,
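For orientation, the steps described above match the standard hard-margin maximum-margin derivation, which is sketched here with the symbols q (weight vector) and a (offset) used in the text. The samples x_i, the class labels y_i in {-1, +1}, the multipliers alpha_i, and the sample count N are assumed notation, and the exact form used in this study may differ.

```latex
% Classification set (separating hyperplane)
q \cdot x + a = 0

% Search for the classification set with the smallest norm of q
\min_{q,\,a}\ \tfrac{1}{2}\lVert q \rVert^{2}
\quad \text{s.t.} \quad y_i\,(q \cdot x_i + a) \ge 1, \qquad i = 1,\dots,N

% Lagrangian with one multiplier \alpha_i \ge 0 per sample
L(q, a, \alpha) = \tfrac{1}{2}\lVert q \rVert^{2}
  - \sum_{i=1}^{N} \alpha_i \bigl[\, y_i\,(q \cdot x_i + a) - 1 \,\bigr]

% Setting the first derivatives with respect to q and a to zero
\frac{\partial L}{\partial q} = 0 \;\Rightarrow\; q = \sum_{i=1}^{N} \alpha_i y_i x_i,
\qquad
\frac{\partial L}{\partial a} = 0 \;\Rightarrow\; \sum_{i=1}^{N} \alpha_i y_i = 0

% Resulting dual quadratic programming problem
\max_{\alpha}\ \sum_{i=1}^{N} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j\, y_i y_j\, (x_i \cdot x_j)
\quad \text{s.t.} \quad \alpha_i \ge 0, \quad \sum_{i=1}^{N} \alpha_i y_i = 0
```

Substituting the two stationarity conditions back into the Lagrangian eliminates q and a, which is why the dual depends only on the multipliers and on inner products between samples.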
We introduce the concept of mean absolute error to reflect the size of the error between the actual situation and the prediction. It is the average of the absolute differences between the individual observed values and the corresponding predicted values. In the process of fitness running, the mean absolute error of the speed of the ten volunteers is shown in Figure 7. The figure shows the results of the two algorithms, MMFA and HLF, respectively. From this figure, we can see that the MMFA algorithm performs clearly better than HLF. The difference between the two mean absolute errors reached a maximum of 0.57. This result suggests that MMFA takes into account the interference of noise on the signal whereas the HLF algorithm ignores it, so MMFA has an obvious advantage.

Select a certain number of normal samples from the data set and verify them according to different ratios, namely, 1 : 1, 3 : 1, 5 : 1 and 10 : 1. In this way, the effectiveness of the algorithm proposed in this study is detected, and the average value of the obtained results is calculated as the final evaluation sample, so as to avoid the interference of accidental factors to the algorithm. In addition, we introduced the concepts of F-value and G-value in our experiments, which are used to measure the classification effect of the sample and evaluate the overall classification effect of the dataset, respectively. The test results are shown in Figure 8. From the figures, we can see that after the optimization of the MMFA algorithm, the effect is very obvious, and the F value is up to 0.76. Therefore, it can be concluded that the MMFA algorithm has great practical significance.
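The two measures can be sketched as follows, assuming that the F-value is the usual F-measure (harmonic mean of precision and recall) and the G-value is the geometric mean of sensitivity and specificity, as is common for unbalanced samples. The source does not give their definitions, so both formulas and the example counts are assumptions.

```python
import math

def f_value(tp, fp, fn):
    """F-measure: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def g_value(tp, fp, tn, fn):
    """G-mean: geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)

# Hypothetical confusion counts for a 10 : 1 sample ratio (made-up numbers).
tp, fp, tn, fn = 8, 2, 88, 2
f_score = f_value(tp, fp, fn)
g_score = g_value(tp, fp, tn, fn)
```

The F-measure focuses on the minority class, while the G-mean drops sharply if either class is classified poorly, which is why the two are often reported together for unbalanced data.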

Given the randomness of the MMFA algorithm, we added another 20 experiments to each dataset. However, for the HLF algorithm with strong deterministic characteristics, only one experiment is added, and the experimental results are shown in Figure 9. From the figures, we can see the situation of the mean absolute error of the volunteers’ heart rate, including the mean, best, worst, and median of the statistical data. According to this figure, we can conclude that except for the worst value of volunteer P1, in other cases, the effect of the MMFA algorithm is better than that of the HLF algorithm, and the average difference is at most 2.5.

Based on these results, we can compute the rank sum test for the two algorithms, that is, compare the two samples; the results are shown in Table 3. From the table, we can see that the p-value for every volunteer is less than 0.05 and that the effect size is 1. This shows that the MMFA algorithm performs much better than the HLF algorithm, and it also reflects that the model search space of the MMFA algorithm is larger and more accurate than that of the HLF algorithm.
Under the model constructed in this study, the number of experimental subjects is increased for both algorithms and the data collection time is recorded; the results are shown in Table 4. According to the table, when there are few subjects there is little difference between the two, but as the number of subjects gradually increases, the MMFA algorithm performs significantly better, and the efficiency is increased by 88%.
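A rank sum comparison of two samples can be sketched as below. The sketch assumes a two-sided Wilcoxon rank-sum (Mann-Whitney) test with a normal approximation and no tie correction, and it assumes that the effect size is a Vargha-Delaney style probability that a value from the first sample is smaller than one from the second; the source names neither choice, and the error values are made up.

```python
import math

def prob_smaller(x, y):
    """Share of pairs (xi, yj) with xi < yj, counting ties as one half."""
    wins = sum(1.0 if xi < yj else 0.5 if xi == yj else 0.0
               for xi in x for yj in y)
    return wins / (len(x) * len(y))

def rank_sum_p_value(x, y):
    """Two-sided p-value of the rank sum test (normal approximation)."""
    m, n = len(x), len(y)
    u = prob_smaller(x, y) * m * n           # Mann-Whitney U statistic
    mean_u = m * n / 2
    sd_u = math.sqrt(m * n * (m + n + 1) / 12)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

# Hypothetical heart rate errors of two methods over five runs each.
errors_a = [1.0, 2.0, 3.0, 4.0, 5.0]
errors_b = [6.0, 7.0, 8.0, 9.0, 10.0]
effect = prob_smaller(errors_a, errors_b)
p_value = rank_sum_p_value(errors_a, errors_b)
```

With this definition an effect size of 1 means that every error of the first method is smaller than every error of the second, which is one way to read an effect size of 1 together with a p-value below 0.05.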
Finally, to verify the effectiveness of the MMFA algorithm, we randomly selected some students and divided them into an experimental group and a control group. The experimental results are shown in Figure 10. Because of the physical differences between males and females, we use two standard running events: 1000 meters for boys and 800 meters for girls.

As can be seen from the graph, before the start of the experiment, the experimental group performed about the same as the control group. After the training of the model and exercise program constructed by the MMFA algorithm, the performance of the experimental group significantly surpassed that of the control group, with a great improvement, and the maximum performance improvement reached 11.1 s. Therefore, this shows that the algorithm has a good effect on the modeling of fitness running.
Through the above series of experiments, we can know that by capturing the frequency of data, people can clearly know the basic characteristics and fluctuations of the data. The acquisition based on this data can restore the waveform of the data to the greatest extent and ensure the integrity of the data. The methods of data collection can be continuously evolved and upgraded, and finally, a method that keeps pace with the times is obtained. As the environment changes and data objects change, the way of data collection also needs to be upgraded to maintain its unique ability to perceive data.
5. Discussion
Through the analysis of fitness running data collection, relying on digital information technology, this study analyzes the method of fitness running data collection, namely, the MMFA algorithm, according to the needs of different groups of people, and optimizes the existing problems accordingly. This study analyzes the shortcomings of traditional data collection methods and builds a fitness running data evolution model, which points out the direction for subsequent research. At the same time, the concept introduction and experiments are used to optimize the parameters, which effectively ensures the overall detection performance of the algorithm in unbalanced samples. This not only solves the problem that performance is obviously affected by parameters but also ensures the operating efficiency of the model. The results show that the exercise program generated by the fitness running data model designed in this study has a good effect, which can keep the heart rate of the athlete in a safe and stable fluctuation during exercise. And most importantly, the exercise regimen can approximate the maximum intensity of aerobic exercise.
6. Conclusions
This research proposes a fitness running data collection algorithm based on digital information technology. The experimental results show that it is feasible to build a digital fitness running data model, which meets the needs of the public for sports and fitness. Moreover, the exercise scheme generated by the model is reasonably scientific, safe, and efficient, so the construction of such a model is worth considering. However, in the experiments, some data deviated slightly from the actual situation, and this study does not analyze these deviations in depth. Therefore, in the future, the scope of the experimental subjects can be expanded, and subjects of different ages can be added, so that more effective data can be collected to test the stability of the modeling method in this study. The method can provide a reference and effective support for customizing personalized fitness running exercise plans and for designing a more complete monitoring system for the fitness running training process.
Data Availability
No data were used to support this study.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.