Abstract

To establish an arrival time prediction model suited to pure electric buses, this paper takes the pure electric bus as its research object. Based on an analysis of the factors that influence the arrival time of pure electric buses, a BP neural network arrival time prediction model optimized by the firefly algorithm (FA-BP prediction model) is established, with vehicle type, SOC value, battery age, and time period selected as inputs. The model is trained and tested using bus operation data. The root mean square error is 0.351 for the Kalman filter model, 0.059 for the BP neural network model, and 0.04 for the FA-BP prediction model. The results show that the proposed model effectively improves prediction accuracy and has good reliability and feasibility. It can provide a theoretical reference for pure electric bus operators and managers and a basis for improving bus reliability.

1. Introduction

In recent years, the prevention of motor vehicle pollution has become increasingly urgent. As new energy vehicles, pure electric vehicles can effectively reduce air pollution and fossil energy consumption, and they have been widely promoted. Their wide application in public transportation plays a demonstration and leading role and can effectively promote the development of new energy vehicles. At present, most bus travel time prediction studies address fuel buses, natural gas buses, and hybrid buses, whose operating characteristics are quite different from those of pure electric buses. Existing bus arrival time prediction models and reliability models therefore cannot be applied well to pure electric buses, and it is necessary to establish an arrival time prediction model and a reliability analysis model suited to them.

In terms of bus arrival time prediction, a model generally needs both good prediction accuracy and fast computation to provide passengers with real-time and reliable arrival times. During operation, the pure electric bus is affected by weather, road conditions, and other complex factors, which makes accurate travel time prediction difficult. Scholars at home and abroad have invested considerable effort in finding more accurate and faster forecasting methods. Current methods for predicting bus travel time include the historical data prediction method [1], the statistical regression prediction model [2], the time series method [3], the Kalman filter model [4], the artificial neural network model [5], the support vector machine model [6], the probability-based prediction model [7], and the particle filter-based prediction model [8].

Kumar et al. [9] proposed a generic data-driven approach for bus arrival time prediction that first learns both the spatial and temporal correlations in the historical data using supervised learning in a general nonlinear and nonstationary fashion and then poses the prediction problem as a probabilistic inference problem under a nonlinear dynamical system model. Rahman et al. [10] developed a methodology to analyze the bus travel time distribution systematically based on different pseudo horizons, which takes into account the uncertainty of future bus arrival times given that early and late buses have their respective ramifications. Huang [11] studied the influence of static and dynamic factors on the arrival time of the bus and established a time series prediction model based on heterogeneous information. The model uses a recurrent neural network to learn the long-term dependence of traffic and predict bus arrival time. Serin et al. [12] treated the travel time of buses between two consecutive stops as a time series and, by combining the average method, the Holt-Winters method, and the sum deep residuals method, proposed a novel three-layer architecture to predict bus travel time between two stops. Achar et al. [13] used historical data to learn (a) the nonstationary (linear) spatial dependencies between travel times of adjacent sections based on the above-computed order and (b) the temporal dependency between successive trips as a function of the time difference between the trips, and used the Kalman filtering algorithm to solve the prediction model.

Lai et al. [14] proposed a wavelet neural network (WNN) model in which an improved particle swarm optimization algorithm (IPSO) replaces the gradient descent method. The proposed IPSO-WNN model overcomes the limitations of the gradient-based WNN, which can easily fall into local optimum solutions and stop training, and thus improves the prediction accuracy. Liu et al. [15] addressed the problems of long-range dependence in bus arrival and road incidents by combining the advantages of the historical data prediction method and the real-time speed data prediction method, and proposed a comprehensive prediction model based on long short-term memory and artificial neural networks with spatial-temporal feature vectors. Yang et al. [16] divided the dwell time into linear and nonlinear parts and adopted the autoregressive integrated moving average (ARIMA) model and the support vector machine (SVM) to predict these two parts, thereby establishing a hybrid dwell time prediction method for BRT. Yu et al. [17] presented a hybrid model to predict bus arrival times based on a support vector machine and the Kalman filtering technique. Their results show that the hybrid model generally performs better than artificial neural network-based methods. Yu et al. [18] proposed a random forest based on near neighbor (RFNN) model to predict bus travel time.

Scholars at home and abroad have conducted in-depth research on arrival time prediction models for fuel buses, but the existing research still has shortcomings. Existing bus arrival time prediction methods do not take into account the influencing factors specific to pure electric buses, and their prediction accuracy needs to be improved.

2. Analysis of Influencing Factors on Operation Time Reliability of Pure Electric Bus

The factors that affect the operating time of pure electric buses are complex and diverse. In addition to relatively fixed factors such as driver characteristics and vehicle characteristics, random factors such as date, time period, and weather can also have an important impact on bus operating time. Passengers are usually more concerned about the reliability and stability of bus operating time under random factors [19]. Sun [20] used a difference test to analyze the impact of date, time period, and weather on bus operation time. The results showed that these random factors have a significant impact on bus operation time and should be considered in a bus arrival time prediction model. Pure electric buses draw their operating power from electrical energy stored in batteries, so battery performance determines whether they can operate smoothly and reliably. He [21] found that the battery performance of pure electric buses is mainly affected by battery capacity, battery age, and battery state of charge (SOC). Therefore, this paper selects four factors, namely vehicle type, SOC value, battery age, and time period, and studies their impact on the operating time of pure electric buses.

2.1. Vehicle Type

Because vehicle types differ, parameters such as vehicle mass and motor power also differ, resulting in large differences in the acceleration and deceleration performance of buses. This paper surveys three types of pure electric buses at bus stations with similar traffic volumes during the off-peak period; the specific parameters are shown in Table 1.

Table 2 and Figure 1 show that the deceleration times of the three types of pure electric bus do not differ greatly. The standard deviations of the deceleration times of type A (10.7 m) and type C (7.0 m) are 1.31 s and 1.24 s, respectively, which are the maximum and minimum values among the three types. This shows that the deceleration times of type C (7.0 m) are relatively concentrated, while those of type A (10.7 m) are relatively dispersed. In addition, the average deceleration time of type C (7.0 m) is 6.38 s, while those of type B (8.0 m) and type A (10.7 m) are 6.61 s and 6.99 s, respectively. Type C (7.0 m) therefore has the best deceleration performance, followed by type B (8.0 m), and type A (10.7 m) is the poorest. The acceleration times show a similar pattern: the standard deviation and mean of the acceleration time are smallest for type C (7.0 m) and largest for type A (10.7 m), so the acceleration times of type C are relatively concentrated and its acceleration performance is the best. On the whole, type C (7.0 m) has the best acceleration and deceleration performance, followed by type B (8.0 m), and type A (10.7 m) is the poorest.

The quantitative analysis of the relationship between the acceleration and deceleration times of pure electric buses and vehicle type shows that differences in acceleration and deceleration performance among vehicle types lead to large differences in bus operation time. The smaller the vehicle mass and the greater the motor power, the better the acceleration and deceleration performance. Therefore, the vehicle type should be taken as an input variable of the pure electric bus arrival time prediction model.

2.2. SOC Value

This paper uses the operation time of pure electric buses on the Geyu Village-Youxizhou Bridgehead section of Route 321 in Fuzhou City, from 14:00 to 16:00 during the off-peak period, to analyze the influence of the SOC value on operation time. Figure 2 shows how the bus operating time changes with the remaining charge. In general, when the SOC value is between 0.4 and 1, the higher the charge, the shorter the running time, but the differences in running time are only about 20–30 s and are not obvious. When the SOC value drops below 0.4, however, the operating time increases significantly and fluctuates greatly, and the battery is in a more unstable state. It is therefore judged that the SOC value of a pure electric bus affects its arrival time.

To analyze whether the SOC value of pure electric buses has a significant impact on bus running time, analysis of variance (ANOVA) was applied to random samples. Normality and homogeneity of variance are the two prerequisites of ANOVA, so 30 samples of the operation time on the Geyu Village-Youxizhou Bridgehead section were randomly selected for the SOC ranges 0∼0.4, 0.4∼0.6, and 0.6∼1.0 and tested. Because the three groups contain relatively little data, the K-S normality test results are taken as the main reference. Table 3 shows the asymptotic significance of the three groups of data; because $P > 0.05$ for each group, the operation times of the three groups of pure electric buses are considered to be normally distributed.

The test for homogeneity of variance, also known as the Levene test, checks whether the variances of different groups are equal. If $P > 0.05$, the null hypothesis that the variances are homogeneous is accepted; otherwise, it is rejected. Table 4 shows the results of the homogeneity of variance test: because $P > 0.05$, the above three groups of data have homogeneity of variance.

In Table 5, analysis of variance is performed on the three groups of data. Because $P < 0.05$, the SOC value of a pure electric bus has a significant effect on its arrival time, and the SOC value should be taken as an input variable of the pure electric bus arrival time prediction model.
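The variance analysis step can be illustrated with a minimal sketch that computes the one-way ANOVA F statistic for several groups of operation times. The function name `one_way_anova_f` and the sample values are hypothetical and are not the data of this paper; obtaining the P value from the F statistic additionally requires the F distribution, which is left to a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical operation times (s) for three SOC ranges; not the paper's data.
low_soc = [231.0, 244.0, 252.0, 239.0, 260.0]
mid_soc = [205.0, 211.0, 198.0, 208.0, 203.0]
high_soc = [196.0, 201.0, 193.0, 199.0, 204.0]

f_stat = one_way_anova_f([low_soc, mid_soc, high_soc])
print(f"F = {f_stat:.2f}")
```

A large F statistic means that the differences between group means are large relative to the scatter within each group, which is the situation in which the P value falls below the significance level.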

2.3. Battery Age

Hu [22] carried out experiments on the battery under different conditions, combined with different mathematical algorithms to simulate and analyze the battery characteristics and extract the characteristic parameters.

Table 6 shows the management information of the Route 321 pure electric bus, recording the license plate number, length, battery age, and other information.

The operation times of pure electric buses with different battery ages on the Geyu Village-Youxizhou Bridgehead section of Route 321 were collected from 14:00 to 16:00 during the off-peak period. Figure 3 shows that, on the whole, the shorter the battery age, the shorter the operation time.

The bus arrival time corresponding to 3 years of battery age is longer than that of 0.5 years and 1.5 years of battery age, and the arrival time of buses with a battery age of 0.5 years is the shortest, so the battery age of pure electric buses is thought to affect the arrival time of the bus.

The K-S test is used to test the normality of the above three groups of data, and the results are shown in Table 7. Because the asymptotic significance of each of the three groups satisfies $P > 0.05$, the operation times of the three groups of pure electric buses are considered to be normally distributed. Table 8 shows the results of the homogeneity of variance test, which indicate that the three groups of operation times corresponding to different battery ages have homogeneity of variance.

In Table 9, analysis of variance is performed on the three groups of data. Because $P < 0.05$, it is concluded that battery age affects the arrival time of pure electric buses, and battery age should be taken as an input variable of the pure electric bus arrival time prediction model.

2.4. Time Period

The impact of the time period on the travel time of pure electric buses is mainly reflected in two aspects: the density of road traffic flow and the number of passengers boarding and alighting. When the traffic density is large, the logarithmic model proposed by Greenberg can describe the relationship between speed and density:

$$v = v_m \ln \frac{k_j}{k}$$

where $v_m$ is the speed at maximum traffic volume, $k$ is the density, and $k_j$ is the jam density. As shown in Figure 4, the speed of traffic flow decreases as the vehicle density increases over time, and when the vehicle density reaches 200 veh/km, the speed of traffic flow is 0.
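As a minimal sketch, the Greenberg relation in its usual form, speed equal to the speed at maximum flow multiplied by the logarithm of jam density over density, can be evaluated as follows. The jam density of 200 veh/km is the value at which the speed reaches 0 in the text; the speed at maximum flow of 30 km/h and the function name `greenberg_speed` are assumptions made only for illustration.

```python
import math


def greenberg_speed(density, jam_density, speed_at_max_flow):
    """Greenberg logarithmic speed-density model: v = v_m * ln(k_j / k)."""
    if not 0 < density <= jam_density:
        raise ValueError("density must be in (0, jam_density]")
    return speed_at_max_flow * math.log(jam_density / density)


JAM_DENSITY = 200.0  # veh/km, density at which the speed drops to 0 in the text
V_M = 30.0           # km/h, hypothetical speed at maximum flow (not from the paper)

for k in (50.0, 100.0, 200.0):
    print(k, round(greenberg_speed(k, JAM_DENSITY, V_M), 2))
```

The model is only meaningful at higher densities, since the logarithm grows without bound as the density approaches 0.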

During peak hours, urban residents have a high demand for commuting to work and school, the density of road traffic flow is high, and social vehicles interfere heavily with pure electric buses, resulting in low bus speeds and long travel times. At the same time, the number of passengers at bus stops is large and boarding and alighting take longer, which is another reason for the long travel times at peak hours. In the off-peak period, by contrast, the road traffic flow density is relatively low, the road is relatively free-flowing, and the delay caused by boarding and alighting is relatively short. Therefore, the travel time of pure electric buses is relatively short in the off-peak period.

Figure 5 shows the change in operation time between Geyu Village and Youxizhou Bridgehead Station on August 13, 2018, for Bus No. 321 in the direction of Yuandonglijing. The results show that the longest operation times occur during the morning and evening peaks, 7:00–9:00 and 17:00–19:00.

The K-S test is used to test the normality of the above two groups of data, and Table 10 shows the results. The $P$ values of the off-peak period and peak period are 0.057 and 0.148, respectively, both greater than 0.05, indicating that the operation times of the off-peak period and peak period obey the normal distribution.

Table 11 shows the results of the homogeneity of variance test. Because $P > 0.05$, the operation times of pure electric buses during peak and off-peak periods have homogeneity of variance.

Analysis of variance is performed on the two groups of data in Table 12. Because $P < 0.05$, it is concluded that the time period affects the arrival time of the pure electric bus, and the time period should be taken as an input variable of the pure electric bus arrival time prediction model.

3. Pure Electric Bus Arrival Time Prediction Model

3.1. BP Neural Networks

The BP neural network is a mature and widely used prediction method trained by error backpropagation. It consists of three layers: an input layer, a hidden layer, and an output layer. There are generally no connections between neurons within the same layer, and the neurons in each layer are fully connected only with neurons in adjacent layers. A BP neural network passes input signals layer by layer to the output layer. When the output value does not meet the accuracy requirement, the weights and thresholds of the network are adjusted by back-propagating the error signal, so that the predicted value gradually approaches the expected value. The BP neural network structure is shown in Figure 6.

In Figure 6, the output of each neuron in the input layer of the BP neural network (formula (2)) is the conversion function of the input layer applied to the corresponding input value.

The output of each neuron in the hidden layer (formula (3)) is the activation function applied to the weighted sum of the input-layer outputs, offset by the hidden layer threshold. The weights are the connection weights between the neurons in the input layer and the neurons in the hidden layer.

The output of each neuron in the output layer (formula (4)) is the activation function applied to the weighted sum of the hidden-layer outputs, offset by the output layer threshold. The weights are the connection weights between the neurons in the hidden layer and the neurons in the output layer.
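The layer-by-layer computation described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this paper: the input layer is assumed to pass values through unchanged, the threshold is assumed to be subtracted from the weighted sum, the S-type function is used as the activation, and all weights and names (`layer_output`, `forward`) are hypothetical.

```python
import math


def sigmoid(z):
    """S-type activation function with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(inputs, weights, thresholds):
    """Outputs of one fully connected layer.

    weights[j][i] connects neuron i of the previous layer to neuron j of
    this layer. Assumption: the threshold is subtracted from the weighted
    sum; some formulations add a bias term instead.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) - theta)
        for row, theta in zip(weights, thresholds)
    ]


def forward(x, w_hidden, theta_hidden, w_output, theta_output):
    """Propagate an input vector through the hidden and output layers."""
    # Assumption: the input layer applies an identity conversion function.
    hidden = layer_output(x, w_hidden, theta_hidden)
    return layer_output(hidden, w_output, theta_output)


# Hypothetical 2-input, 2-hidden, 1-output network; values are illustrative only.
x = [0.5, 0.2]
w_hidden = [[0.1, -0.3], [0.4, 0.2]]
theta_hidden = [0.0, 0.1]
w_output = [[0.7, -0.5]]
theta_output = [0.05]
print(forward(x, w_hidden, theta_hidden, w_output, theta_output))
```

Training then consists of adjusting the weights and thresholds so that the output of `forward` approaches the expected value.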

3.2. FA-BP Prediction Model

Because the travel time of the pure electric bus is nonlinear and the BP neural network has strong nonlinear mapping and self-learning abilities, the BP neural network is well suited to bus travel time prediction, and its prediction accuracy is higher than that of the Kalman filter, support vector machine, and other models. However, the BP neural network converges slowly and often stops searching once it reaches a nonglobal optimal solution. Therefore, the firefly algorithm (FA), which has global search ability and fast convergence, is used to optimize the BP neural network model.

In addition, this paper makes two improvements to the FA. (1) When there is no better solution within the dynamic decision domain of a firefly, the firefly is iterated and updated according to the guided movement strategy. (2) The step size is made adjustable: it is no longer a fixed value but changes during the solving process. These improvements increase the accuracy, convergence rate, and search stability of the FA.

3.2.1. Guided Movement Strategy

The guided movement strategy means that, during the iteration process, the optimal firefly individual is identified and its position is taken as the current optimal position. Each firefly then updates its position according to a rule in which the position of firefly $i$ at time $t+1$ is obtained from its position at time $t$, the adaptive step size, the position of the optimal firefly at time $t$, and the range of the dynamic decision domain of firefly $i$.

The optimal firefly thus serves as a guide. If there is no better individual within the dynamic decision domain of firefly $i$, firefly $i$ updates its position in the direction of the optimal firefly, which greatly improves the solving rate of the firefly algorithm. If firefly $i$ is itself the optimal solution in the current iteration, it uses a smaller step size for its position update, so as to search for solutions with higher precision.
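The guided movement can be sketched as below. The update used here, a move of length `step` along the unit vector toward the target, is an assumption borrowed from common glowworm-style formulations and is not the exact formula of this paper; the names `guided_move` and `update_firefly` and the rule for picking a better neighbor are likewise hypothetical.

```python
import math


def guided_move(position, target, step):
    """Move a firefly a distance `step` along the direction of `target`.

    Assumption: the move follows the unit vector from the current position
    to the target position.
    """
    diff = [t - p for p, t in zip(position, target)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist == 0.0:
        return list(position)  # already at the target position
    return [p + step * d / dist for p, d in zip(position, diff)]


def update_firefly(position, better_neighbors, best_position, step):
    """Follow a better neighbor if one exists, otherwise the best firefly."""
    if better_neighbors:
        # Placeholder choice; the neighbor selection rule is not specified here.
        target = better_neighbors[0]
    else:
        # No better individual in the decision domain: use the guide.
        target = best_position
    return guided_move(position, target, step)


print(update_firefly([0.0, 0.0], [], [3.0, 4.0], step=1.0))
```

In this sketch a firefly with an empty list of better neighbors is pulled toward the best position instead of remaining stationary, which is the behavior the guided movement strategy is meant to provide.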

3.2.2. Adaptive Step Size Moving Strategy

Because the step size in the standard firefly algorithm is constant, the search may settle on a nonglobal optimal solution or oscillate repeatedly around a solution. To solve this problem, an adaptive step size is proposed to replace the fixed step size. In the early iterations, the step size is kept large to improve the solving speed and to search for the global optimal solution. In the later iterations, the step size is kept small to suppress oscillation and improve the accuracy of the solution. The step sizes of the early and late iteration processes are determined according to equations (6) and (7), respectively, in which the adaptive step size depends on a constant parameter.
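The idea of a step size that is large early and small late can be sketched with a simple schedule. The linear decay below, the bounds `step_max` and `step_min`, and the function name `adaptive_step` are assumptions chosen purely for illustration; they do not reproduce equations (6) and (7).

```python
def adaptive_step(iteration, max_iterations, step_max=1.0, step_min=0.01):
    """Step size that shrinks from step_max to step_min over the run.

    Assumption: linear decay, used only to illustrate the large-then-small
    behavior described in the text.
    """
    if max_iterations <= 1:
        return step_min
    fraction = iteration / (max_iterations - 1)  # 0 at the start, 1 at the end
    return step_max - (step_max - step_min) * fraction


# 50 iterations, matching the number of iterations used later in the paper.
steps = [adaptive_step(t, 50) for t in range(50)]
print(steps[0], steps[-1])
```

Any monotonically decreasing schedule gives the same qualitative effect: broad exploration at the start and fine adjustment near the end.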

Figure 7 shows the specific steps of the BP neural network model optimized by the firefly algorithm.

3.3. Determination of Model Input Variables

The input data of the proposed model are determined based on relevant literature (Xie [23], Lin [24], Peng and Weng [25]), historical data on the vehicle type, SOC value, battery age, and time period of pure electric buses, and real-time bus operation data. The arrival time of pure electric buses is taken as the output target. The specific parameter settings are as follows.

The first three input variables represent the historical operation time over the section between the two stations concerned, taken from the vehicle whose departure time from the first station of the line is closest, on the same day of the week and in the same time period, in the previous week, two weeks before, and three weeks before, respectively.

The next input variable represents the vehicle type, which is expressed as a value in {1, 2, 3}: 1 means a bus with a mass ≥18000 kg, 2 means a bus with a mass of 7400 kg∼18000 kg, and 3 means a bus with a mass ≤7400 kg.

The next input variable represents the SOC, which is expressed as a value in {1, 2, 3}: 1 means that the SOC value is in the range of 0∼0.4, 2 means that the SOC value is in the range of 0.4∼0.6, and 3 means that the SOC value is in the range of 0.6∼1.

The next input variable represents the battery age, which is expressed as a value in {1, 2, 3, 4}: 1 means that the battery age is 0∼0.5 years, 2 means 0.5∼1.5 years, 3 means 1.5∼3 years, and 4 means more than 3 years.

The last input variable represents the time period, which is expressed as a value in {1, 2}: 1 means the peak time period of 7:00–9:00 and 17:00–19:00, and 2 means the off-peak time period of 9:00–17:00.

The input variable set therefore consists of these seven quantities, which are needed to predict the arrival time of the pure electric bus.

The training data set pairs each input vector with its target output for each of the $n$ training samples, where the target output is the service time of a pure electric bus over the section between the two stations concerned.
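The category coding described above can be sketched as a small set of helper functions. The function names are hypothetical, and two points are assumptions because the text does not settle them: values that fall exactly on a shared boundary (for example, an SOC of 0.4) are assigned to the higher SOC class and the lower battery age class, and hours outside the stated peak windows are all treated as off-peak.

```python
def encode_vehicle_type(mass_kg):
    """1: mass >= 18000 kg, 2: between 7400 and 18000 kg, 3: mass <= 7400 kg."""
    if mass_kg >= 18000:
        return 1
    if mass_kg > 7400:
        return 2
    return 3


def encode_soc(soc):
    """1: SOC 0-0.4, 2: SOC 0.4-0.6, 3: SOC 0.6-1 (boundary handling assumed)."""
    if not 0.0 <= soc <= 1.0:
        raise ValueError("SOC must be between 0 and 1")
    if soc < 0.4:
        return 1
    if soc < 0.6:
        return 2
    return 3


def encode_battery_age(years):
    """1: 0-0.5 years, 2: 0.5-1.5 years, 3: 1.5-3 years, 4: more than 3 years."""
    if years <= 0.5:
        return 1
    if years <= 1.5:
        return 2
    if years <= 3:
        return 3
    return 4


def encode_time_period(hour):
    """1: peak (7:00-9:00, 17:00-19:00); 2: all other hours (assumption)."""
    return 1 if 7 <= hour < 9 or 17 <= hour < 19 else 2


def build_input(hist_1w, hist_2w, hist_3w, mass_kg, soc, battery_years, hour):
    """Assemble the seven-element input vector for one prediction."""
    return [
        hist_1w, hist_2w, hist_3w,
        encode_vehicle_type(mass_kg),
        encode_soc(soc),
        encode_battery_age(battery_years),
        encode_time_period(hour),
    ]


# Hypothetical sample: three historical times (s), a 12000 kg bus, SOC 0.55,
# a 2-year-old battery, departure in the 8:00 hour.
print(build_input(210.0, 205.0, 220.0, 12000, 0.55, 2.0, 8))
```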

3.4. Input Data Processing

This paper normalizes the data using the following formula:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x'$ represents the value after normalization, $x$ represents the value of the training data before normalization, $x_{\max}$ represents the maximum value of the training data, and $x_{\min}$ represents the minimum value of the training data.
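A minimal sketch of this normalization is given below, assuming the common min-max form that maps the smallest training value to 0 and the largest to 1. The function names and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; the range is zero")
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(normalized_value, lo, hi):
    """Map a normalized value back to the original scale."""
    return normalized_value * (hi - lo) + lo


# Hypothetical operation times in seconds; not the paper's data.
times = [180.0, 210.0, 240.0]
print(min_max_normalize(times))
```

The inverse mapping is needed after prediction, because the network output is on the normalized scale and has to be converted back to a time.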

3.5. Selection of Activation Function

Figure 8 shows that the value ranges of the bipolar S-type function and the S-type function are different. The S-type function has a minimum value of 0 and a maximum value of 1, so its value range is 0∼1. The bipolar S-type function has a minimum value of −1 and a maximum value of 1, so its value range is −1∼1. Because the bus travel time cannot be negative, this paper chooses the S-type function as the activation function of the BP neural network.

S-type function:

$$f(x) = \frac{1}{1 + e^{-x}} \qquad (9)$$

3.6. Determination of the Number of Model Nodes

The number of nodes in the input layer equals the number of input variables, so the input layer has 7 nodes. In this model, the operation time of the pure electric bus is the only output, so the output layer has 1 node. The number of nodes in the hidden layer is determined by the following formula:

$$n_h = \sqrt{n_i + n_o} + a$$

where $n_h$ represents the number of nodes in the hidden layer, $n_i$ represents the number of nodes in the input layer, $n_o$ represents the number of nodes in the output layer, and $a$ is a constant between 0 and 10.

Through repeated training, when the number of hidden layer nodes is 9, the prediction accuracy of the model is highest.
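The candidate range of hidden layer sizes can be enumerated with a short sketch. It assumes the widely used empirical rule in which the hidden node count is the square root of the sum of input and output nodes plus a constant, that the constant takes integer values from 0 to 10, and that the result is rounded to the nearest integer; the function name is hypothetical.

```python
import math


def hidden_node_candidates(n_input, n_output, a_min=0, a_max=10):
    """Candidate hidden layer sizes from sqrt(n_input + n_output) + a."""
    base = math.sqrt(n_input + n_output)
    # Assumption: a is an integer and the result is rounded to the nearest integer.
    return sorted({round(base + a) for a in range(a_min, a_max + 1)})


candidates = hidden_node_candidates(7, 1)
print(candidates)
```

Each candidate size is then tried in training, and the size with the best prediction accuracy is kept; the value 9 reported above lies inside this candidate range.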

4. Example Analysis

4.1. Data Sources

Operation data for bus Route 321 from the Fuzhou bus transportation dispatching management platform are used to verify and analyze the pure electric bus arrival time prediction model and algorithm constructed in the previous sections. Fuzhou bus Route 321 has a total length of 24 km, starting from the Bus University City Town Terminus and ending at Yuandonglijing Station. The specific bus route is shown in Figure 9. The route passes through Cangshan Wanda Plaza, Taijiang Pedestrian Street, the provincial skin hospital, and other stations with heavy passenger flow, and it is the main route by which students in the university town area travel to the city. Therefore, data from this line are of considerable reference value for verification.

4.2. Training Model

This paper collects 100 groups of sample data: 80 groups are used for model training, and the other 20 groups are used for verification. The data are normalized before being input and are then imported into MATLAB for training. Table 13 shows part of the training data samples.

MATLAB is used to implement the FA-BP prediction model. The important parameters of the model are set as follows: the population size is 50, the number of iterations is 50, the global parameter is 2, and the local search capability parameter is also 2.

Running the code in MATLAB yields the fitness curve. Figure 10 shows the fitness of the FA-BP model over training iterations 0–50. The model converges rapidly at first; after about 10 iterations, the fitness curve changes gently, indicating that the model already has good prediction performance after more than 10 iterations. At about 35 iterations, the network converges and reaches its optimal fitness. Compared with the traditional neural network model, the FA-BP model has a faster convergence speed and a smaller convergence error.

4.3. Construction Model

With the S-type function (formula (9)) selected as the activation function, the firefly algorithm is used to optimize the weights and thresholds of the BP neural network, so that the four input variables of vehicle type, SOC value, battery age, and time period are transmitted to the output layer through formulas (2)–(4), thus building the bus running time prediction model given by formula (4). The connection weights and thresholds of the FA-BP prediction model are shown in Table 14, and their specific meanings are given in Section 3.1.

4.4. Model Verification

The Kalman filter model, BP neural network model, and FA-BP model are used to predict the 20 groups of normalized sample data. The comparison between the predicted output values and the actual values, together with the prediction errors, is shown in Figures 11–13, where the prediction error is the predicted travel time of pure electric buses minus the actual value.

From Figure 11, the largest prediction error of the Kalman filter model is −0.4899 min (−29.4 s), and the smallest is −0.0093 min (−0.56 s). From Figure 12, the largest prediction error of the BP neural network model is −0.0888 min (−5.3 s), and the smallest is −0.0177 min (−1.1 s). From Figure 13, the largest prediction error of the FA-BP model is −0.0998 min (−6.0 s), and the smallest is 0.0010 min (0.06 s). Both neural network models have much smaller errors than the Kalman filter model, and the FA-BP model has the smallest minimum error. It can therefore be preliminarily judged that, among the three prediction models, the FA-BP model has the highest accuracy in predicting the arrival time of pure electric buses.

The root mean square error (RMSE), also called the standard error, reflects the overall prediction performance of a model. It is calculated by the following formula:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$

where RMSE is the root mean square error, $n$ is the number of samples, $y_i$ is the actual value, and $\hat{y}_i$ is the predicted value.

The larger the RMSE, the greater the error between the actual value and the predicted value and the lower the prediction accuracy of the model; conversely, the smaller the RMSE, the higher the prediction accuracy. The RMSE of each of the three models can be calculated by this formula. Table 15 shows that the RMSE of the Kalman filter model is the largest, 0.351, indicating that the Kalman filter model has the lowest prediction accuracy. The RMSE of the FA-BP model is the smallest, 0.04, indicating that the FA-BP model has the highest prediction accuracy.
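The root mean square error can be computed with a few lines of code. The sketch below uses the standard definition, the square root of the mean squared difference between actual and predicted values; the function name `rmse` and the sample values are hypothetical and are not the data of this paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical normalized travel times; not the paper's data.
actual = [0.42, 0.55, 0.61, 0.38]
predicted = [0.40, 0.58, 0.60, 0.41]
print(round(rmse(actual, predicted), 4))
```

Because the errors are squared before averaging, a single large error raises the RMSE more than several small ones, which is why it is used here as an overall accuracy measure in addition to the largest and smallest individual errors.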

In terms of running time, the Kalman filter model, BP neural network model, and FA-BP model took 12.76 s, 9.14 s, and 139.71 s, respectively. The running times of the Kalman filter model and BP neural network model are very short, but the Kalman filter model needs more historical bus arrival time data to ensure prediction accuracy. The FA-BP model spends additional time on the firefly algorithm optimization, so its total running time is longer than that of the traditional BP neural network model. In exchange, it improves prediction accuracy, so it is suitable for bus arrival prediction applications that require higher accuracy.

5. Discussion

The accuracy of bus arrival time prediction is related to the bus service level and operation reliability. This paper proposes an improved BP neural network algorithm for pure electric buses. The firefly algorithm, improved with the guided movement strategy and the adaptive step size moving strategy, is used to optimize the BP neural network, which improves the global search and convergence ability of the network and gives it a better prediction effect on the arrival time of pure electric buses.

Taking bus Route 321 in Fuzhou as an example, and based on an analysis of the factors influencing the arrival time of pure electric buses, the vehicle type, SOC value, battery age, and time period are selected as input variables. Pure electric bus operation data are used to train and validate the model, and its predictions are compared with those of the Kalman filter and BP neural network models. The results show that the root mean square error is 0.351 for the Kalman filter model, 0.059 for the BP neural network model, and 0.04 for the FA-BP model, indicating that the model built in this paper effectively improves prediction accuracy and has good reliability and feasibility.

Data verification shows that the model established in this paper has higher prediction accuracy. Owing to limitations of research time, research conditions, and the authors' ability, some problems in the paper still need further research and improvement. The FA-BP prediction model ignores the influence of road conditions and the environment, yet abnormal operating conditions such as emergencies and road construction affect the prediction accuracy of bus arrival time, and there has been little research on this aspect so far. In addition, the running time of the FA-BP model is longer than that of the BP neural network model, so the choice of model should depend on the actual application. Future work can classify road emergencies, further study their impact on bus travel time, and further optimize the algorithm to reduce the overall running time of the model.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

The authors thank the Fuzhou Public Transport Group Co., Ltd. for providing some of the required data. The authors would like to thank the teachers and graduate students of the traffic engineering department at Fuzhou University for their help.