Abstract
With the rapid development of high-speed railways, the continuous improvement of the road network, and the sharp increase in demand for cross-line passenger travel, connectivity between cities is increasing. The network has gradually evolved from a single-line structure into a complex network, and the routes connecting city nodes are no longer unique. There are now multiple feasible routes between origins and destinations on the network, and the choice of train routes has diversified. Different routing options for cross-line trains affect line and station operations to different degrees. Therefore, this study constructs a combined deep learning CNN-GRU model to predict the traffic speed of railway logistics. The model uses a convolutional neural network (CNN) to extract the spatial characteristics of the speed data and a gated recurrent unit (GRU) to extract their temporal characteristics. The experiments show that the combined model predicts more accurately than the single GRU and CNN models. Finally, a multi-objective optimization model that minimizes total train running distance, total train running time, and the impact on road sections is constructed using the prediction results.
1. Introduction
The growing number of lines on the high-speed railway network has increased accessibility between cities [1]. Given the complex structure of the road network and the distribution of urban nodes, passenger flows have to cross over between different high-speed railway lines to achieve direct access between these cities [2]. A large proportion of cross-line logistics is one of the most important characteristics of railway logistics transport, and this proportion will continue to increase with the development of high-speed railways [3]. With such a large demand for cross-line passenger transport, the operation of cross-line trains is of great importance to the organization of high-speed railway transport, especially in meeting the demand for cross-line passenger travel [4]. Machine learning (ML), the study of computer algorithms that improve automatically through experience and the use of data, is the main tool of this paper. In this study, ML is used to predict traffic speed, and the predictions are then used by an ant colony algorithm for route optimization.
In addition, with the advent of the low-carbon era and the active promotion of relevant national policies, the existing transport structure will be substantially adjusted, and more bulk goods will be transported by rail [5]. A large volume of goods will be moved through railway logistics, and optimizing railway transport is an important means of improving the competitiveness of goods in a large commodity market. By optimizing the transport structure, railway enterprises can improve the efficiency of logistics transport and reduce transport costs in the logistics industry. This promotes the growth of logistics transport enterprises and supports a healthy logistics transport market, which in turn accelerates the development of enterprises across the logistics industry. In a rapidly developing market with increasingly fierce competition, railway logistics may operate at a loss if the problem of railway cargo collection is not solved. Railway transport enterprises must therefore resolve the difficulty of cargo collection, address their own transport shortcomings, and become more attractive to customers in order to expand their business and develop sustainably.
This paper is organized as follows:
Section 1 introduces the study and describes its main contribution. Section 2 reviews related work. Section 3 describes the methods used in the experiments. Section 4 presents the data and the experiments performed on them. Section 5 concludes the study.
2. Related Works
This section reviews work related to our research. The current status of research on transport optimization algorithms is discussed first, covering operations research methods, artificial intelligence methods, and simulation methods. Railway cargo logistics in cities is then described.
2.1. Current Status of Transport Optimization Algorithm Research
The vehicle routing problem is recognized worldwide as a classical problem. The rail transport path problem is affected by many factors and is difficult to handle, and there is a growing body of research related to it. Research on train operation adjustment focuses on adjustment methods, which in practice commonly include operations research methods, artificial intelligence methods, and simulation methods.
2.1.1. Operations Research Methods
In the process of path optimization, a mathematical model needs to be established after certain conversions and then mathematically solved through software tools. There are already many train operation adjustment models based on this approach, and they have been used with great effect in practice.
Szpigel analyzed single-track railways, introduced the concept of “optimal train scheduling,” and applied the branch-and-bound algorithm to solve the resulting train operation adjustment model. The results have a high reference value [6]. Saud compared existing train operation adjustment and optimization techniques from a mathematical perspective and, after abstracting the constraints of single-track railways, enumerated all possible paths with the branch-and-bound method [7]. Tornquist approached the problem from the perspective of train scheduling, analyzed the relevant influencing factors, and solved the train operation adjustment problem through a mathematical model [8]. COrnlari established a double-layer planning model for train operation adjustment after appropriate simplification and surveyed the transport conditions. Boundary conditions were then added for each region, and a locally feasible solution under these constraints was computed by software [9]. Castillo analyzed the characteristics and requirements of mixed single-track and dual-track railway networks, replaced the departure time with the expected departure time, and determined the maximum of the objective function with a segmentation method [10]. Brannlund applied the Lagrange method to the train operation adjustment problem and gave solutions based on an analysis of station capacity limits [11]. Araya introduced a 0-1 mixed integer programming formulation and, after preprocessing, used the branch-and-bound method to obtain the optimal solution. In practical applications, the advantage of this method is that it can efficiently perform online scheduling analysis under certain disturbance conditions [12].
Ariano discussed the problem of network-wide train operation adjustment in detail and established a corresponding model after an appropriate transformation of the problem [13]. The model does not need to consider the inventory constraint; it is optimized for the minimum deviation from the operation chart and is solved by the branch-and-bound method. Overall, the adjustment problem is combinatorial in nature, so to avoid the poor real-time performance and low efficiency of traditional methods, some scholars have introduced heuristic algorithms and achieved good results.
Jovanovic introduced a “heuristic acceleration algorithm based on lower bounds” for train running path adjustment, with the total delay cost as the objective function. The algorithm uses a heuristic search function to guide the search, which meets the efficiency requirements well [14]. Sahlri addressed the resolution of conflicts between trains and, on this basis, established a heuristic algorithm for path adjustment that meets the final scheduling requirements by optimizing the crossing plans [15].
2.1.2. Artificial Intelligence Methods
These methods can be divided according to their nature into artificial neural networks, expert system methods, fuzzy decision-making methods, and several others, each with its own advantages, disadvantages, and range of applicability.
The expert system method is based on the inference mechanism, and the scheduling optimization is based on the inference results. Expertise is presented in if-then form and programmed accordingly. Kataoka established a DLAPLAN system in the 1990s, which applied an expert system for path planning and analysis and had high applicability [16].
The fuzzy expert system method combines fuzzy analysis with an expert system. Because experts weight the rules differently in practice, fuzzy reasoning is introduced to improve the accuracy of the system's results. Dundar applied combinatorial optimization methods to this problem, which significantly reduced the difficulty of handling scheduling problems [17]. In building the scheduling model, a neural network was used to optimize the scheduling knowledge. Malavasi analyzed the road network conditions in Italy and studied train operation adjustment under given path constraints. After transforming the problem, he analyzed it with a neural network model and established an intelligent simulation system, which has a high application value in simulating train operation adjustment [18]. Ho studied real-time train operation conflicts and compared the genetic algorithm, simulated annealing, and Tabu search on this problem, obtaining valuable results [19]. Semet analyzed train schedule adjustment under disturbance conditions, set the minimum total train delay time as the objective, and solved the problem with a permutation-based evolutionary algorithm; the results have a certain application value [20]. Tazoniero analyzed the entire railway network, simplified the resulting fuzzy control knowledge system, and worked out an operation plan based on this system [21].
2.1.3. Simulation Method
Vansteen established a standard linear programming model with the generalized waiting cost as the objective function. Simulation analysis was carried out by software, and the results showed that the reliability of this method met the requirements [22]. The Shinkansen train operation simulation system applies a man-machine interaction mode to train operation adjustment: for train running disturbances, it sets up different optimal combination strategies, solves the mathematical model by computer simulation to obtain the optimal scheme, and can basically meet the real-time requirements of path optimization [23]. Dorfman chose the discrete event system method and proposed a two-track section train model whose objective function is the total running time of the train [24]. Corman carried out a similar study, which optimized the combination by adding or removing trains and changing the train running path and adjusted the train running mode accordingly [25].
2.2. Railway Cargo City Logistics
The railway freight station is a station that mainly handles freight business, and freight operations are carried out through it. Because the volume of railway transport business is large, operations take a long time and generate considerable noise, so the cargo station must be planned according to the distribution and configuration of the city's receiving points. A cargo station performs the following functions. Freight production organization and management: the station dispatches and transfers cargo, carries out transshipment with other modes of transport, handles loading, unloading, and distribution, and supports cargo stowage. To determine a sound transport plan, cargo stations arrange freight trains according to cargo volume and transport time to meet the demand for railway freight. Source organization and management: through the organization and management of freight production in daily operation, the freight station gathers information on the sources of goods and, after analysis, selects a suitable freight mode. Participation in freight market management: the station works with cargo owners, freight forwarders, and road management departments to carry out collaborative management, meet the needs of cargo owners and other participants, and promote the optimization of logistics organization.
A railway receiving station is an extension of a large cargo station: a logistics service center that mainly handles freight business and serves as the terminal for receiving and delivering goods. If there is no receiving point, customers must deliver the goods to the railway cargo station themselves, as shown in Figure 1. In this case, distribution routes are long and cross one another, which leads to low efficiency and low benefit, and a large amount of resources is consumed in transport. If receiving points are set up and distribution is unified through them, these disadvantages can be greatly reduced and the efficiency of urban logistics and transport improved, as shown in Figure 2.


3. Methods
The speed of railway logistics is affected by many factors, including the speed of upstream and downstream of the road, the speed of historical period, and the cyclical speed. Usually, one method can only mine one type of influencing factor. Combining the algorithms of mining different influencing factors can better select the influencing factors and improve the prediction accuracy. With the continuous development of deep learning, traffic speed prediction is also gradually introducing deep learning to explore influencing factors. At present, mainstream deep learning algorithms include CNN, RNN, and LSTM.
3.1. Basic Principles of CNN
Generally, the basic structure of a CNN consists of two kinds of layers [26]. The first is the feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer, from which local features are extracted. Once a local feature has been extracted, its positional relationship to other features is also determined. The second is the feature mapping layer. Each computing layer of the network is composed of multiple feature maps; each feature map is a plane, and all neurons in a plane share the same weights. A sigmoid function can be used as the activation function of the feature maps, which gives them shift invariance.
A convolutional neural network usually includes an input layer, convolutional layers, pooling layers, and a fully connected layer, as shown in Figure 3. The key to the success of CNNs lies in local connections and shared weights. Reducing the number of weights makes the network easier to optimize and also reduces the risk of overfitting.

Figure 3 shows the structure of the CNN: the input passes through three convolutional layers, is then downsampled, and is finally processed to produce the output.
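To make local connections and shared weights concrete, the following minimal sketch applies one convolution kernel and one pooling step to a short speed sequence. It is an illustration only: the speed values, the kernel, the pooling window, and the function names `conv1d` and `max_pool1d` are assumptions and are not taken from the model in this study.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation form used in CNNs).

    The same kernel weights are reused at every position, which is the
    weight sharing described in the text.
    """
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def max_pool1d(features, size):
    """Non-overlapping max pooling: keep the largest value in each window."""
    return [
        max(features[i:i + size])
        for i in range(0, len(features) - size + 1, size)
    ]


# Hypothetical hourly speeds (km/h) on one road section.
speeds = [60.0, 62.0, 58.0, 40.0, 42.0, 61.0, 63.0]
# Assumed difference kernel: responds to a change between neighbouring hours.
kernel = [-1.0, 1.0]

feature_map = conv1d(speeds, kernel)   # one value per local window
pooled = max_pool1d(feature_map, 2)    # down-sampled feature map
```

However long the input is, this layer has only two trainable weights, which is why shared weights keep the parameter count small.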
3.2. Principle of GRU Algorithm
As relationships between data become more diverse, traditional neural networks are unable to adequately capture the relationships within time-series data, which limits their prediction performance. The recurrent neural network (RNN), a class of neural network designed specifically for sequential data, has three layers: an input layer, a hidden layer, and an output layer. At each time step, the hidden layer not only feeds the output layer but also passes a hidden state to the next time step, which allows it to preserve historical information. The structure is shown in Figure 4.

Figure 4 shows the internal structure of the RNN, folded on the left and unfolded on the right. The arrow on the left represents the “loop” in the hidden layer. As the unfolded structure shows, the hidden-layer neurons at successive time steps are connected to each other, so earlier hidden states affect later ones as the sequence progresses. In practice, however, RNNs lose the ability to connect information across long time spans; i.e., they suffer from the long-term dependency problem.
The LSTM is a special type of RNN designed to address the long-term dependency problem. It does so through “cell states,” which are analogous to conveyor belts: the cell state runs along the entire chain with only a few linear interactions, so information can be carried forward largely unchanged.
The gated recurrent unit (GRU) [27], a variant of the LSTM, was likewise created to overcome the inability of RNNs to handle long-range dependencies. The GRU combines the forget and input gates of the LSTM into a single update gate, preserving the effect of the LSTM while simplifying the structure; i.e., the GRU consists mainly of an update gate and a reset gate, as shown in Figure 5. It achieves results similar to those of the LSTM with fewer parameters and faster, less costly training.
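The interaction of the update gate and the reset gate can be sketched for a single scalar hidden state. This is a minimal illustration of one common GRU formulation, not the implementation used in this study; the weight names in the dictionary `p` are hypothetical, and other sign conventions for the update gate exist.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, p):
    """One GRU update for a scalar input x and a scalar hidden state h_prev.

    p is a dict of weights; the key names are illustrative assumptions.
    """
    # Update gate: how much of the new candidate replaces the old state.
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev + p["b_z"])
    # Reset gate: how much of the old state feeds the candidate.
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])
    # Candidate state built from the input and the reset-scaled old state.
    h_cand = math.tanh(p["w_h"] * x + p["u_h"] * (r * h_prev) + p["b_h"])
    # Blend the old state and the candidate (one common sign convention).
    return (1.0 - z) * h_prev + z * h_cand


# With all weights zero, both gates equal 0.5 and the candidate is 0,
# so the new state is half of the previous one.
zero = {k: 0.0 for k in ("w_z", "u_z", "b_z", "w_r", "u_r", "b_r",
                         "w_h", "u_h", "b_h")}
h_next = gru_step(x=1.0, h_prev=0.8, p=zero)
```

The GRU has two gates and no separate cell state, so it needs fewer weight sets than an LSTM of the same width.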

3.3. CNN-GRU-Based Model for Railway Traffic Speed Prediction
In forecasting problems such as traffic travel, the complexity of the data and of the influencing factors needs to be analyzed before a forecasting model can be built on the patterns of change in the data. Different forecasting models can be built for the same problem from different perspectives. They take different inputs, extract different information, and capture different patterns. The traffic speed of rail transport is significantly correlated in both time and space and is influenced by many factors, so a combined model, which pools the feature-capturing capabilities of different models, can account for more of these factors and provide better predictive power.
Correlation analysis of rail transport speed showed a high correlation between the speed on a road and its speed at previous moments; i.e., the influence of the time factor is obvious, so the GRU model, which can capture the temporal characteristics of the data, was chosen for prediction. The analysis also found spatial features among the influencing factors, such as the relationship between the upstream and downstream sections of a road. To improve the prediction accuracy, this study uses a CNN to extract the spatial features of the data and a GRU to extract the temporal features, and the outputs of the two models are fused through a fully connected layer to form the combined model.
The specific process of the model is shown in Figure 6 and consists of the following steps:
(i) Step 1: divide the data into a training set and a test set and normalize them.
(ii) Step 2: train the model. The data are used as input to the model. The CNN module captures the features of local trends, with the normalized dataset as input and the train speed at the next moment as output. Meanwhile, the GRU module captures the long-term time dependence, with the traffic speed at the next moment as the prediction target. The outputs of the CNN and GRU modules are then fused in the merge layer of the feature fusion module, and the final prediction is produced by the fully connected layer.
(iii) Step 3: denormalize the data to obtain the output of the combined model.
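The fusion at the end of Step 2 can be sketched as a merge layer followed by a one-unit fully connected layer. The sketch assumes that the merge is a simple concatenation, which the text does not state explicitly; the function name `fuse_and_predict`, the feature vectors, and the weights are all hypothetical.

```python
def fuse_and_predict(cnn_features, gru_features, weights, bias):
    """Merge layer followed by a one-unit fully connected layer.

    The two feature vectors are concatenated (assumed merge rule) and
    mapped to a single normalized speed value by a weighted sum.
    """
    merged = list(cnn_features) + list(gru_features)   # merge layer
    if len(merged) != len(weights):
        raise ValueError("one weight per merged feature is required")
    return sum(w * f for w, f in zip(weights, merged)) + bias


# Hypothetical branch outputs and weights, for illustration only.
cnn_out = [0.2, 0.4]   # spatial features from the CNN branch
gru_out = [0.6]        # temporal feature from the GRU branch
y_norm = fuse_and_predict(cnn_out, gru_out, weights=[0.5, 0.5, 1.0], bias=0.0)
```

The value `y_norm` is still on the normalized scale; Step 3 maps it back to a speed in the original units.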

4. Experimental Results and Analysis
This section describes and analyzes the experimental results. First, the data are prepared; then, the CNN-GRU model is implemented and validated. Finally, railway route optimization is discussed.
4.1. Data Preparation
The model first predicts the speed of daily rail traffic. After data mining and analysis, roads and trajectories are connected through map matching, each hour is treated as one time period, the speed in each time period is computed, and the correlation of the influencing factors is analyzed. The data were divided into 12 time periods, with training set : validation set : test set = 7 : 1 : 2; the training data cover November 1 to November 29, and the test set consists of the speed data of November 30.
Data normalization: the original data are mapped onto the interval [0, 1] by a linear transformation. The conversion function is as follows:
$$x' = \frac{x - \min}{\max - \min}, \tag{1}$$
where $x$ is an original value, $x'$ is the normalized value, max is the maximum value of the sample data, and min is the minimum value of the sample data.
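The min-max mapping, (x − min)/(max − min), and its inverse, which is needed when the model output is denormalized, can be sketched as follows. The function names and sample values are illustrative assumptions.

```python
def min_max_normalize(values):
    """Map values linearly onto [0, 1]; also return (min, max) for inversion."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant data cannot be min-max normalized")
    return [(v - lo) / (hi - lo) for v in values], (lo, hi)


def denormalize(scaled, bounds):
    """Invert min_max_normalize using the stored (min, max) pair."""
    lo, hi = bounds
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical speeds in km/h.
speeds = [40.0, 50.0, 80.0]
scaled, bounds = min_max_normalize(speeds)
restored = denormalize(scaled, bounds)
```

A common precaution is to compute min and max on the training set only and reuse them for the test set, so that no information from the test data leaks into training.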
4.2. Implementation and Validation of a Combined CNN-GRU Model
The combined model uses a CNN to extract the spatial features of the data and a GRU model to extract the temporal features. Its construction consists of three main parts: model input, model training, and model validation.
4.2.1. Model Input
Following the above analysis and the input configuration of the GRU model, the input of the combined model is again the speed and flow rate of the previous three time slices, those of the upstream and downstream roads, and the historical speed in the same time period on working days, for a total of 31 variables.
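How inputs built from the previous three time slices can be assembled is sketched below for a single speed series. This covers only one of the 31 input variables; the function name `make_lagged_samples` and the sample values are assumptions, and the upstream, downstream, flow, and historical features are not reproduced here.

```python
def make_lagged_samples(series, n_lags=3):
    """Turn a time series into (inputs, target) pairs.

    Each input holds the n_lags previous values; the target is the next one.
    """
    samples = []
    for t in range(n_lags, len(series)):
        samples.append((series[t - n_lags:t], series[t]))
    return samples


# Hypothetical hourly speeds (km/h) on one road.
hourly_speed = [60.0, 62.0, 58.0, 40.0, 42.0]
samples = make_lagged_samples(hourly_speed)
```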
4.2.2. Model Training
The number of convolutional kernels and the step size are specified when constructing the CNN; the model then generates the convolutional kernels automatically and extracts features. As in the separate CNN model, the CNN in the combined model has 2 convolutional layers. After convolution and pooling, a dropout operation is introduced in the combined model to randomly remove some neurons from the network in order to avoid overfitting caused by the large number of neurons. For the GRU model, the batch size is set to 128 based on empirical values. The activation function is the function. is set to 60, and is set to 0.01. The training set fit is shown in Figure 7: the fit between the training-set predictions and the true values is relatively stable, so the model can be used for prediction on the test set.

4.2.3. Model Validation
The trained model was validated with a test set, and Figure 8 shows the comparison between the predicted and true values of a selected road. It can be seen that there is some error in the fit of the model, but the combined model has some improvement over the single GRU model and the CNN model.

4.3. Railway Logistics Route Optimization
4.3.1. Model Construction
In general, route optimization problems are modeled on the traveling salesman problem (TSP), in which a traveler must visit n cities, visiting each city exactly once and finally returning to the starting point. The optimization objective is to minimize the total length of the tour. The route optimization in this study takes the needs of rail logistics into account, considering not only travel distance but also travel time; i.e., it minimizes the cost incurred by the traveler subject to the travel distance and travel time requirements. The calculation model is given in equations (2)–(6).
With the following conditions,where represents the cost incurred for one kilometer traveled, represents the penalty cost, represents the length of the road , the speed at different times, represents the time period traveled, is the road number, is the total number of roads, is the time traveled, and is the specified optimization time period.
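The quantities listed above (cost per kilometer, penalty cost, road length, time-dependent speed, travel time, and the specified time period) suggest a cost that combines distance with a penalty on time. The sketch below is one plausible reading under stated assumptions and is not a reconstruction of equations (2)–(6): it assumes the penalty applies only to the time exceeding the specified limit, and every name and value in it is hypothetical.

```python
def route_cost(lengths_km, speeds_kmh, cost_per_km, penalty_per_hour,
               time_limit_h):
    """Distance cost plus a penalty for exceeding the allowed travel time.

    Assumed form for illustration; the authors' exact model is not shown.
    """
    distance = sum(lengths_km)
    # Travel time on each road is its length divided by its predicted speed.
    travel_time = sum(d / v for d, v in zip(lengths_km, speeds_kmh))
    overtime = max(0.0, travel_time - time_limit_h)
    return cost_per_km * distance + penalty_per_hour * overtime


cost = route_cost(
    lengths_km=[100.0, 50.0],
    speeds_kmh=[100.0, 50.0],   # predicted speed on each road
    cost_per_km=2.0,
    penalty_per_hour=30.0,
    time_limit_h=1.5,
)
```

Here the route is 150 km long and takes 2 h, so the distance cost is 300 and the half hour beyond the limit adds a penalty of 15.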
4.3.2. The Basic Ant Colony Algorithm
The main steps of the ant colony algorithm in solving the route optimization problem are as follows:
(i) Step 1: initialize the algorithm parameters, including the number of ants, pheromone factor, heuristic function factor, pheromone volatility factor, pheromone constant, and maximum number of iterations; read the data into the program and preprocess them.
(ii) Step 2: place the ants randomly at different starting points, calculate the transfer probability of the next city for each ant, and complete the transfer until the ants have visited all cities.
(iii) Step 3: calculate the length of the path traveled by each ant, record the optimal solution for the current iteration, and update the pheromone concentration on the paths.
(iv) Step 4: determine whether the algorithm has reached the maximum number of iterations; if not, return to Step 2; if yes, finish.
(v) Step 5: output the result and, if required, the relevant metrics of the optimization search process, such as running time, running path, and iteration cost.
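The calculation of the transfer probability and the updating of the pheromone are the key steps of the ant colony algorithm. The transfer probability, i.e., the probability that ant $k$ moves from the current city $i$ to the next city $j$, is expressed as follows:
$$p_{ij}^{k} = \begin{cases} \dfrac{[\tau_{ij}]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{s \in \mathrm{allowed}_k} [\tau_{is}]^{\alpha}\,[\eta_{is}]^{\beta}}, & j \in \mathrm{allowed}_k, \\ 0, & \text{otherwise}, \end{cases}$$
where $\mathrm{allowed}_k$ denotes the set of cities that ant $k$ is allowed to choose next; $\tau_{ij}$ is the pheromone concentration on the path between cities $i$ and $j$; $\alpha$ is the information heuristic factor, and the higher its value, the more the ants tend to choose paths with more pheromone; $\beta$ is the expectation heuristic factor, which denotes the importance of heuristic information in the path chosen by the ants; and $\eta_{ij}$ is the heuristic function, which denotes the expectation level of ants moving from city $i$ to city $j$.
The pheromone on the paths needs to be updated after the ants have visited all the cities, as follows:
$$\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij} + \Delta\tau_{ij},$$
where $\tau_{ij}$ is the pheromone on the path between cities $i$ and $j$, $\rho$ denotes the pheromone volatilization coefficient, which generally takes a value in the range (0, 1), and $\Delta\tau_{ij}$ represents the increment of pheromone on that path during the current cycle.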
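The five steps, the transfer probability, and the pheromone update can be put together in a short sketch of a basic ant colony algorithm for a small TSP. It is a minimal illustration, not the implementation used in this study: the parameter defaults, the heuristic function 1/distance, the deposit of `q / length` per ant, and the four-city test instance are all assumptions.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour, including the return to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def ant_colony_tsp(coords, n_ants=10, n_iter=20, alpha=1.0, beta=2.0,
                   rho=0.5, q=1.0, seed=0):
    """Basic ant colony optimization for a small symmetric TSP."""
    rng = random.Random(seed)
    n = len(coords)
    dist = [[math.dist(a, b) for b in coords] for a in coords]
    tau = [[1.0] * n for _ in range(n)]          # Step 1: initial pheromone
    best_tour, best_len = None, float("inf")

    for _ in range(n_iter):                      # Step 4: iteration limit
        tours = []
        for _ in range(n_ants):                  # Step 2: one tour per ant
            tour = [rng.randrange(n)]
            while len(tour) < n:
                i = tour[-1]
                allowed = [j for j in range(n) if j not in tour]
                # Transfer probability is proportional to
                # tau^alpha * eta^beta, with eta = 1 / distance.
                weights = [tau[i][j] ** alpha * (1.0 / dist[i][j]) ** beta
                           for j in allowed]
                tour.append(rng.choices(allowed, weights=weights)[0])
            tours.append(tour)

        # Step 3: evaporate, then deposit q / length on every edge used.
        tau = [[(1.0 - rho) * t for t in row] for row in tau]
        for tour in tours:
            length = tour_length(tour, dist)
            if length < best_len:
                best_tour, best_len = tour, length
            for k in range(n):
                a, b = tour[k], tour[(k + 1) % n]
                tau[a][b] += q / length
                tau[b][a] += q / length

    return best_tour, best_len                   # Step 5: output


# Four cities on the corners of a unit square; the shortest closed tour
# follows the perimeter.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
best_tour, best_len = ant_colony_tsp(square)
```

Sampling the next city with `random.choices` and weights proportional to τ^α·η^β is equivalent to normalizing by the sum over the allowed cities, so the denominator of the transfer probability never has to be computed explicitly.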
4.3.3. Improved Ant Colony Algorithm
For the improvement strategy of the ant colony algorithm, this study will improve both the transfer probability and the pheromone update of the ant colony algorithm, taking into account the data characteristics.
(1) Near-Neighbor Strategy. In an ant colony algorithm, ants generally choose as the next road one with a high state transfer probability. However, when there are many roads, the chosen road may not be close to the current road, and computing the transfer probabilities for all roads is very slow. This study therefore designs a nearest-neighbor node selection rule that calculates the transfer probability only between roads that satisfy the rule, which improves the convergence speed of the algorithm and accommodates the fact that the selectable roads in the dynamic TSP change with time.
The nearest-neighbor node selection rule is as follows: from the initial point to the final point, the roads are numbered and ordered to form a matrix, and the distance between every pair of roads in the matrix is calculated. When many roads are available in the state transfer process, only a fixed number of the nearest roads are considered, which supports dynamic selection and reduces the cost of calculating state transfer probabilities.
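The selection rule can be sketched as a precomputed neighbor list that restricts which roads enter the transfer-probability calculation. The function names, the list size `k`, and the distance matrix are illustrative assumptions.

```python
def nearest_neighbors(dist, k):
    """For every road i, list the k closest other roads, nearest first."""
    n = len(dist)
    return [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]


def candidate_roads(current, visited, neighbors):
    """Roads eligible for the transfer-probability calculation.

    Only unvisited roads from the neighbor list of the current road are kept.
    """
    return [j for j in neighbors[current] if j not in visited]


# Hypothetical symmetric distance matrix for four roads.
dist = [
    [0.0, 2.0, 9.0, 4.0],
    [2.0, 0.0, 3.0, 7.0],
    [9.0, 3.0, 0.0, 1.0],
    [4.0, 7.0, 1.0, 0.0],
]
neighbors = nearest_neighbors(dist, k=2)
candidates = candidate_roads(current=0, visited={0, 1}, neighbors=neighbors)
```

With `k` neighbors per road, each transfer step evaluates at most `k` probabilities instead of one per unvisited road. A complete implementation would also need a fallback, such as considering all unvisited roads, for the case in which every listed neighbor has already been visited.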
(2) Pheromone Update Strategy. In this study, the pheromone update of the ant colony algorithm is improved by including time and distance in the pheromone update criterion. The train speed on a given road in a given time period, obtained by the machine learning method described above, is used as the update condition of the pheromone, and from it the actual time the ants spend on that road is obtained. The pheromone is therefore updated dynamically over time. The pheromone update rule is shown as follows:where . , represents the distance of the road and the time of passing the road, respectively. refers to their respective weights in the heuristic function.
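How predicted speeds can yield the elapsed time on a road, and how distance and time might be weighted in a pheromone increment, is sketched below. The elapsed-time calculation follows directly from hourly speeds; the increment rule, including its weights `w_dist` and `w_time` and the constant `q`, is an assumed form and not the update rule of this study.

```python
def elapsed_time_h(length_km, depart_h, speed_by_hour):
    """Travel time on one road when the speed changes every hour.

    speed_by_hour[h] is the predicted speed (km/h) during hour h of the day.
    """
    remaining, clock = length_km, depart_h
    while remaining > 1e-12:
        speed = speed_by_hour[int(clock) % 24]
        if speed <= 0:
            raise ValueError("predicted speed must be positive")
        to_next_hour = int(clock) + 1 - clock   # hours left in this period
        reachable = speed * to_next_hour        # km coverable in that time
        if reachable >= remaining:
            clock += remaining / speed
            remaining = 0.0
        else:
            clock = float(int(clock) + 1)
            remaining -= reachable
    return clock - depart_h


def pheromone_increment(length_km, time_h, w_dist=0.5, w_time=0.5, q=1.0):
    """Assumed deposit rule: more pheromone for short and fast roads."""
    return q / (w_dist * length_km + w_time * time_h)


# Hypothetical forecast: 60 km/h all day except 30 km/h from 09:00 to 10:00.
forecast = [60.0] * 24
forecast[9] = 30.0
t = elapsed_time_h(length_km=45.0, depart_h=8.5, speed_by_hour=forecast)
delta_tau = pheromone_increment(45.0, t)
```

In this example the first 30 km are covered before 09:00 at 60 km/h and the last 15 km at 30 km/h, so the same road takes longer than it would at a constant 60 km/h; this is what makes the update time dependent.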
4.3.4. Experimental Results
For the application of the improved ant colony algorithm, to test the optimization results of the model, some of the roads are selected for validation in this study in conjunction with the information present in the road network. The road network is abstracted into a simplified structural diagram, as shown in Figure 9. The direction of the network consists of three sequentially adjacent stations, , , and . The direction consists of three sequentially adjacent stations, , , and .

The convergence curves of the two algorithms are shown in Figures 10 and 11, respectively. The traditional ant colony algorithm stabilizes at around 300 iterations, while the improved ant colony algorithm stabilizes at around 100 iterations; i.e., the improved algorithm needs only about 33% of the iterations of the traditional algorithm (100/300), a reduction of about 67%.


Remark 1. The experiments in this section compare two ant colony algorithms, the basic ant colony algorithm and the improved ant colony algorithm, which require different numbers of iterations to converge.
5. Conclusion
With the rapid development of China’s high-speed railway, the road network is constantly improving and the demand for railway logistics is increasing, and cross-line operation plays an important role in the logistics transport organization of China’s high-speed railway. The complexity of the network's line structure means that the paths between stations are no longer unique, and the choice of train running paths has diversified, so this study has both research value and practical significance for optimizing cross-line train running paths. This study investigates route optimization on the basis of railway traffic speed prediction. First, railway travel paths are matched to the road network to calculate the speed on each road at different times of the day. Then, based on an analysis of the influencing factors, a machine learning speed prediction model for railway logistics is constructed, and an improved ant colony algorithm is designed to link the speed prediction results to route optimization.
Data Availability
The datasets used during this study are available from the corresponding author on reasonable request.
Conflicts of Interest
The author declares that he has no conflicts of interest.