Abstract

Accurate forecasting of subway passenger flows is essential for developing efficient train schedules. However, unexpected faults in trains or the power supply can constrain transport capacity and cause station congestion, which endangers passenger safety. Predicting passenger flows at the time of a fault is particularly challenging because failures are rare and the factors involved are complex. In addition, a point prediction of passenger flow may deviate from the observed value, which reduces the efficiency of passenger flow control measures. To address these concerns, a three-stage A-LSTM prediction model that combines an attention mechanism with a double-layer LSTM (Long Short-Term Memory) neural network is proposed. The model maps the impact of fault events on subway transport capacity and delays onto the inbound passenger flow. Using data from the subway system of a metropolitan city in China, the model predicts the range of passenger flow fluctuations at 10-minute intervals and is applied to different subway stations.

1. Introduction

Subway systems around the world are expanding, and they mainly serve the densely populated areas of the central city. Subway operators have developed fixed strategies and mechanisms to predict passenger flow and manage large numbers of passengers. However, sudden events can cause an instantaneous change in inbound passenger flow at a single station, which can have a ripple effect throughout the entire network. If flow control measures are not applied properly, serious accidents can result. In 2021, the subway system of City S, a metropolitan city in Eastern China, which has the longest operating mileage in the world, recorded 2,384 incidents, 78 of which directly delayed trains, with a maximum delay of 106 minutes. Therefore, it is necessary to accurately predict the trend and magnitude of passenger flow changes during failure events. This can help subway operation managers select appropriate flow control measures. The forecast results should be detailed and show the fluctuation of passenger flow at intervals as short as 10 minutes. In recent years, the development of artificial intelligence algorithms has made subway passenger flow prediction more accurate and faster.

When it comes to predicting passenger flow in subways, most research has focused on normal scenarios, resulting in high accuracy. However, predicting passenger flow under abnormal conditions (such as bad weather, holidays, or large-scale events) can prove more challenging. The combination of these factors results in lower passenger flow prediction accuracy. Moreover, sudden failures that significantly impact the subway system are rare and occur randomly, making it difficult to accumulate sufficient data to train models and make accurate predictions for specific stations or sections. This is one of the primary reasons why there are few studies on predicting passenger flow under failure conditions.

Failures in the subway system have a significant impact on train transport capacity, leading to delays. Existing passenger flow forecasting research has not fully analyzed the impact of failures on train transport capacity, so the fluctuation patterns of inbound passenger flow during different types of failure events cannot yet be predicted accurately. Current research methods primarily focus on predicting the exact number of passengers in the future without considering the possible range of fluctuation in passenger flow. However, predicting the fluctuation range is more useful than predicting a single number because it accounts for extreme situations. Inbound passenger flow prediction accuracy under failure conditions is currently low, and little emphasis has been placed on predicting the fluctuation range. This paper aims to explore a passenger flow forecasting method for failure events, assisting station managers in taking passenger flow control and evacuation measures in advance.

The research focuses on accurately predicting short-term passenger flow fluctuations during unexpected failures in urban rail transit stations. Key innovations include the following:
(1) Introducing features of train transport capacities and delays during failure events into the prediction model, improving accuracy.
(2) Developing a three-stage LSTM prediction model with an attention mechanism, enhancing short-term passenger flow forecasting during failure events.
(3) Developing an A-LSTM model that has been verified with high accuracy and can be applied to real-world passenger flow control in subway stations.

This paper is divided into several sections. Firstly, we summarize the previous research related to this field. Next, we analyze the practical problems of this research and provide a detailed description. Then, we propose an A-LSTM model that utilizes 30-day rolling update data to predict inbound passenger flow. We verify the effectiveness of this method using real events and passenger flow data from the City S subway. In the sixth part, we apply the A-LSTM model to predict short-term inbound passenger flow in normal and sudden failure scenarios. Finally, we conclude with our findings and recommendations.

Currently, most research on short-term passenger flow prediction for subway stations focuses on normal scenarios. The methods used are predominantly based on neural network algorithms and signal mode decomposition, with few studies conducted on failure events. This paper provides an overview of research methods for short-term passenger flow prediction at subway stations under normal conditions and highlights research on passenger flow prediction under failure conditions.

2.1. Passenger Flow Prediction under Normal Conditions

The short-term prediction of passenger flow at subway stations is heavily influenced by the station’s location, surroundings, and passenger flow in the previous period, making it suitable for the time series method. In recent years, different neural network algorithms have been used to predict short-term passenger flow in subway stations under normal operation. For example, Liu et al. [1] presented an end-to-end deep learning architecture, termed Deep Passenger Flow (DeepPF), to forecast the subway inbound/outbound passenger flow. They combined recurrent neural networks and long short-term memory to model external environmental factors, temporal dependencies, spatial characteristics, and subway operational properties. The architecture is layer-specific, meaning that different layers can use different neural network models. Fu et al. [2] incorporated external information. They presented a neural network model for 20-minute-ahead prediction of subway passenger flow based on multiple sources of data, including smart card data, mobile phone data, and subway network data. The proposed neural network structure includes fully connected layers and long short-term memory layers. Ma et al. [3] proposed a prediction model with a parallel architecture of a convolutional neural network and a bidirectional LSTM network. The spatial characteristics of passenger flow are extracted by the convolutional neural network, and the temporal features of passenger flow are extracted by the bidirectional LSTM network. The model predicts the passenger flow in the next 10 minutes based on the historical passenger flow data of the past 40 minutes. Hao et al. [4] embedded a sequence-to-sequence model with the attention mechanism to predict 15–60 minutes of patronage. The attention mechanism is applied to take external features such as time of day and day of week into consideration. This approach has improved the prediction accuracy. In addition, a variety of neural network architectures have been extensively employed in passenger flow prediction. For instance, Liu et al. [5] utilized a recurrent neural network, Gong et al. [6] implemented an online latent space strategy, Yang et al. [7] devised an attention-based neural network, Lin and Tian [8] introduced a model combining random forest and long short-term memory networks, Wang et al. [9] and Yang et al. [10] proposed a dynamic spatial-temporal hypergraph neural network, Huang et al. [11] employed a backpropagation neural network, Xiu et al. [12] developed a multidisturbance spatial-temporal causal convolution network, and Wang et al. [13] introduced the Multitask Hypergraph Convolutional Neural Network.

Simulating passenger flow distribution using signal waves is a valuable technique for managing fluctuations in short-term passenger activity. Signal mode decomposition, in conjunction with neural network algorithms, is another effective approach for forecasting short-term passenger flow and overcoming the difficulties associated with accurately predicting passenger flow volatility. Wei and Chen [14] developed a hybrid EMD-BPN forecasting approach for 15-minute intervals, which combines empirical mode decomposition (EMD) and backpropagation neural networks (BPN) to forecast the short-term patronage in subway systems under normal operating conditions. Xiu et al. [15] designed a three-stage framework to eliminate noise and enhance 15-minute interval patronage prediction. Firstly, in the preprocessing stage, the Ensemble Empirical Mode Decomposition (EEMD) algorithm adaptively decomposes the nonlinear and nonstationary passenger flow signal into several subsignals. Secondly, in the feature recognition and extraction stage, knowledge of the transportation field and statistical theories are applied to analyze and extract the critical decomposed components. Thirdly, in the prediction stage, a stacked Bidirectional Gated Recurrent Unit (BiGRU) is proposed to learn and extract information from the input features in both directions and to output the final prediction result through multistep prediction. Wei et al. [16], Zhao et al. [17], Wei and Chen [14], and Xiu et al. [12] have all applied this approach to short-term patronage prediction.

Besides these methodologies related to neural networks and signal mode decomposition, there are other algorithms such as the fuzzy logic method [18], Physical-Virtual Collaboration Modeling [19], time series analysis [20], automatic detection algorithm [21], improved gravity model [22], support vector machine model [23], and self-organizing data mining [24].

2.2. Passenger Flow Prediction under Abnormal Conditions

When it comes to forecasting subway passenger flow, neural networks are highly versatile and effective, while signal mode decomposition is particularly precise in specific scenarios. Recent studies have presented multilayer deep learning frameworks that consider both internal and external factors to elevate prediction accuracy and minimize time intervals. Based on current research, most approaches can anticipate passenger flow at 15-minute intervals under regular subway operations, with select models capable of predicting 10 minutes ahead. However, there is limited research on short-term forecasts of passenger flow during subway failures. Existing methods in this field typically involve inputting social media data [25, 26] into their models to predict passenger flow during special events, such as sports games and concerts. Xue et al. [25] applied social media data with smart card data to a multivariate disturbance-based hybrid deep neural network. Using event information extraction and oversampling, Zhao [27] applied the SMOTE (Synthetic Minority Oversampling Technique) algorithm to increase the passenger flow sample data of sudden failure scenes and then predicted the passenger flow through a time series model. This method can predict the passenger flow of subway stations under normal conditions and emergencies at 15-minute intervals. However, the impact of emergencies on train transport capacity is not considered, and the structure of the time series model is relatively simple, which limits further improvement of the prediction accuracy and shortening of the prediction interval.

Past research has been inadequate in assessing the impact of failure events on train transport capacity and delays, as well as short-term variations in inbound passenger flow. Therefore, it is crucial for subway operators to develop an accurate forecast model for station inbound passenger flow during various failure events, taking into consideration the dynamic nature of the factors that contribute to these occurrences.

3. Description of the Problem

When a sudden subway failure occurs, it goes through four stages: the fault event, passenger congestion, emergency response, and recovery. Different types of failures affect transport capacity, delay times, and inbound passenger flow differently. The complexity of the cause leads to a rapid fluctuation in passenger flow. To predict the inbound passenger flow during a failure event, we need to quantify the impact on transport capacity and resulting delays and consider temporal and location parameters. An algorithm can be proposed to map these features to predict future inbound passenger flow.

The aim of this study is to estimate the range of inbound passenger flow at future time points using past data on inbound passengers and related factors. The prediction maps the historical inbound passenger flow and the input parameters to the inbound passenger flow range in the next interval, with a prediction time interval of 10 minutes. The historical data on inbound passengers are obtained from the subway station’s inbound gates. The input parameters are date (1–31), time (5–24, excluding nonoperating hours), peak hours (1 for yes, 0 for no), weekends or statutory holidays (1 for yes, 0 for no), the number of connecting lines at the station n, and the maximum and minimum values of transport capacities and delays for certain failure events.

This study assumes that all subway stations can handle any amount of passenger flow, regardless of their capacity. The study takes into consideration various input parameters such as temporal features (time, week, peak hours, and weekends), as well as fixed features of each station, such as whether it is a transfer station. Additionally, the study analyzes historical data to determine the transport capacities and delays under failure conditions. When a failure occurs, the corresponding failure type is manually selected and used as an input parameter. The output of the study is a matrix that indicates the upper and lower bounds of passenger flow fluctuations in the next 10-minute intervals. The study aims to provide subway operation managers with a reference for implementing passenger flow control strategies.
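As a rough illustration of how the input parameters listed above could be assembled for one 10-minute interval, the following Python sketch builds a feature list. The names `FailureImpact`, `PEAK_HOURS`, and `build_features`, the chosen peak hours, and the units are assumptions made for this example and do not come from the paper.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

# Assumed peak hours; the paper does not list the hours it treats as peak.
PEAK_HOURS = {7, 8, 17, 18}


@dataclass(frozen=True)
class FailureImpact:
    """Hypothetical container for the extremes tied to a failure type.

    All four values stay at 0 under normal conditions.
    """
    capacity_max: float = 0.0
    capacity_min: float = 0.0
    delay_max: float = 0.0
    delay_min: float = 0.0


def build_features(ts: datetime, is_holiday: bool, n_lines: int,
                   impact: Optional[FailureImpact] = None) -> List[float]:
    """Return date, hour, peak flag, rest-day flag, line count, and impact extremes."""
    if impact is None:
        impact = FailureImpact()
    is_peak = 1 if ts.hour in PEAK_HOURS else 0
    # weekday() is 5 or 6 on Saturday and Sunday
    is_rest_day = 1 if (ts.weekday() >= 5 or is_holiday) else 0
    return [ts.day, ts.hour, is_peak, is_rest_day, n_lines,
            impact.capacity_max, impact.capacity_min,
            impact.delay_max, impact.delay_min]
```

Keeping the four impact values at zero for normal operation means the same feature layout can serve both normal and failure scenarios.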

4. A-LSTM Prediction Model

4.1. Model Description

Passenger flow data in a subway system are sequential, evolving over time. LSTMs, designed for effective sequence pattern learning, overcome the vanishing gradient problem of traditional RNNs. Their memory cell architecture enables selective information retention, crucial for modeling subway systems where historical data impact future passenger flow. LSTMs adapt to time lags and irregularities during delays, making them robust for dynamic subway system modeling. They automatically learn features, reducing the need for manual engineering in complex datasets. LSTMs’ flexibility handles various input types, like historical data, train capacities, and delays, enabling a comprehensive modeling approach. Their good generalization capabilities make them valuable for predicting passenger flow under failure conditions. To address this problem, LSTMs offer a powerful solution.

In this paper, we only take temporary failures (breakdowns lasting several hours) into consideration. When forecasting passenger flow in a subway system during unexpected failures, traditional LSTM models tend to overlook the influence of long-term factors, such as the impact on passenger flow of failures that took place weeks or months ago. These factors are challenging to incorporate into future predictions. In subway systems, serious failure events occur infrequently, have multiple causes, and lead to various outcomes. Therefore, the LSTM time series model alone is unable to forecast the inbound passenger flow during such events. This research proposes a solution by merging the attention mechanism with the LSTM model. The attention mechanism, which finds widespread use in mechanical fault diagnosis, saliency detection, crowd counting, and facial expression recognition, elevates the relevance of significant features in passenger flow prediction. This renders the model more responsive to changes in critical elements. This paper introduces an LSTM model with a combined attention mechanism, known as the A-LSTM model, which raises the efficiency and precision of inbound passenger flow prediction.

To construct the A-LSTM model, we follow a three-step process. Firstly, we carefully select the optimal depth of the LSTM network to ensure high accuracy and short prediction intervals for short-term inbound passenger flow prediction under normal conditions.

Secondly, we implement the attention mechanism with a control window to adjust the internal structure of the LSTM and reweight the training parameters. This enables us to accurately predict short-term inbound passenger flow in sudden failure events.

Lastly, to apply this model to inbound passenger flow prediction at different stations, we use the N-day rolling update of inbound passenger flow data of the target station as the training parameter. This involves training the A-LSTM model with data from the previous N days and using it to predict short-term passenger flow in both normal and failure scenarios.

4.2. Stage 1: Double-Layer LSTM Network

For short-term passenger flow prediction, we conduct the performance analysis on the database of the City S subway system to decide the structure and hyperparameters of the double-layer LSTM network, which will be explained in Section 5.3. In this study, a double-layer LSTM network was selected, with the state activation function being tanh and the activation function of the control gate being sigmoid. Each LSTM layer is followed by a dropout layer, and finally, the fully connected layer and the regression output layer are connected.

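This study uses the type of the failure events, inbound passenger flow, time, date, peak hours, weekends, and the number of transfer lines in the previous N days as input variables $x^{(t)}$, where $t$ is the time point. The neuron unit in the LSTM layer, as shown in Figure 1, will be used to train the input variables $x^{(t)}$.

$h_i^{(t)}$ and $s_i^{(t)}$ are the values of the output unit and the status unit at time $t$, respectively, at the $i$th LSTM unit; $\tilde{c}_i^{(t)}$ is the memory cell; $f_i^{(t)}$, $u_i^{(t)}$, and $o_i^{(t)}$ denote the forget gate, update gate, and output gate, respectively. The memory cell has the self-loop function in the LSTM network, so as to realize the iteration within the network. The forget gate determines the self-loop weight of the status unit through the sigmoid function $\sigma$:

$$f_i^{(t)} = \sigma\Big(b_i^{f} + \sum_j U_{i,j}^{f} x_j^{(t)} + \sum_j W_{i,j}^{f} h_j^{(t-1)}\Big), \tag{1}$$

where $U^{f}$, $W^{f}$, and $b^{f}$ are the input weights, recurrent weights, and biases for the forget gate unit. The sigmoid function is defined by

$$\sigma(z) = \frac{1}{1 + e^{-z}}. \tag{2}$$

The memory cell $\tilde{c}_i^{(t)}$, the update gate $u_i^{(t)}$ (adding features to the state unit), and the output gate $o_i^{(t)}$ (determining whether to shut off the output) are computed in a similar way to the forget gate:

$$\begin{aligned}
\tilde{c}_i^{(t)} &= \tanh\Big(b_i^{c} + \sum_j U_{i,j}^{c} x_j^{(t)} + \sum_j W_{i,j}^{c} h_j^{(t-1)}\Big),\\
u_i^{(t)} &= \sigma\Big(b_i^{u} + \sum_j U_{i,j}^{u} x_j^{(t)} + \sum_j W_{i,j}^{u} h_j^{(t-1)}\Big),\\
o_i^{(t)} &= \sigma\Big(b_i^{o} + \sum_j U_{i,j}^{o} x_j^{(t)} + \sum_j W_{i,j}^{o} h_j^{(t-1)}\Big).
\end{aligned} \tag{3}$$

Then the internal state of the LSTM memory cell is updated as follows:

$$s_i^{(t)} = f_i^{(t)} s_i^{(t-1)} + u_i^{(t)} \tilde{c}_i^{(t)}. \tag{4}$$

After calculating the internal state $s_i^{(t)}$, the output $h_i^{(t)}$ can be computed by

$$h_i^{(t)} = o_i^{(t)} \tanh\big(s_i^{(t)}\big), \tag{5}$$

where the tanh function is defined as

$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}. \tag{6}$$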
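The gate computations described in this section can be traced numerically with a single-unit LSTM step. The sketch below is a generic textbook LSTM cell written in plain Python, with sigmoid gates and a tanh state activation as stated in Section 4.2. The function names and the parameter layout (`p` mapping a gate name to an input weight, a recurrent weight, and a bias) are assumptions for illustration, not the authors' MATLAB implementation.

```python
import math


def sigmoid(z):
    """Logistic function used by the forget, update, and output gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, s_prev, p):
    """One time step of a single-unit LSTM with scalar input x.

    p maps each name in {'f', 'u', 'o', 'c'} to a tuple
    (input weight, recurrent weight, bias).
    Returns the new output h and the new internal state s.
    """
    def pre(name):
        w_in, w_rec, b = p[name]
        return w_in * x + w_rec * h_prev + b

    f = sigmoid(pre('f'))            # forget gate: how much old state to keep
    u = sigmoid(pre('u'))            # update gate: how much new content to add
    o = sigmoid(pre('o'))            # output gate: how much state to expose
    c_tilde = math.tanh(pre('c'))    # candidate memory content
    s = f * s_prev + u * c_tilde     # internal state update
    h = o * math.tanh(s)             # unit output
    return h, s
```

With all weights and biases at zero, every gate equals 0.5 and the candidate is 0, so the state is simply halved at each step, which is a quick sanity check for the update rule.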

4.3. Stage 2: Attention Mechanism

In order to enhance the precision of predicting inbound passenger flow during unexpected disruptions, this research incorporates an attention mechanism in a double-layer LSTM model. Given that the number of samples for inbound passenger flow during disruption events is limited compared to regular scenarios, the attention mechanism reinforces the model’s retention of these events. The attention mechanism is positioned in the center of the double-layer LSTM model, accompanied by a control gate integrated into the module. Upon the occurrence of a disruption event, the control gate is triggered until the effects have completely subsided. The attention mechanism fine-tunes the LSTM network parameters’ weights to ensure accurate forecasting of inbound passenger flow during the disruption phase.

In machine learning, the attention mechanism assigns different weights to input parameters to highlight their effectiveness. This study involves incorporating the attention mechanism into a double-layer LSTM model. Once activated, the attention module revises the weights of the double-layer LSTM network. The new weight $\alpha_i^{(t)}$ assigned to the $i$th parameter at time $t$, taking into account the influence features of the sudden failure event, can be expressed as

$$\alpha_i^{(t)} = \frac{\exp\big(e_i^{(t)}\big)}{\sum_{k=1}^{T} \exp\big(e_i^{(k)}\big)}, \tag{7}$$

where $e_i^{(t)}$ is the intermediate energy term between the parameter to be encoded and other parameters in the output vector of the training network at time $t$, which is calculated by function (8); $T$ is the total length of the time series of the input variables.

$$e_i^{(t)} = v_i \tanh\Big(\sum_j W_{i,j} h_j^{(t)} + b_i\Big), \tag{8}$$

where $h^{(t)}$ is the hidden layer output value of the LSTM model at time $t$ and $v$, $W$, and $b$ are learnable parameters.

$\tilde{h}_i^{(t)}$ is the output value of the double-layer LSTM model based on the attention mechanism at time $t$ from the $i$th LSTM memory unit, which is the output of the reweighted LSTM network, and also the input value at time $t+1$:

$$\tilde{h}_i^{(t)} = \alpha_i^{(t)} h_i^{(t)}, \tag{9}$$

where $h_i^{(t)}$ is the default output of the LSTM network unit.
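To make the reweighting step concrete, the sketch below applies a softmax over energy terms to a short sequence of scalar hidden outputs and passes the outputs through unchanged when the control gate is closed. It uses a common additive-attention form, which is an assumption here. The function name `attention_reweight` and the scalar parameters `v`, `w`, and `b` are hypothetical simplifications of the learnable parameters mentioned above.

```python
import math


def attention_reweight(hidden, v, w, b, gate_open):
    """Reweight scalar LSTM outputs h_1..h_T with softmax attention.

    Returns (weights, reweighted_outputs). When gate_open is False the
    control gate is closed and the outputs are returned unchanged.
    """
    if not gate_open:
        return [1.0] * len(hidden), list(hidden)
    # Energy term for each time step
    energies = [v * math.tanh(w * h + b) for h in hidden]
    # Subtract the maximum before exponentiating for numerical stability
    m = max(energies)
    exps = [math.exp(e - m) for e in energies]
    total = sum(exps)
    weights = [e / total for e in exps]
    return weights, [a * h for a, h in zip(weights, hidden)]
```

The weights always sum to 1, so steps with larger energy terms gain influence only at the expense of the others.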

The attention mechanism is trained separately with the dataset under failure events, including reduced transport capacity, delay time, date, time, number of lines in the station, and inbound passenger flow. Combined with the trained attention mechanism, the parameters of the double-layer LSTM model can be reweighted under failure events. Therefore, the A-LSTM saves training time and enables accurate short-term inbound passenger flow prediction during such events.

4.4. Stage 3: N-Day Rolling Update

For precise predictions across all subway stations, we can train the A-LSTM model for each station individually. This involves feeding the model with parameters specific to the station from the previous N days, allowing a double-layer LSTM network to create a short-term inbound passenger flow prediction system. Regular training of the attention mechanism module with standardized failure events from the subway network is necessary to enhance its ability to adjust parameter weights during failure situations. As the normal inbound passenger flow for different subway stations is relatively stable and periodic, it is crucial to determine the appropriate number of days for a training data collection cycle and use updated rolling data for short-term inbound passenger flow prediction.
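The rolling update amounts to selecting, for each prediction day, only the records from the preceding N days. A minimal sketch follows, assuming records are stored as `(day, values)` pairs. The function name `rolling_window` and this record layout are hypothetical.

```python
from datetime import date, timedelta


def rolling_window(records, target_day, n_days=30):
    """Keep records from the n_days before target_day (target day excluded).

    records: iterable of (day, values) pairs, where day is a datetime.date.
    """
    start = target_day - timedelta(days=n_days)
    return [(d, v) for d, v in records if start <= d < target_day]
```

For a prediction on November 12, 2021 with N = 30, the window covers October 13 through November 11.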

The overall A-LSTM model structure is shown in Figure 2.

5. Experiment and Simulation

5.1. Data

This research paper analyzes data from the world’s largest subway system, the City S subway system. The study specifically focuses on the inbound passenger flow and failure events data of the system and assesses the effectiveness of the A-LSTM model. The data were collected over a span of two years (2019–2021), with the inbound passenger flow data gathered every 10 minutes through the Automatic Fare Collection (AFC) system. Meanwhile, the delays caused by failure events were obtained from the official statistical reports of the subway. Ultimately, the research is able to predict inbound passenger flow at a 10-minute interval, a method in line with recent studies that aim to predict short-term passenger flow.
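The 10-minute inbound flow series can be thought of as gate-entry timestamps counted per interval. The sketch below shows one way to do this aggregation. The input format (a list of `datetime` entry times) and the name `bin_entries` are assumptions, since the paper does not describe the raw AFC record layout.

```python
from collections import Counter
from datetime import datetime


def bin_entries(timestamps, minutes=10):
    """Count gate entries per interval; keys are interval start times.

    Assumes `minutes` divides 60 so that intervals align with the hour.
    """
    counts = Counter()
    for ts in timestamps:
        start = ts.replace(minute=ts.minute - ts.minute % minutes,
                           second=0, microsecond=0)
        counts[start] += 1
    return dict(sorted(counts.items()))
```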

5.2. The Influencing Features of Failure Events on Transport Capacity and Delay Duration

During the period of 2019 to 2021, City S’s subway system experienced a total of 378 incidents that impacted its transport capacity. Statistical features of these incidents and their effects on transport capacity and delays are displayed in Table 1. These features serve a dual purpose. Firstly, they are utilized as parameters to train the attention mechanism module. The historical dataset of these incidents is used to train the module, which then adjusts the parameter weights of the LSTM network. Secondly, when estimating the inbound passenger flow range during failure events, the corresponding maximum and minimum values that impact transport capacity and delays are selected as input parameters based on the failure event types.

5.3. Performance Analysis of the LSTM Structure

The performance analysis is conducted on the database of the City S subway system to decide the structure of the LSTM network. The results show (in Table 2) that the single-layer LSTM model lacks the ability to fully comprehend the features of the time series, which results in prediction errors. Using an LSTM model with more than three layers can lead to errors and longer calculation time due to overfitting. Therefore, a double-layer LSTM model is applied in this study.

The parameter settings for the LSTM network in this paper were determined based on the best performance achieved through a grid search. The learning rate ranges from 0.001 to 0.01, the number of epochs ranges from 100 to 300, the candidate numbers of hidden neurons are 128, 256, and 512, the candidate batch sizes are 32, 64, and 128, and the dropout rate varies from 0 to 1. The optimal parameter configuration determined through this search consists of a learning rate of 0.001, 250 epochs, 128 hidden neurons, a batch size of 32, and a dropout rate of 0.2.
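A grid search of this kind enumerates every combination of candidate values and keeps the one with the best validation score. The sketch below shows the enumeration only. The sample points for learning rate, epochs, and dropout are assumptions, because the paper gives ranges rather than the exact values tried, and `score` is a hypothetical callable that would train a model and return its validation error.

```python
from itertools import product

GRID = {
    "learning_rate": [0.001, 0.005, 0.01],  # assumed sample points in 0.001 to 0.01
    "epochs": [100, 150, 200, 250, 300],    # assumed step of 50
    "hidden": [128, 256, 512],
    "batch_size": [32, 64, 128],
    "dropout": [0.0, 0.2, 0.5],             # assumed sample points
}


def grid_search(grid, score):
    """Return the configuration with the lowest score (for example, validation RMSE)."""
    keys = list(grid)
    configs = (dict(zip(keys, values)) for values in product(*grid.values()))
    return min(configs, key=score)
```

With these assumed sample points the grid has 3 × 5 × 3 × 3 × 3 = 405 configurations, each of which requires one full training run.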

5.4. Rationality Analysis of 30-Day Rolling Update Data

This paper examines the periodicity and stability of inbound passenger flow in a subway station. It suggests that data collected within a 30-day period can provide an accurate representation of recent inbound passenger flow characteristics. The research aims to predict the inbound passenger flow of a single subway station at a 10-minute interval. To achieve this, the study focuses on the subway stations in City S. Specifically, it selects three different stations, namely, Century Avenue Station, Huaqiao Station, and People’s Square Station. Century Avenue Station is a significant transportation hub located to the east of Huangpu River and has four subway lines. Huaqiao Station, located in the suburbs, is mainly used for commuting and is not a transfer station. People’s Square Station, located in the city center, has three subway lines and is surrounded by famous tourist attractions. The study analyzes inbound passenger flow data at a 10-minute interval from July to November 2021 to determine the rationality of the data. The results are shown in Figure 3.

Based on a rationality analysis, the median passenger flow levels at Century Avenue Station, Huaqiao Station, and People’s Square Station have been found to be stable, with fluctuations occurring in the upper limit. At Century Avenue Station and Huaqiao Station, outliers are distributed relatively evenly, whereas at People’s Square Station, they fluctuate significantly, particularly in July and October. This is due to People’s Square Station being a critical transportation hub for tourism, resulting in a high number of passengers during the summer season and National Day. As a result, using the 30-day rolling update data is sufficient for training the network for predicting inbound passenger flow.
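The stability check described here compares medians and outliers across months. A minimal sketch of such a summary is given below, assuming the standard box-plot rule (values beyond 1.5 times the interquartile range from the quartiles count as outliers). The paper does not state which outlier rule underlies Figure 3, and the name `box_summary` is hypothetical.

```python
from statistics import median, quantiles


def box_summary(flows):
    """Return the median and the outliers under the 1.5 x IQR box-plot rule."""
    q1, _, q3 = quantiles(flows, n=4)  # quartiles of the 10-minute flow values
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return median(flows), [x for x in flows if x < low or x > high]
```

Applying this per month and per station would show whether the median stays flat while the outliers shift, which is the pattern the analysis reports.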

5.5. Model Training

To ensure the proposed A-LSTM model’s reliability, we utilized MATLAB software for modeling. Our model incorporated various independent variables, such as date, time, peak hours, weekends or statutory holidays (represented as 1 for yes and 0 for no), number of connecting lines, maximum and minimum values of transport capacity, and delay under failure events. The dependent variable included the inbound passenger flow data at 10-minute intervals. Prior to being combined with the double-layer LSTM network, the attention mechanism module was trained separately. We standardized the data before entering it into the module, using a dataset of 378 failure events (from 2019 to 2021) to train the attention mechanism module. Three failure events that lasted over 30 minutes were selected as testing samples and removed from the training dataset. Their specific information is displayed in Table 3.
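The paper states that the data were standardized but does not say how. One common choice is z-score standardization fitted on the training data, sketched below as an assumption. The name `fit_standardizer` is hypothetical.

```python
from statistics import mean, pstdev


def fit_standardizer(column):
    """Fit z-score scaling on a training column.

    Returns (to_z, from_z): a function mapping raw values to standardized
    values and its inverse for converting predictions back.
    """
    mu = mean(column)
    sigma = pstdev(column) or 1.0  # constant column: avoid division by zero
    return (lambda x: (x - mu) / sigma), (lambda z: z * sigma + mu)
```

Fitting the scaling on training data only, and reusing it for the held-out failure events, keeps information from the test samples out of the training step.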

In order to predict inbound passenger flow, the A-LSTM model needs to be trained using 10-minute data from inbound passenger flow from the 30 days leading up to the target station’s predicted day. When a sudden failure event occurs, the event type is manually identified, and the relevant extreme values that impact transport capacity and delays are matched. These maximum and minimum values are then used as independent variables in the A-LSTM model, which activates the attention mechanism to predict inbound passenger flow every 10 minutes during the failure event.

6. Results and Analysis

6.1. Normal Conditions

Under normal circumstances, the transport capacity and delay have a maximum and minimum influence of 0, which means that the attention mechanism’s control gate remains inactive. The calculation is conducted independently by the double-layer LSTM network. To verify the results, this study selected two stations as examples.

The first station is Lianhua Road Station in City S, a nontransfer station. The study predicts the inbound passenger flow for November 2021. The network was trained on 88% of the data, and the remaining 12% were used for prediction. The result, shown in Figure 4, was an RMSE (Root Mean Square Error) of 44.2112.

Then, the inbound passenger flow of Guilin Park Station (double-line transfer station) of City S in May 2021 is predicted. 88% of the data are selected to train the network, and the remaining 12% of the data are applied for prediction. As shown in Figure 5, the RMSE = 13.2782.
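For time series, an 88%/12% split is usually taken in chronological order so that the model is evaluated on intervals later than those it was trained on. The sketch below assumes that convention, since the paper does not state how the split was drawn. The name `chronological_split` is hypothetical.

```python
def chronological_split(series, train_fraction=0.88):
    """Split a time-ordered series into (train, test) without shuffling."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]
```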

This study achieves competitive accuracy in predicting inbound passenger flow under normal conditions compared with previous studies. The RMSE for a 15-minute interval in Hao et al. [4] was 21.91 and that for a 10-minute interval in Liu et al. [1] was 65.38. Both are higher than the RMSE of 13.2782 obtained at Guilin Park Station, and the RMSE of 44.2112 obtained at Lianhua Road Station is lower than that of Liu et al. [1].

6.2. Failure Events

The attention mechanism is activated during prediction under failure conditions. We designate the failure that occurred on November 12, 2021, at Lianhua Road Station in City S as the first validation case. This station is a nontransfer station, and the failure occurred during a nonpeak period. At 10:13, multiple trip-outs of the multisection catenary caused a power loss and incomplete operation of the subway line in that section. The starting point for fluctuation predictions is the time when the failure impacts the train operation.

In order to anticipate the number of inbound passengers during failures, data from eight previous instances of similar failures are input into the attention mechanism module. Meanwhile, data from the thirty days prior to the failure event are utilized as training data for the A-LSTM model. When a failure occurs, the maximum and minimum values for transport capacity and delay must be manually selected based on the type of failure event. For example, on November 12, a power supply failure occurred during nonpeak hours on a weekday, and the maximum and minimum values for transport capacity and delay were determined according to this failure type. After inputting these values and other relevant factors into the A-LSTM model, a prediction, shown in Figure 6, for the range of inbound passengers can be made under the given failure condition.

On May 20, 2021, at 9:44 am, a water pipe near the tunnel in Guilin Park Station burst and caused flooding. After 45 minutes, it disrupted train operations. This event tested the A-LSTM model’s ability to predict inbound passenger flow in different types of stations during such disruptive events. Although the flooding initially had no impact, it eventually caused significant disruptions, leading to the modification of train routing for the entire line. In this case, the starting point for fluctuation predictions is 10:30. The prediction results are shown in Figure 7.
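The 10:30 starting point is consistent with the timeline above if the moment of impact (9:44 plus 45 minutes) is rounded up to the next 10-minute boundary. The sketch below works through that arithmetic. The rounding rule is a reading of the text rather than something the paper states, and `next_interval_start` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def next_interval_start(ts, minutes=10):
    """Round ts up to the next interval boundary (unchanged if already on one)."""
    floor = ts.replace(minute=ts.minute - ts.minute % minutes,
                       second=0, microsecond=0)
    return floor if floor == ts else floor + timedelta(minutes=minutes)


burst = datetime(2021, 5, 20, 9, 44)        # pipe burst reported in the text
impact = burst + timedelta(minutes=45)      # train operations disrupted at 10:29
start = next_interval_start(impact)         # first full prediction interval at 10:30
```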

During peak hours on June 26, 2019, at 6:10 am, a turnout positioning fault occurred at City S Railway Station. The fault was caused by a signal and telecommunication malfunction. It led to a 50% failure in the operation of the entire line, and buses were used to transport passengers for a period of time. To assess the impact of this event, the maximum and minimum values of transport capacity and delay were input into the A-LSTM model. The prediction results are shown in Figure 8.

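In this study, the RMSE (Root Mean Square Error) and the MAPE (Mean Absolute Percentage Error) are used to measure the results of the prediction. The calculation formulas are as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},$$

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|,$$

where $y_i$ and $\hat{y}_i$ denote the observed value and the predicted value of inbound subway passenger flow at interval $i$, respectively, and $n$ is the total number of evaluation samples.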
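The two metrics can be sketched in a few lines of Python. The function names `rmse` and `mape` and the sample values are illustrative assumptions rather than the authors' code. Because the MAPE divides by the observed value, the sketch returns `None` when any observed flow is zero instead of raising an error.

```python
import math

def rmse(observed, predicted):
    """Root mean square error over paired samples."""
    n = len(observed)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n)

def mape(observed, predicted):
    """Mean absolute percentage error in percent; None if any observed value is zero."""
    if any(y == 0 for y in observed):
        return None  # undefined: division by a zero observation
    n = len(observed)
    return 100.0 / n * sum(abs((y - p) / y) for y, p in zip(observed, predicted))

# Hypothetical inbound flows for three 10-minute intervals.
observed = [100, 200, 400]
predicted = [110, 190, 380]
error_rmse = rmse(observed, predicted)
error_mape = mape(observed, predicted)
```

For these sample values the squared errors are 100, 100, and 400, so the RMSE is the square root of 200 (about 14.1 passengers), and the percentage errors of 10%, 5%, and 5% average to about 6.7%.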

Table 4 shows that the A-LSTM model can predict changes in inbound passenger flow at various stations during different failure events. The model can also estimate upper and lower bounds for passenger flow fluctuations. Despite the different levels of passenger flow at each station, the deviation of the upper and lower bounds from the actual passenger flow is small, with a difference of less than 23 people. Compared with the benchmarks in Table 5, this indicates high accuracy in forecasting short-term passenger flow at 10-minute intervals.

However, the MAPE cannot be calculated for Lianhua Road Station because no inbound passenger flow was observed: the station carries little traffic, and the failure occurred during off-peak hours. In Figure 7, some actual passenger flow values exceed the maximum predicted values, leading to a high MAPE. There are two reasons for this. First, infrastructure failures are low-probability events for a subway, so the model may not have learned enough to account for them. Second, this particular failure took place during off-peak hours at a station with a low passenger volume, so small absolute errors become large percentage errors and amplify the MAPE. Nonetheless, minor deviations in passenger flow predictions, with only a few results outside the expected range, are unlikely to significantly affect subway operations. Furthermore, the prediction results for City S Railway Station show excellent accuracy in both RMSE and MAPE, making them useful for effective subway station management.
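The amplification effect can be checked with simple arithmetic. In the sketch below, the helper names `percentage_error` and `outside_band` and all numbers are hypothetical and chosen only for illustration. The same absolute error of 5 passengers is evaluated at a low-volume and a high-volume interval, and a second helper lists the intervals in which the observed flow falls outside a predicted band.

```python
def percentage_error(observed, predicted):
    """Absolute percentage error for a single interval (observed must be nonzero)."""
    return 100.0 * abs(observed - predicted) / observed

def outside_band(observed, lower, upper):
    """Indices of intervals where the observed flow falls outside [lower, upper]."""
    return [
        i
        for i, (y, lo, hi) in enumerate(zip(observed, lower, upper))
        if y < lo or y > hi
    ]

# Same absolute error (5 passengers), very different percentage errors.
low_volume = percentage_error(10, 15)     # off-peak, low-volume station
high_volume = percentage_error(500, 505)  # busy station

# Hypothetical band: only the third interval exceeds the predicted maximum.
exceeded = outside_band([8, 12, 20], [5, 5, 5], [10, 15, 18])
```

An error of 5 passengers is 50% of an observed flow of 10 but only 1% of an observed flow of 500, which is why a small station can show a high MAPE alongside a small RMSE.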

6.3. Comparison Experiment

To further validate the accuracy of the A-LSTM model, a comparative experiment was conducted among widely utilized prediction models, including Multilayer Perceptron (MLP), Random Forest (RF), and Convolutional Neural Network (CNN). The data from City S Railway Station during failure events on June 26, 2019, were employed for testing and comparison. All three models were integrated into the A-LSTM framework, replacing the LSTM network, to predict passenger flow under failure conditions. Figure 9 illustrates the performance of each model and reveals that the A-LSTM achieves higher accuracy compared to the others. This evidence emphasizes the efficacy and reliability of the A-LSTM model in improving the precision of short-term passenger flow predictions, particularly in challenging scenarios such as failure events.

7. Conclusions

Combining an attention mechanism and a double-layer LSTM network, the A-LSTM model is used to learn the characteristics of fault conditions and to predict the inbound passenger flow at 10-minute intervals at subway stations. A 30-day rolling dataset of each target station is used as training data, so learning and prediction can be customized for each station. Input variables include failure event type, inbound passenger flow, time of day, date, peak hour, weekend, and number of interchanges. The temporal and spatial characteristics of fault events, as well as their impact characteristics, are considered in the model. In addition, the A-LSTM model accurately predicts the trend and fluctuation range of the short-term inbound passenger flow by constructing the mapping relationship between short-term inbound passenger flow and changes in subway transport capacity, delay, and other factors.
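As a minimal sketch of how the listed input variables might be assembled for one 10-minute interval, the following builds a feature record from a timestamp. The function name `build_features`, the field names, the integer encoding of the failure type, the peak-hour windows, and the sample flow value are all assumptions made for illustration and are not taken from the model's actual preprocessing.

```python
from datetime import datetime

# Hypothetical peak windows as (start_hour, end_hour); not specified in the text.
PEAK_WINDOWS = ((7, 9), (17, 19))

def build_features(ts, inbound_flow, failure_type, interchanges):
    """Assemble one input record for a 10-minute interval starting near `ts`."""
    slot = (ts.hour * 60 + ts.minute) // 10  # 10-minute slot index, 0..143
    is_peak = any(start <= ts.hour < end for start, end in PEAK_WINDOWS)
    return {
        "failure_type": failure_type,          # assumed integer code
        "inbound_flow": inbound_flow,
        "time_slot": slot,
        "day_of_month": ts.day,
        "is_peak": int(is_peak),
        "is_weekend": int(ts.weekday() >= 5),  # Saturday=5, Sunday=6
        "interchanges": interchanges,
    }

# Example: the power supply failure at 10:13 on November 12, 2021 (a Friday),
# at a nontransfer station; the flow value 42 is made up.
features = build_features(datetime(2021, 11, 12, 10, 13), 42, 1, 0)
```

For this timestamp the record falls in slot 61 of the day and is flagged as neither peak nor weekend, matching the description of a weekday nonpeak failure.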

The accuracy of the whole A-LSTM model is verified with the observed data of City S. When the passenger flow is predicted at 10-minute intervals under different fault events in different subway stations, the predicted passenger flow trend is consistent with the observed values. The RMSE of the upper and lower bounds compared with the observed values was less than 25 people, which is highly accurate. Nevertheless, a few actual passenger flow values exceed the maximum predicted values, leading to a high MAPE, which is caused by the small passenger flow scale and the low probability of failure events. The accuracy of the proposed model is verified at stations with large passenger flows.

Future research will focus on the following areas. First, the change in passenger flow across the subway line network under a fault event will be investigated. Second, the influence of complex factors on the patterns of inbound passenger flow changes under fault events will be explored, and correction equations will be established to improve the prediction accuracy. Third, the propagation patterns of each failure type across the subway network will be summarized so that network-wide passenger flow under failures can be predicted. In addition, although the A-LSTM model has achieved high-precision short-term prediction under most fault scenarios, its ability to cope with extreme situations has not been verified because of the lack of data samples. Through intensive training of the attention mechanism, extreme patterns will be generated and the fluctuation range of inbound passenger flow in extreme cases will be predicted.

Nomenclature

:The matrix of maximum and minimum predicted inbound passenger flow
:The inbound passenger flow of previous time steps
:The inbound passenger flow at time
:Date of a month
:Time of a day
:The transport capacity reduced due to the failures at time
:The delay during failures at time
:Input variable of the proposed model at time
:The forget gate at the th LSTM unit
:The update gate at the th LSTM unit
:The output gate at the th LSTM unit
:The memory cell in LSTM
:The input weight for the forget gate unit
:The recurrent weight for the forget gate unit
:The bias for the forget gate unit
:The sigmoid function
:The internal state of the LSTM memory cell
:The new weight assigned to the th parameter at time by the attention mechanism
:The intermediate energy term in the attention mechanism
:The hidden layer output value of the LSTM model at time
:The output value of the double-layer LSTM model based on the attention mechanism at time from the th LSTM memory unit.

Data Availability

The dataset is privately owned by a metro company. The dataset is available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.