Abstract
Parking volume forecasting is an indispensable part of the parking guidance and information system (PGIS), which is an important component of the intelligent transportation system (ITS). Forecasting the parking volume of railway station garages provides information support for garage management and is a great convenience for passengers travelling by car. Railway station garages serve passengers who arrive at or depart from stations by car, and their arrival and departure behaviours directly affect parking volumes. Our analysis shows that different parking behaviours correspond to different parking duration categories. Therefore, passenger behaviour analysis based on parking duration categories and time series similarity measures is introduced into the forecast model, and a novel parking volume forecast model based on the long short-term memory (LSTM) network is proposed. The parking volume data of the public parking garages of Hongqiao Railway Station in Shanghai, China, are used to verify the model. The proposed model enables accurate, real-time prediction of parking volumes divided into different parking duration categories. Compared with the ungrouped data model and the conventional forecast model, the proposed model based on passenger behaviours and the LSTM network achieves better performance and provides more accurate predictions.
1. Introduction
With the increasing availability of large amounts of historical data and the need for accurate forecasts of future traffic volume, transportation forecast research has been drawing more attention. The parking volume forecast of railway station garages is also very important; it not only supports the management decisions of garage agencies but also helps drivers find available parking spaces. However, most studies treated this problem solely as a time series forecast issue and adopted conventional or heuristic methods. The effects of parking behaviour features, which mainly stem from the price categories and railway schedules, were not taken into account. Railway passengers’ behaviours are an important factor in the accuracy of the parking volume forecast of railway station garages.
By analysing passengers’ behaviour features, we found that different parking duration categories have different volume shapes in the sequence diagram, which indicates that different features of passenger behaviours affect the features of parking volume. Therefore, to achieve better accuracy and a more detailed parking volume forecast, it is necessary to study the behaviour features of different passengers, which are potentially affected by graded pricing categories, railway schedules, and other factors.
To forecast future traffic volume, many conventional methods have been deployed, which are summarized as follows. Time series forecasting with linear statistical methods such as ARIMA models dates back to the 1960s [1–3]. Van Der Voort et al. applied the ARIMA model to traffic flow forecasting on urban arterial roads [4]. Other improved approaches such as Kohonen-ARIMA, subset ARIMA, and vector autoregressive ARIMA were also used for short-term traffic forecasting [5, 6]. Dunning [7] used the ARIMA model to predict the number of available parking spaces; the prediction accuracy was very good when the arrival rate of the parking lot was low and dropped as the arrival rate increased. Liu et al. [8] adopted the weighted first-order local area method of chaotic time series to predict the number of available parking spaces near a hospital, and the prediction accuracy gradually decreased as the prediction time span increased. Shi et al. [9] proposed a simple seasonal adjustment approach (SARIMA) for modelling seasonal heteroscedasticity in traffic flow series, with four types of seasonal adjustment factors defined with respect to daily or weekly patterns. Rajabioun et al. [10] took into account the relationship between berth change and the spatial-temporal correlation of parking lots in a certain area and used a multivariate autoregressive model to predict the number of available parking spaces in the area. Zheng et al. [11] focused on the distribution of typical parking arrival and departure patterns; they constructed a parking demand prediction model using the Markov birth and death process and calibrated the model parameters using curve fitting and the method of undetermined coefficients. Wan et al. [12] used the time series method (TSM) and the regression analysis method (RAM) to calculate the common urban curb parking price for future years: the TSM was used to calculate the change in the values of the independent variables in future years, and the RAM was used to estimate the relationship between the independent and dependent variables.
Over the past decades, machine learning models, also called black-box or data-driven models [13], have been applied successfully in many domains, especially forecasting. Heuristic methods have also been adopted for traffic volume forecasting in transportation. These models are nonparametric nonlinear models that use only historical data to learn the stochastic dependency between the past and the future. For instance, Werbos found that Artificial Neural Networks (ANNs) outperform linear regression and Box–Jenkins approaches [14, 15]. Many machine learning methods have been studied, such as nonparametric regression [16], neural network prediction [17], support vector machines (SVM) [18], Kalman filtering [19–21], and combinations of these algorithms [22–26]. The Kalman filter [19–21] is used to predict the traffic state, but it is seldom used to predict parking lot volume. To deal with time series data, Recurrent Neural Networks (RNNs) are considered. The simplest recurrent networks were developed in the 1980s, and the Simple Recurrent Networks introduced by Elman [27] and Jordan [28] were widely used.
In recent years, RNNs have again been used successfully for various applications. This success is mostly due to the performance of the LSTM: Long Short-Term Memory is a special kind of recurrent neural network introduced by Hochreiter and Schmidhuber [29] in order to learn long time dependencies. The LSTM method has also been applied to transportation problems. Zhao et al. [30] proposed a temporal-spatial correlation LSTM network for the traffic system via a two-dimensional network, and the proposed LSTM network achieved better performance. Yuhan et al. [31] introduced the rainfall-integrated deep belief network (DBN) and LSTM to learn the features of the traffic flow under various rainfall scenarios; the experimental results showed that the deep learning predictor was more accurate than existing predictors when the additional rainfall factor was considered. Luo et al. [32] combined the k-nearest neighbour (KNN) method and the LSTM network into the KNN-LSTM model, and the experimental results showed that the model had better prediction performance than existing prediction models. Tang et al. [33] proposed a novel forecast model based on the LSTM network (ST-LSTM) that extracts spatio-temporal features from the data and achieved better performance in the short-term forecast of rail transit. Wang et al. [34] introduced GBRT to predict shared car borrows and returns at each station within a 3-hour time window; according to the MSE, the model was more accurate than ARIMA, RF, and NN but less accurate than the LSTM, whereas in terms of training speed ARIMA was the fastest and the LSTM the slowest. Ma et al. [35] proposed a combined model based on the Genetic Algorithm (GA) and Exponential Smoothing (ES), which compensates for the deficiencies of each single model and combines their advantages to improve the prediction performance.
From previous studies, time series forecast methods can be summarized as follows. (1) Conventional time series forecast methods are represented by the ARIMA method and its variants, which have gradually become a benchmark for comparison with newly developed forecast models. (2) Compared with conventional RNNs, the LSTM network is able to capture the features of time series over longer time dependences and performs better, which has led to its success in traffic flow prediction. (3) Most studies did not take human behaviour factors into account, such as the features of drivers’ parking behaviours, which are inherent to the garages and do not change easily.
The contributions of this study lie in three aspects. Firstly, this paper introduces the parking behaviour analysis of the railway station to predict parking volumes, which obviously has an impact on the parking volume prediction. Secondly, when classifying the parking volumes according to the parking duration, it innovatively introduces the time series similarity analysis, which makes the parking volume classification more reasonable and more consistent with the parking behaviour analysis. Thirdly, we introduce the LSTM method to predict the parking volume of the railway station, which is different from the current methods.
In view of the abovementioned facts, this paper proposes a new LSTM method that uses passenger parking duration categories as passenger behaviour features to forecast the parking volume of railway station garages. The remainder of this paper is organised as follows. Section 1 gives a general overview of the existing literature on traffic forecasting. The methodology is introduced in Section 2, where the architecture of the proposed LSTM network model is explained in four parts. The case study based on the traffic dataset is presented in Section 3, and comparisons with the ungrouped data model and conventional forecast approaches are also given. The conclusion and future work are at the end of this paper.
2. Methodology
2.1. Passengers’ Parking Behaviour Analysis
As affiliated facilities of railway stations, parking garages serve train passengers who arrive at or leave stations by car. Arrival behaviours can be divided into three categories according to the trip chain: driving oneself to catch a departure train, being driven by others to catch a departure train, and being picked up by others after an arrival train. Departure behaviours can likewise be divided into three categories according to the trip chain: driving away oneself after an arrival train, cars leaving after dropping off passengers for departure trains, and cars leaving after picking up passengers from arrival trains. Passengers’ parking behaviours can be characterised as follows.
First, car arrivals follow train schedules, whether passengers drive themselves or are driven by others. Although passengers must arrive before the scheduled departure of their trains, car arrival volumes are correlated with train schedules, which means that passenger arrivals have an underlying regularity. Furthermore, car departure volumes depend on car arrival volumes: the former can be regarded as the latter shifted in time by the parking duration, so departure features depend on arrival features and parking duration features. In addition, the parking duration depends solely on the passenger behaviour category. Self-driving passengers park much longer because they complete their whole journeys and do not leave until they come back by train. Cars that drop off departing passengers park much more briefly, since they only set passengers down in the garage. Cars that pick up passengers from arrival trains have more dispersed parking durations, because passengers need time to walk to the parking lot and find their drivers. Different parking durations therefore correspond to different categories of parking behaviour with different regularities, and hence to different regularities in the parking volume forecast. Finally, the different categories of parking behaviour stem from passengers’ individual travel demands, so they are independent of each other.
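The time-shift relation between arrivals and departures can be illustrated with a minimal sketch. It assumes, purely for illustration, that every car in a category stays for exactly the same whole number of timesteps; the function name `shift_arrivals` and the sample counts are hypothetical and do not come from the dataset.

```python
def shift_arrivals(arrivals, duration_steps):
    """Return departures per timestep if every car stays `duration_steps` steps.

    Illustrative assumption: a single fixed parking duration per category.
    """
    if duration_steps < 0:
        raise ValueError("duration_steps must be non-negative")
    departures = [0] * len(arrivals)
    for t, count in enumerate(arrivals):
        leave = t + duration_steps
        if leave < len(arrivals):  # cars leaving after the horizon are not counted
            departures[leave] += count
    return departures


arrivals = [5, 8, 3, 0, 2, 0]               # hypothetical hourly arrivals
departures = shift_arrivals(arrivals, 2)    # every car parks for two hours
print(departures)  # [0, 0, 5, 8, 3, 0]
```

In real data the durations within a category are spread over an interval, so the departure curve is a smeared rather than an exact copy of the arrival curve.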
As stated previously, the parking forecast model in this paper is based on the following content.
The characteristics of passengers’ travel behaviours are mainly reflected in their arrival distributions (based on train schedules) and in their parking durations (based on the kind of parking behaviour). In this paper, the parking duration, which represents the parking behaviour, is divided into categories. The categories can be defined by the graded parking fees, by the parking durations of interest, by reasonable clustering methods, etc. We use $i$ to denote the parking duration category.
The forecasting variables are the arrival parking volume and the departure parking volume per chosen time period, which are also divided according to parking duration categories. We did not use the available parking volume as the variable because the arrival and departure parking volumes provide more information for passengers and garage agencies. We use $A_i$ and $D_i$, respectively, to denote the arrival parking volume and departure parking volume of parking duration category $i$. The arrival parking volume and departure parking volume per chosen time period $t$ can then be expressed by the following equations:
$$A(t) = \sum_{i=1}^{N} A_i(t), \qquad D(t) = \sum_{i=1}^{N} D_i(t),$$
where $i$ denotes the parking duration category, $N$ denotes the total number of parking duration categories, and $A_i(t)$ and $D_i(t)$, respectively, denote the arrival and departure parking volume of parking duration category $i$ during period $t$. As stated previously, the parking volume forecast model based on different parking behaviour features in this paper is shown in Figure 1.
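The decomposition of the total arrival and departure volumes into per-category volumes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout (integer arrival and departure hours), the function names, and the two-category split are all assumptions made for the example.

```python
from collections import defaultdict


def hourly_volumes(records, categorize):
    """Count arrivals and departures per (category, hour).

    records: iterable of (arrival_hour, departure_hour) integer pairs.
    categorize: function mapping a duration in hours to a category id.
    """
    arrive = defaultdict(int)
    depart = defaultdict(int)
    for a, d in records:
        cat = categorize(d - a)
        arrive[(cat, a)] += 1
        depart[(cat, d)] += 1
    return arrive, depart


def total_at(volumes, hour):
    """Sum the per-category volumes at one hour (the sum over categories)."""
    return sum(n for (cat, h), n in volumes.items() if h == hour)


def two_way_split(hours):
    """Hypothetical split: category 1 for stays under 2 hours, else category 2."""
    return 1 if hours < 2 else 2


records = [(8, 9), (8, 12), (8, 8), (9, 12)]   # hypothetical parking records
arrive, depart = hourly_volumes(records, two_way_split)
print(total_at(arrive, 8), total_at(depart, 12))  # 3 2
```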

In view of the correlation between cars’ departure features and arrival features, we take the previous arrival parking volume, grouped by parking duration category, as the model input and the present arrival parking volume and the present departure parking volume as the model outputs.
The forecast period and forecast timestep are chosen according to the specific case. The relationship between the forecast timestep and the intervals of the parking duration categories is not fixed, but usually each interval either divides the forecast timestep or is a multiple of it. For example, if the forecast timestep is one hour, the interval of a parking duration category can be thirty minutes, two hours, etc.
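The input and output arrangement described above can be sketched as a sliding window over the hourly series. The window length (`lookback`) is an assumption, since the text does not state how many previous timesteps are used, and `make_windows` is a hypothetical helper.

```python
def make_windows(inputs, targets, lookback):
    """Pair the previous `lookback` input values with the present target value."""
    if lookback < 1:
        raise ValueError("lookback must be at least 1")
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")
    return [
        (inputs[t - lookback:t], targets[t])
        for t in range(lookback, len(inputs))
    ]


arrivals = [10, 12, 15, 11, 9]      # hypothetical hourly arrivals of one category
departures = [4, 9, 11, 14, 12]     # hypothetical hourly departures

# Arrival network: previous arrivals -> present arrivals.
arrival_samples = make_windows(arrivals, arrivals, lookback=3)
# Departure network: previous arrivals -> present departures.
departure_samples = make_windows(arrivals, departures, lookback=3)
print(arrival_samples[0])    # ([10, 12, 15], 11)
print(departure_samples[0])  # ([10, 12, 15], 14)
```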
2.2. Similarity Measures between Each Parking Behaviour Category
As the parking behaviour categories are inferred by graded parking fees, reasonable clustering methods, etc., the similarity between every category cannot be guaranteed. In order to merge the similar division results, the computation of the similarity between every category is essential.
The list of approaches for dealing with time series similarity is vast; representative examples of time series similarity measures [36] include Euclidean distance, cosine similarity, Fourier coefficients, autoregressive models, DTW (Dynamic Time Warping), EDR (Edit Distance on Real Sequences), TWED (time-warped edit distance), and MJC (Minimum Jump Costs’ Dissimilarity). The Euclidean distance method is not stable when the dataset has outliers (i.e., the data are not very regular), and the DTW method requires a huge amount of calculation because of the large number of paths, all of whose nodes need to be matched. Therefore, in this paper, cosine similarity is used to evaluate the similarity between every pair of categories. It is expressed as follows:
$$\cos(X, Y) = \frac{\sum_{k=1}^{n} x_k y_k}{\sqrt{\sum_{k=1}^{n} x_k^{2}}\,\sqrt{\sum_{k=1}^{n} y_k^{2}}},$$
where $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots, y_n)$ are the parking volume series of two categories.
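The cosine similarity between two volume series can be computed as in the following minimal sketch. The function name and the sample series are illustrative assumptions.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equally long numeric series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for an all-zero series")
    return dot / (norm_x * norm_y)


a = [1, 2, 3]
b = [2, 4, 6]    # same shape as a, twice the volume
c = [3, 0, -1]   # orthogonal to a
print(cosine_similarity(a, b))  # close to 1.0
print(cosine_similarity(a, c))  # 0.0
```

Cosine similarity compares the shape of two series and ignores their scale, so two categories with very different volumes can still score close to 1. This is one reason to look at volume ranges as well as similarity when merging categories.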
2.3. LSTM Network for the Parking Volume Forecast
Recurrent neural networks (RNNs) are a type of neural network that explicitly handles the order of input observations. This capability allows recurrent neural networks to learn the temporal context of input sequences in order to make better predictions. However, due to the vanishing gradient and exploding gradient problems, the RNN cannot cope well with long-term time series forecasting [27].
As a special kind of RNN, LSTMs are specifically designed to address these problems and perform well on finding the correlation within time series in both short and long terms. LSTM networks are built with the input layer, hidden layer, and output layer. The self-connected hidden layer contains memory cells and corresponding gate units.
The memory unit of the LSTM model is regarded as the hidden layer, which can store information in order to find and exploit long-range dependences in time series. These functions are realized by three gate units: the input gate, forget gate, and output gate [2]. Figure 2 illustrates the LSTM network structure with a single LSTM memory cell used in this paper.

As the forecast variables are arrival parking volume and departure parking volume per chosen time period, we construct two LSTM networks for different outputs, and the LSTM network for the departure parking volume takes the arrival parking volume as the input data. The LSTM networks are shown in Figure 3.

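The version of the LSTM used in this paper can be described by the following composite function (equations (3)–(8)):
$$i_t = \sigma(W_i x_t + U_i h_{t-1}), \quad (3)$$
$$f_t = \sigma(W_f x_t + U_f h_{t-1}), \quad (4)$$
$$o_t = \sigma(W_o x_t + U_o h_{t-1}), \quad (5)$$
$$\tilde{c}_t = g(W_c x_t + U_c h_{t-1}), \quad (6)$$
$$c_t = f_t \circ c_{t-1} + i_t \circ \tilde{c}_t, \quad (7)$$
$$h_t = o_t \circ \phi(c_t), \quad (8)$$
where $\sigma$ is the activation function of the gates, usually the logistic sigmoid function; $i_t$, $f_t$, and $o_t$ are, respectively, the outputs of the input gate, forget gate, and output gate; $\tilde{c}_t$ denotes the activation of the cell unit; $c_t$ denotes the updated status of the memory cell; $h_t$ denotes the cell output; the $W$ and $U$ terms are coefficient matrices; $g$ and $\phi$ are, respectively, the cell input and output activation functions, usually $\tanh$ functions; and $\circ$ denotes the Hadamard product. The model input $x_t$ is the arrival parking volume $A_i$ of the previous timesteps, and the output is $A_i(t)$ or $D_i(t)$. Through the action of the different gates, LSTM memory units can perform much better than the RNN on time series forecast problems in both the short and long term.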
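One forward step of a single LSTM memory cell can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the network used in the paper: the input and state are scalars, bias terms are omitted, and the parameter layout `p` and the name `lstm_step` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, the usual gate activation."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One forward step of a scalar LSTM cell.

    p maps each of 'i', 'f', 'o', 'c' to a pair (w_x, w_h): the weight on the
    current input and the weight on the previous cell output.
    """
    def pre(name):
        w_x, w_h = p[name]
        return w_x * x + w_h * h_prev

    i = sigmoid(pre("i"))            # input gate
    f = sigmoid(pre("f"))            # forget gate
    o = sigmoid(pre("o"))            # output gate
    c_tilde = math.tanh(pre("c"))    # candidate cell activation
    c = f * c_prev + i * c_tilde     # updated memory cell state
    h = o * math.tanh(c)             # cell output
    return h, c


zero_params = {name: (0.0, 0.0) for name in ("i", "f", "o", "c")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=zero_params)
print(c)  # 1.0: with zero weights every gate is 0.5, so half the old state is kept
```

With vector inputs the products become matrix-vector products and the gate multiplications become element-wise (Hadamard) products.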
2.4. Training Algorithm
This section describes the training of the two LSTM networks: the arrival volume forecast network and the departure volume forecast network. We use the composite function mentioned above for the activation (forward pass) and backpropagation of gradients for the weight calculation (backward pass), following [29, 30].
Step 1: Starting from the initial weight matrices, use the composite function mentioned above to perform the forward pass and calculate the outputs of the two networks separately.
Step 2: Use equation (9) to train the parameters so as to minimize the losses. The final weight derivatives are obtained by summing over the derivatives, which gives one set of network weight matrices for each network. The quantities in equation (9) are the weight to optimize, the loss function, and the output, which is the data output or the cell output as appropriate.
Step 3: Fine-tune the whole network to obtain the final weight matrices.
3. Case Study
3.1. Data Description
The parking volume forecast model in this paper is based on the data of the public parking garages of Hongqiao Railway Station in Shanghai, China, which were collected by the garage management agency. Arrival data and departure data were recorded separately for each car, so we know every car’s arrival moment and departure moment and can calculate its parking duration. There are 685,593 samples for 33 days, from 3 am 4 March, 2019 to 2 am 6 April, 2019. The first three weeks were used to train the model, and the last week was used to test the model.
Based on the graded parking fee categories, we divided the samples into eight parts according to eight parking duration categories: “<15 min,” “15 min∼1 h,” “1 h∼2 h,” “2 h∼4 h,” “4 h∼6 h,” “6 h∼8 h,” “8 h∼24 h,” and “>24 h.” Figure 4 shows the parking arrival volumes of the different parking duration categories over one week.
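Assigning a parking record to one of the eight fee-based categories can be sketched with a sorted list of boundaries. The handling of values that fall exactly on a boundary is an assumption here (a stay of exactly 15 minutes is placed in category 2, and so on), as are the names `BOUNDS_MIN` and `duration_category`.

```python
import bisect

# Upper bounds, in minutes, of categories 1 to 7; category 8 is ">24 h".
BOUNDS_MIN = [15, 60, 120, 240, 360, 480, 1440]


def duration_category(minutes):
    """Return the category number (1 to 8) for a parking duration in minutes."""
    if minutes < 0:
        raise ValueError("parking duration cannot be negative")
    # bisect_right counts how many boundaries are <= minutes.
    return bisect.bisect_right(BOUNDS_MIN, minutes) + 1


print(duration_category(14))    # 1  ("<15 min")
print(duration_category(300))   # 5  ("4 h~6 h")
print(duration_category(1500))  # 8  (">24 h")
```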

We can see that the arrival volumes of the eight parking duration categories are well separated and have different shapes. Categories 1 and 2 have relatively large volumes, and categories 6, 7, and 8 have smaller volumes (see Figure 4(a)). Categories 1 and 2 account for 80.32% of the total volume of the eight categories, whereas categories 6, 7, and 8 account for only 7.88%. However, category 8 (“>24 h”) accounts for 56.79% of the total parking duration although its share of the volume is only 3.20%. Long-term parking therefore has a disproportionate influence on the efficiency of the garages. In addition, the arrival peak of long-term parking occurs in the morning, whereas that of short-term parking occurs in the afternoon. If more long-term cars than usual arrive in the morning, they occupy parking spaces for the whole day, which lowers the odds that short-term cars find available spaces in the afternoon. Forecasting the parking volume by parking duration category is therefore essential and meaningful.
3.2. Similarity between Each Parking Duration Category
Table 1 shows the similarity between every pair of categories. Referring to the actual volumes (see Figure 4), the volume ranges of categories C1, C2, C3, C4, and C5 are (0∼871), (0∼1092), (0∼342), (0∼62), and (0∼27), respectively, so we merge these categories into two: D1 (C1, C2, and C3) and D2 (C4 and C5). According to Table 1, we then merge the eight categories into three: D1 (C1, C2, and C3), D2 (C4 and C5), and D3 (C6, C7, and C8). This grouping also corresponds with the arrival behaviour features of passengers.
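Merging the eight categories into D1, D2, and D3 amounts to summing the member series hour by hour, as in this minimal sketch. The grouping is the one stated above; the function name and the toy volumes are illustrative assumptions.

```python
MERGE = {
    "D1": ("C1", "C2", "C3"),
    "D2": ("C4", "C5"),
    "D3": ("C6", "C7", "C8"),
}


def merge_categories(volumes, groups=MERGE):
    """Sum equally long hourly series of the member categories of each group.

    volumes: dict mapping a category name to a list of hourly counts.
    """
    merged = {}
    for name, members in groups.items():
        series = [volumes[m] for m in members]
        merged[name] = [sum(vals) for vals in zip(*series)]
    return merged


# Toy data: category Ck has volumes [k, 2k] in two consecutive hours.
volumes = {f"C{k}": [k, 2 * k] for k in range(1, 9)}
merged = merge_categories(volumes)
print(merged)  # {'D1': [6, 12], 'D2': [9, 18], 'D3': [21, 42]}
```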
3.3. Determination of the LSTM Network
We divide the arrival and departure parking volumes into three parts each according to the three parking duration categories (D1, D2, and D3). Because every boundary of the parking duration categories is either a multiple or a divisor of one hour, we use one hour as the forecast timestep. We construct two LSTM networks, one for arrival and one for departure, and train them separately. The numbers of memory units and layers are decided by trial and error.
Because the first train departs at 6 am and the last train arrives at 11:48 pm, a time margin of three hours is adequate [37], which is why the study period starts at 3 am. Ma et al. [38] showed that 95% of the investigated passengers had a waiting time of less than 200 minutes. We chose the period from 3 am March 4, 2019 to 2 am April 1, 2019 as the study time range, which covers 28 days (4 weeks) and begins on a Monday. We use the first three weeks to train the model and the last week to test the model.
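With a one-hour timestep, the 28-day study range contains 672 hourly values, of which the first 504 form the training set and the last 168 the test set. A minimal sketch of this chronological split follows; the function name and the placeholder series are assumptions.

```python
HOURS_PER_DAY = 24


def split_train_test(series, train_days=21, test_days=7):
    """Chronological split: the first `train_days` days, then `test_days` days."""
    n_train = train_days * HOURS_PER_DAY
    n_test = test_days * HOURS_PER_DAY
    if len(series) < n_train + n_test:
        raise ValueError("series is shorter than the requested split")
    return series[:n_train], series[n_train:n_train + n_test]


hourly = list(range(28 * HOURS_PER_DAY))   # placeholder for 28 days of hourly volumes
train, test = split_train_test(hourly)
print(len(train), len(test))  # 504 168
```

The split is chronological rather than random so that the test week lies strictly after the training weeks.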
3.4. Evaluation for the Forecast Result
In general, three criteria are used to evaluate the performance of forecast results: mean absolute error (MAE), mean square error (MSE), and mean relative error (MRE). Because the MAE and MSE are more sensitive to the scale of the raw data, the MRE is used in this paper to evaluate the prediction ability. It is defined as follows (equation (10)):
$$\mathrm{MRE} = \frac{1}{n}\sum_{t=1}^{n}\frac{\left|y_t - \hat{y}_t\right|}{y_t} \times 100\%, \quad (10)$$
where $y_t$ is the actual data, $\hat{y}_t$ is the forecast data, and $n$ is the number of samples. Samples whose raw value is 0 are deleted, because the relative error is undefined for them. Taking category D1 as an example, “0” accounted for 0.219% of the samples.
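The MRE with zero-valued actual data removed can be sketched as follows. The function name and the sample values are illustrative assumptions; the result is returned as a fraction rather than a percentage.

```python
def mean_relative_error(actual, forecast):
    """Mean of |actual - forecast| / |actual| over samples with non-zero actual."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no samples with a non-zero actual value")
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# The middle sample has an actual value of 0 and is dropped.
mre = mean_relative_error([1, 0, 200], [0.5, 3, 190])
print(round(mre, 3))  # 0.275, the mean of 0.5 and 0.05
```

The example also shows why small volumes inflate the MRE: an absolute error of 0.5 on an actual value of 1 contributes 50%, whereas an error of 10 on 200 contributes only 5%.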
4. Results and Discussion
We choose three parking duration categories to show the forecast results. The parking duration category D1 has the largest volume ratio of the whole samples. The parking duration category D2 has the volume less than 200 per hour. The parking duration category D3 has the largest parking duration ratio of the whole data. Figure 5 shows the forecast results and the original data, which correspond to parking duration categories D1, D2, and D3. Three weeks were used for training the model, and one week used for testing. The x axis of Figure 5 is just for plotting convenience and does not mean the real hour labels.

In Figure 5, the parking volumes of the three categories have very different shapes, and in each case the forecast data has a shape similar to that of the original data, which means that the LSTM network works well for forecasting the parking volume over a long time span. The MREs are 11.07%, 16.25%, and 14.28% for D1, D2, and D3, respectively, so the LSTM model performs well on this problem. The MRE of parking duration category D2 is larger than the others because its arrival volume is smaller, so the MRE is larger even when the forecast is close in absolute terms. For example, if the real volume is 1 and the forecast is 0.5, the relative error is 50%, but such an error does not affect the practical application of the model.
In order to compare the results of grouping and ungrouped data, we applied the LSTM to the whole data without grouping and drew a comparison chart, as shown in Figure 6.

In Figure 6, predict 1 and predict 2 represent the LSTM prediction results for the sum of D1, D2, and D3 and for the ungrouped data, with MREs of 12.71% and 16.33%, respectively. The predicted shapes of both methods are very close to that of the original data, but the prediction for the sum of D1, D2, and D3 is better than that of the ungrouped model, with an MRE that is 3.62 percentage points lower. Therefore, prediction based on parking behaviour grouping not only provides more detailed information but also gives better prediction results.
To further investigate the prediction results of the LSTM, this paper adopts cross validation. Cross validation is a process of developing and evaluating models on subsamples of the data; its goal is to determine the expected accuracy level and error range. Cross validation for time series differs slightly from that for nonsequential data: the time dependency on previous samples must be preserved when developing a sampling plan. We can create a cross validation sampling plan by offsetting the window used to select consecutive subsamples. In finance, this type of analysis is often referred to as “backtesting”: a time series is split into multiple uninterrupted sequences that are offset in different windows, and these sequences are used to test against current and past observations. In this paper, the overall data are divided into six slices (the time periods are shown in Table 2), each covering 28 days. The prediction results of slice 1 are shown in Figure 5, and the remaining prediction results are shown in Figures 7–9.
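The offset-window sampling plan can be sketched as below. The one-day offset between consecutive slices is an assumption: it is the offset that yields exactly six 28-day slices from 33 days of data, but the actual slice boundaries are those given in Table 2. The function name is hypothetical.

```python
def rolling_slices(n_days, window_days, step_days=1):
    """Return (start_day, end_day) pairs of consecutive windows, end exclusive."""
    if window_days > n_days:
        raise ValueError("window is longer than the data")
    if step_days < 1:
        raise ValueError("step_days must be at least 1")
    return [
        (start, start + window_days)
        for start in range(0, n_days - window_days + 1, step_days)
    ]


slices = rolling_slices(n_days=33, window_days=28)
print(len(slices))   # 6
print(slices[0])     # (0, 28)
print(slices[-1])    # (5, 33)
```

Each slice keeps its samples in chronological order, so the first three weeks of a slice can train the model and its last week can test it without leaking future data into training.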

According to Figures 7–9, the prediction results are relatively stable, and the proposed method is effective and reliable for the parking volume forecast of railway station garages in practice. The MREs of each slice and category are shown in Figure 10. The average MRE values of D1, D2, and D3 are 12.94%, 17.29%, and 15.30%, respectively. The parking volume of D1 is high, that of D3 is medium, and that of D2 is low, so the prediction error is related to the parking volume: the lower the volume, the higher the MRE. The parking volume should therefore be considered when grouping parking duration categories.

To validate the efficiency of the proposed LSTM network, we also compare it with a conventional time series forecast model, the Seasonal Autoregressive Integrated Moving Average (SARIMA) method. The SARIMA method did not produce acceptable results. The experiments are shown in Figure 11.

The same original data of parking duration categories D1, D2, and D3 were used, and SARIMA(1, 0, 1, 24), SARIMA(2, 0, 2, 24), and SARIMA(1, 0, 2, 24) models were formulated. However, the one-week period proved much too long for the SARIMA method, and even a 24-hour period was too long; as shown in Figure 11, the SARIMA models cannot learn the features well, and the MREs are larger than the acceptable value (30%). The input data of the SARIMA models had already been processed to eliminate periodicity and seasonality. This preprocessing is a further inconvenience of the SARIMA model, whereas the LSTM model needs only standardization and normalization of the input data.
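The normalization step mentioned for the LSTM input can be sketched as min-max scaling fitted on the training data only. The text does not state which scaling was used, so min-max scaling and the function names below are assumptions made for illustration.

```python
def fit_min_max(train):
    """Return the minimum and maximum of the training series."""
    lo, hi = min(train), max(train)
    if hi == lo:
        raise ValueError("training series is constant; scaling is undefined")
    return lo, hi


def scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]


def unscale(values, lo, hi):
    """Invert `scale` to recover volumes from network outputs."""
    return [v * (hi - lo) + lo for v in values]


train = [0, 50, 200]             # hypothetical training volumes
lo, hi = fit_min_max(train)
scaled = scale([100, 300], lo, hi)
print(scaled)                    # [0.5, 1.5]
print(unscale(scaled, lo, hi))   # [100.0, 300.0]
```

Test values above the training maximum map above 1, as the second value shows; fitting the bounds on the training set alone avoids using information from the test week.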
5. Conclusions and Future Work
In this paper, the authors proposed a parking volume forecast model for railway station garages based on passenger behaviour analysis and the LSTM network. Because different parking durations represent different parking behaviour features and travel behaviour features, parking durations were first divided into categories according to the graded pricing categories and then merged into three categories using a similarity measure (cosine similarity). The parking volume of each category was then forecast separately with the LSTM method. To reduce the influence of the random choice of training and test sets on the model, we adopted cross validation and obtained stable prediction results. We compared the prediction results with and without grouping by parking duration category and found that the grouping method performs better, with an MRE 3.62 percentage points lower than that of the ungrouped method. We also compared the results with the conventional SARIMA method, which was ineffective in long-term forecasting. Our method achieves better results and provides more detailed information, which gives better information support for both the garage agency’s management and passengers’ travel decisions. The proposed parking volume forecast model was shown to be effective.
The study mainly focuses on the parking volume forecast of different parking duration categories, but the model can also be used to test the rationality of parking fee changes and for other practical purposes. In future work, the authors will apply the model to other problems to verify its practicability.
Data Availability
The data used in this paper were collected by the Data Centre of the Garage Management Agency of Hongqiao Railway Station. The data are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This research was supported by the Shanghai Science and Technology Committee (no. 17DZ1204003).