Abstract
Changes in local transit passenger flow may cause a spatial spillover effect across the involved regions and affect traffic patterns in other regions. To identify the affected areas and traffic patterns, this study develops an enhanced spatial vector autoregressive (SpVAR) model to investigate relations in public transport systems under the impact of a sudden large passenger flow. The proposed model captures the interactions among different transit modes in separate regions. Three representative commuting regions in Beijing, namely, Zhongguancun, Guomao, and Huilongguan, are used for the empirical study. The results confirm the existence of a spatial spillover effect in the commuter regions and reveal heterogeneous effects of the multimodal transit system on regions at different distances.
1. Introduction
In recent years, transportation infrastructure in China has been significantly improved, which strengthens the transfer capabilities between different transit modes. As a result, changes in the passenger volume of one transit mode in one region are highly likely to affect other transit modes in other, unconnected regions. This phenomenon is known as a spatial spillover effect, and it may reveal spatial conduction laws in the multimodal transit system. A better understanding of the spatial spillover effect within the multimodal transit network could help to predict unexpected events and to plan the necessary countermeasures.
To this end, numerous scholars have sought to identify the factors that lead to the spatial spillover effect. The results suggest that the interaction among multimodal traffic flows (passenger flows) is a major factor [1–7]. Consequently, a variety of statistical models, especially regressive models, have been developed to better observe and capture such interactions.
Although a number of enhanced spatial econometric models have been proposed to reflect spatial and temporal features, they typically can analyze only the case of a single endogenous variable and lack the capability to investigate the correlation among multiple endogenous variables. This limitation prevents those enhanced VAR models from analyzing the spatial spillover phenomenon. To address this issue, Conley and Dupor [8] and Neusser [9] further improved the framework of the spatial VAR model to capture spatial spillover effects while considering temporal and spatial correlations, which later contributed to the development of the spatial vector autoregressive (SpVAR) model. The SpVAR model is thus capable of capturing the actual impact of regional factors and reflecting regional spatial relations [10–12]. Although various types of SpVAR models have been developed in recent years, the application of spatial econometrics in the transport field remains limited, particularly for studying passenger flow patterns within a multimodal transit system. The existing SpVAR models lack transportation elements, so they cannot reflect the impact of spatial distance on changes in interregional transit. Deng [13] developed a spatial vector autoregressive (VAR) model to predict traffic flows within a simulated system. Chen et al. [14] constructed SpVAR models at typical analysis periods for volume and speed forecasting by considering different combinations of upstream and downstream impacts. However, there is a research gap in using the SpVAR model to study traffic flow patterns within a multimodal transportation system.
To fill this gap, this study proposes an enhanced SpVAR model to better understand the relationship among different types of passenger flows in a multimodal transit system, and specifically to explore the spatial spillover effect in such a system. In detail, a weight matrix of spatial factors is first embedded to capture the linear interdependence among multiple spatial time series variables. Furthermore, an impulse response function is included to generate the dynamic response of interregional passenger flow to a change of passenger flow in other regions and other modes. As a result, the enhanced SpVAR model is given spatial and geographical attributes to address the interregional and temporal correlations of passenger flow in the multimodal transit system. In addition, the proposed model can detect multivariable interactions and explore spatial correlations among different transit modes in the multimodal network. Specifically, the model can capture the response of each mode in one region to changes of other modes in other regions. To summarize, the improved model with the impulse response has the following capabilities:
(1) It can measure the current value and predict the passenger volume in other regions after a unit passenger flow impulse is generated in a certain area.
(2) It can detect multivariable interactions and the spillover effect in different regions under a change of passenger flow in a specific transit mode.
The rest of the paper is organized as follows. Section 2 describes the structure and format of the data. Section 3 introduces the structure of the SpVAR model, including identification of parameters and spatial weights, solution of impulse response, and application to the interaction of passenger flow. Section 4 presents the three representative traffic analysis zones (TAZ) in Beijing selected using the average daily passenger flow of bus alighting volume, metro boarding, and alighting volume per 15 minutes.
2. Data Source
To better analyze the spatial spillover of multimodal transit demands, this paper discusses the intraregional and interregional interaction of bus and metro flow by using real integrated circuit card (IC card) data in Beijing.
Beijing is the capital of China and one of the megacities with a population of over 21.71 million.
Figure 1 shows Beijing divided into 1911 TAZs according to socioeconomic data and the principle of homogeneity and uniformity; the red rectangle marks the study area.

In this paper, the improved SpVAR model is constructed using time-panel passenger flow data in Beijing. This study investigates the interregional spillover effect of passenger flow by bus and metro in the selected areas. In the Beijing public transit system, passengers are required to tap their smart cards when both boarding and alighting, and thus metro boarding volume, metro alighting volume, and bus alighting volume are obtained from the Automatic Fare Collection (AFC) systems installed at metro stations and on buses. The model variables are the metro boarding volume, metro alighting volume, and bus alighting volume every 15 minutes in each TAZ. Point of Interest (POI) data are also collected to quantify the traffic attraction of each TAZ. The typical POI information can be categorized into land use and transport. In the land use category, the densities of residential buildings, employment, hotels, service facilities, attractions, and commercial establishments are measured, while in the transport category, bus stop density, metro station density, and road density are calculated in each TAZ.
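The 15-minute aggregation described above can be illustrated with a minimal sketch. The record layout `(taz_id, variable, timestamp)`, the labels "MB", "MA", and "BA", and the sample dates are assumptions made for illustration only; the actual AFC data format is not described in this paper.

```python
from collections import Counter
from datetime import datetime


def bin_15min(ts: datetime) -> datetime:
    """Floor a timestamp to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def aggregate(records):
    """Count tap events per (TAZ, variable, 15-minute interval).

    Each record is a hypothetical (taz_id, variable, timestamp) tuple, where
    variable is "MB" (metro boarding), "MA" (metro alighting), or
    "BA" (bus alighting).
    """
    counts = Counter()
    for taz, variable, ts in records:
        counts[(taz, variable, bin_15min(ts))] += 1
    return counts


# Made-up tap events, not real data.
records = [
    ("HLG", "MB", datetime(2020, 1, 6, 7, 2, 10)),
    ("HLG", "MB", datetime(2020, 1, 6, 7, 14, 59)),
    ("HLG", "MB", datetime(2020, 1, 6, 7, 15, 0)),
    ("ZGC", "MA", datetime(2020, 1, 6, 7, 47, 30)),
]
counts = aggregate(records)
```

Each key of `counts` then corresponds to one observation of one model variable in one TAZ and one time period.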
For the specific research objects, we choose three typical and representative TAZs in Beijing. The criteria of setting up these TAZs include similar census block information (e.g., household and employment densities) within each TAZ. These zones are marked with the red box in Figure 1 and include districts surrounding Zhongguancun (hereinafter referred to as ZGC), Guomao (hereinafter referred to as GM), and Huilongguan (hereinafter referred to as HLG) (Figure 2). The straight-line distance between ZGC and HLG is 16 km and that between GM and HLG is 21 km, and the network distances between ZGC and HLG as well as HLG and GM are 19 km and 32 km, respectively.

GM and ZGC are business office centers and the destinations of morning commuters in Beijing. HLG is one of the largest residential centers and the origin of morning peak commuters in Beijing. The traffic among the three regions constitutes the main commuter flow in Beijing.
3. Model
3.1. Spatial Vector Autoregressive Model
This study improves the SpVAR model to incorporate space and time dimensions as well as spatially related disturbances by adding a spatial weight matrix and an impulse response function.
The SpVAR model introduces the spatial dimension into the parameters of the SVAR model and is therefore capable of capturing both the spatial and temporal dynamics of multimodal passenger flows. However, the spatial weights of the SpVAR model commonly rely on distances alone, which may not be adequate for modeling the interregional impulse effect among multiregion transport demands [15].
The model contains $N$ regions and $K$ variables, and its structure is specified for each variable in each region. Here $y_{k,i,t}$ refers to the travel demand of mode $k$ in region $i$ at time $t$, and $k = 1, 2, 3$ represents the three variables, namely, metro boarding, metro alighting, and bus alighting flow, respectively. Traffic zones or regions are labeled as $i = 1, \dots, N$, and time periods are labeled as $t = 1, \dots, T$. $W$ represents the spatial weights matrix related to passenger flow, and $\mu_{k,i}$ denotes the traffic regional specific effects. Regional specific effects refer to the unique effects determined by certain specific features in each region. For this study, the traffic regional specific effects may be related to the number of traffic facilities, the structure of the network, and so on. For each variable in each region, $\mu_{k,i}$ has a specific value. The parameters of the model include the time lag coefficient $\beta$, the spatial lag coefficient $\theta$, and the spatial lag coefficient with time lag $\lambda$. The time lag represents the value of a variable delayed from the current time in the time series, and the spatial lag indicates the value of the same variable at spatially related locations. Each variable in each region is assumed to be generated by the interaction of four components, namely, the fixed effect $\mu_{k,i}$, the time lag values of each variable $y_{k,i,t-1}$, the spatial lag values of each variable $y^{*}_{k,i,t}$, and the time lag values that contain the spatial lag of each variable $y^{*}_{k,i,t-1}$. A spatially related error term exists when the disturbances of different regions are correlated. We carried out a series of transformations to obtain a form of (2) that is easier to solve; details are omitted owing to length limitations. In the transformed model, one coefficient matrix characterizes the contemporaneous correlation of the SpVAR model, and another contains the coefficients with a spatial structure.
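As a reading aid, the four components described above (fixed effect, time lag, spatial lag, and time-lagged spatial lag) can be written in a generic first-order form. This is a sketch under assumptions and not necessarily the exact specification estimated in this study: the symbols $\mu_{k,i}$, $\beta_{kj}$, $\theta_{kj}$, $\lambda_{kj}$, $w_{in}$, and $u_{k,i,t}$, the single lag, and the mode-independent weights are all illustrative choices.

```latex
y_{k,i,t} = \mu_{k,i}
  + \sum_{j=1}^{K} \beta_{kj}\, y_{j,i,t-1}
  + \sum_{j=1}^{K} \theta_{kj}\, y^{*}_{j,i,t}
  + \sum_{j=1}^{K} \lambda_{kj}\, y^{*}_{j,i,t-1}
  + u_{k,i,t},
\qquad
y^{*}_{j,i,t} = \sum_{n \neq i} w_{in}\, y_{j,n,t}
```

Here $y_{k,i,t}$ is the flow of variable $k$ in region $i$ at time $t$, and $y^{*}_{j,i,t}$ is the spatial lag of variable $j$, that is, its weighted sum over the other regions.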
Expanding this expression yields (8), where $p$ is the time lag order, $q$ is the spatial lag order, the coefficient matrix carrying both orders represents the extent of the influence of variable $j$ on variable $k$ at time lag order $p$ and spatial lag order $q$, and the coefficient matrix carrying only the time lag order denotes the degree of influence of variable $j$ on variable $k$ at time lag $p$.
The Full Information Maximum Likelihood (FIML) method is adopted to estimate the unknown parameters. This method treats the system as a whole and can, to a certain extent, deal with all the parameters and variances at the same time. If the likelihood function can be established accurately, then FIML can estimate all structural parameters by maximizing the likelihood function of the simultaneous equations of the model based on the sample observations. The FIML estimator can be shown to be asymptotically normal and unbiased if sufficient restrictions are imposed on the parameters.
The coefficients of the model do not have a definite traffic meaning. Therefore, the impulse response is needed for deeper analysis. Usually, the estimated SpVAR model is used to analyze the transmission effect of variables across regions by calculating the impulse response. The impulse response refers to the effect on each variable when the random disturbance of one variable receives a shock of one standard deviation while the other disturbances remain unchanged. The impulse response is used to measure the current value of passenger flow volume in a certain area. It is also used to predict the passenger volume in other regions when a unit passenger flow impulse is generated in a certain area, allowing us to analyze the degree of interaction between regional transit modes and the dynamic equilibrium characteristics. In the SpVAR model of this study, the impulse response mainly refers to the impact of a shock to the passenger flow of one transit mode in a certain area on (1) the same mode in the region where the shock occurs; (2) other transit modes in that region; (3) the same mode in other regions; and (4) other modes in other regions.
The impulse response in the SpVAR model includes the general dynamic effect found in the conventional VAR model and the spillover effect of variables caused by the spatial characteristics of travel. In this case, the spillover effect is caused by the spatial lag structure and the spatial autoregressive structure in the model. A positive spillover effect indicates that the response is stimulating, whereas a negative spillover effect indicates that the response is inhibitory.
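To make the combination of dynamic and spillover effects concrete, the following minimal sketch computes impulse responses for a deliberately simplified model: one variable observed in several regions, one time lag, and the form y_t = rho·W·y_t + phi·y_{t-1} + psi·W·y_{t-1} + e_t. The function names, the parameter values, and the single-variable simplification are assumptions for illustration; this is not the estimation procedure or the full multivariable model of this study.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    eye = identity(n)
    aug = [list(map(float, row)) + eye[i] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def impulse_responses(W, rho, phi, psi, shock_region, horizons):
    """Responses of all regions to a unit shock in one region, for horizons 0..horizons.

    Reduced form: y_t = A y_{t-1} + B e_t with B = (I - rho W)^-1 and
    A = B (phi I + psi W), so the response at horizon h is A^h B e_j.
    """
    n = len(W)
    eye = identity(n)
    B = inverse([[eye[i][j] - rho * W[i][j] for j in range(n)] for i in range(n)])
    A = matmul(B, [[phi * eye[i][j] + psi * W[i][j] for j in range(n)] for i in range(n)])
    response = [[B[i][shock_region]] for i in range(n)]  # horizon 0, as a column vector
    out = []
    for _ in range(horizons + 1):
        out.append([row[0] for row in response])
        response = matmul(A, response)
    return out


# Hypothetical row-normalized weights for three regions and made-up coefficients.
W = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
irf = impulse_responses(W, rho=0.2, phi=0.5, psi=0.1, shock_region=0, horizons=4)
```

With rho and psi set to zero the shock stays in its own region and decays as phi^h, which is the conventional VAR dynamic effect; nonzero spatial coefficients spread the response to the other regions, which is the spillover effect.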
3.2. Spatial Weight
In the proposed model, each mode is assumed to be related to the other modes in each studied traffic zone. Ruan et al. pointed out that the urban road traffic system is a time-evolving, directed weighted network [16, 17]. The autoregressive coefficients in the SpVAR model are assumed to vary across locations, thereby allowing spatially heterogeneous model dynamics. A spatial weight is added to the SpVAR model to enhance the SVAR model. The regional comprehensive spillover effect value can be decomposed into each region by the spatial weight matrix $W$. In this study, the spatial weight $w_{ij}$ denotes the extent of the influence of a transit mode in region $i$ on a transit mode in region $j$. We set the spatial weights as fixed over time. In the traditional spatial VAR model, the weight matrix is set as expression (9), where $d_{ij}$ denotes the distance between regions $i$ and $j$. Researchers tend to express $d_{ij}$ by calculating the distance between two geographical center points. $S_i$ and $S_j$ represent the spatial scale effects of regions $i$ and $j$, respectively. The spatial scale effect refers to the specific effect value of each area and is related to specific spatial, architectural, economic, and other factors in the region. In this regard, we improve the regional scale effect function by incorporating the characteristics of passenger flow in the model.
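One common way to combine distances and regional scale effects into a weight matrix is a gravity-type form, sketched below. The functional form (product of scale effects divided by squared distance, then row normalization) is an assumption chosen for illustration and may differ from expression (9). The ZGC to HLG and GM to HLG distances are the straight-line values given in Section 2; the ZGC to GM distance and the scale values are placeholders.

```python
def gravity_weights(scale, dist):
    """Gravity-type weights w_ij = S_i * S_j / d_ij**2 for i != j, row-normalized.

    scale: list of regional scale effects S_i (hypothetical values here).
    dist:  symmetric matrix of distances d_ij in km; the diagonal is ignored.
    """
    n = len(scale)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = scale[i] * scale[j] / dist[i][j] ** 2
        total = sum(w[i])
        if total > 0:
            w[i] = [v / total for v in w[i]]  # each row sums to 1
    return w


regions = ["ZGC", "GM", "HLG"]
dist = [
    [0.0, 15.0, 16.0],   # ZGC-GM is a placeholder; ZGC-HLG is 16 km
    [15.0, 0.0, 21.0],   # GM-HLG is 21 km
    [16.0, 21.0, 0.0],
]
scale = [0.8, 0.9, 0.6]  # made-up scale effects
W = gravity_weights(scale, dist)
```

Row normalization keeps the spatial lag of a variable on the same scale as the variable itself, which is a frequent but not mandatory convention.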
The performance of the transport system is related to the general distribution of land. Land use greatly affects the degree of public traffic collection and distribution and is the most important factor that determines the attraction of passenger flow [18, 19]. Therefore, we aim to comprehensively characterize the volume of regional public traffic collection and distribution by a function in which the number of POIs in each region is the independent variable. We then fit the number of POIs to the actual bus and metro passenger flow with a logistic regression function, where $x$ is a vector of six variables that represents the number of POIs in each region, normalized to 0-1. The POI data used in this paper mainly include residential, hotel, entertainment, service, business, and tourist facilities. $\beta$ represents the weights of the POIs. The function value $S$ does not change linearly with increasing $x$ but changes smoothly. When the value of $x$ is relatively large or small, $S$ changes slowly, which matches the actual situation in this research.
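The saturating behavior described above can be seen in a minimal sketch of a logistic function of weighted POI counts. The standard form 1 / (1 + exp(-z)), the optional intercept, and the weights used below are assumptions for illustration; the fitted weights of this study are not reproduced here.

```python
import math


def scale_effect(poi, beta, intercept=0.0):
    """Logistic function of a weighted sum of normalized POI counts.

    poi:  normalized POI counts in [0, 1], one entry per POI category.
    beta: hypothetical weights, one per POI category.
    """
    z = intercept + sum(b * x for b, x in zip(beta, poi))
    return 1.0 / (1.0 + math.exp(-z))


# Six made-up normalized POI counts (residential, hotel, entertainment,
# service, business, tourist) and made-up weights.
poi = [0.9, 0.1, 0.2, 0.4, 0.3, 0.1]
beta = [1.5, 0.3, 0.4, 0.8, 1.2, 0.2]
s = scale_effect(poi, beta)
```

The output always lies strictly between 0 and 1 and flattens when the weighted sum is very large or very small, which is the slow change at the extremes noted in the text.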
4. Empirical Analysis
4.1. Stationarity Testing
The first step in applying SpVAR techniques is to conduct a unit root test to examine the stationarity of each variable. If the variables are not stationary, the series needs to be differenced, which may lead to the loss of original information. As a basic criterion for constructing the dynamic regressive model, the cointegration test is then adopted to examine whether linear combinations of the nonstationary series have a long-run equilibrium relationship.
To implement the unit root test, the Augmented Dickey–Fuller (ADF) test is applied, with “the variable has a unit root” as the null hypothesis. After the ADF test, the Engle–Granger test is used to examine the cointegration properties of the variables. The results suggest that the residual term of the time series is stationary, as the t-statistic is significant at the 1% level. Hence, the time series of the variables conform to the 0-order cointegration relationship, which satisfies the basic requirement for implementing SpVAR techniques.
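The idea behind the unit root test can be sketched with the simplest Dickey-Fuller regression, dy_t = alpha + gamma·y_{t-1} + e_t, whose t-statistic on gamma is compared with a nonstandard critical value (roughly -3.43 at the 1% level for the constant-only case in large samples). This sketch omits the lagged differences that make the test "augmented", uses simulated series rather than the passenger flow data, and is not the procedure or software used in this study.

```python
import math
import random


def dickey_fuller_t(y):
    """t-statistic of gamma in dy_t = alpha + gamma * y_{t-1} + e_t (no lagged differences)."""
    x = y[:-1]
    dy = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(dy)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((v - mean_x) * (d - mean_dy) for v, d in zip(x, dy))
    gamma = sxy / sxx
    alpha = mean_dy - gamma * mean_x
    rss = sum((d - alpha - gamma * v) ** 2 for v, d in zip(x, dy))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)
    return gamma / se_gamma


# Simulated examples: white noise is stationary, its cumulative sum is a random walk.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]
walk = []
level = 0.0
for e in noise:
    level += e
    walk.append(level)

t_noise = dickey_fuller_t(noise)   # strongly negative: unit root rejected
t_walk = dickey_fuller_t(walk)     # random walk: typically not rejected
differenced = [b - a for a, b in zip(walk[:-1], walk[1:])]
t_diff = dickey_fuller_t(differenced)  # differencing restores stationarity
```

The last two lines show the remedy mentioned in the text: first-differencing a nonstationary series yields a stationary one, at the cost of discarding the level information.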
4.2. Regression
We develop a SpVAR model for analyzing the metro boarding and alighting volume as well as bus alighting volume in three selected traffic zones.
Several types of POIs are used to obtain the spatial weight parameters, as POIs are influential factors for traffic volume. The number of POIs of each type, specifically residential, hotel, entertainment, service, business, and tourist facilities, is aggregated at the TAZ level.
4.3. Impulse Response Analysis
This section analyzes the impulse response of the SpVAR model for the transit volume in ZGC, GM, and HLG. The impulse response function illustrates both the direct and indirect responses of passenger volume to a shock on the involved transit modes in the studied regions. Specifically, a positive response implies that the shock would result in a ridership increase for the affected transit mode, and the value of the response equals the number of additional passengers. Such a ridership increase may be due to a significant positive correlation of travel patterns between the two involved regions, in which an increase in one region readily leads to a corresponding rise in the other. Under such circumstances, the conclusions are an important reference for transit operators to take effective measures in advance to prevent unexpected events, for example, a shortage of transit supply, and to maintain transit system efficiency. On the contrary, a negative impulse response indicates that the ridership in the affected region would be reduced owing to a shock in the connected region. This decrease may result from a plunge in transport system efficiency, a mode shift, or rerouting behavior caused by a sudden rise of traffic congestion in the study area. In this case, managers can dynamically adjust the transit supply or allocate more resources to bottlenecks to alleviate the problems.
4.3.1. Empirical Results of Impulse Response for Intraregional Transit Modes in Morning Rush Hour
The impact of impulse response on the intraregional transit system is analyzed in this section. In Figure 3, x-axis is the time stamp and y-axis is the impulse response.

Figure 3: panels (a) to (i).
Figure 3 shows the response of the transit modes, including metro boarding, metro alighting, and bus alighting volumes, when the metro boarding volume receives a positive shock at 7:00 in the involved regions. When the metro boarding volume generates a large passenger flow in the morning rush hour, the influence on the intraregional metro alighting volume and bus alighting volume is not significant and does not cause major fluctuations. For the response of the metro boarding volume, the impulse responses in ZGC and GM are similar, whereas that in HLG is slightly different. This finding may be due to the fact that the first two regions are both business districts, whereas HLG is a residential area. The response of the metro boarding volume in HLG fluctuates in the morning and has a small peak at 13:00, which may be because a great number of commuters in residential areas take the metro to work in the morning and some commuters return home to rest and then go back to work in the afternoon. In the business districts, a shock to the metro boarding volume in the morning rush hour does not influence any other mode and gradually disappears.
4.3.2. Empirical Results of Impulse Response for Intraregional Transit Modes in Evening Rush Hour
Figure 4 shows the response of the transit system within the regions when the metro boarding volume receives a positive unit shock at 17:00. The response in the evening rush hour is similar to that in the morning rush hour. Compared with the metro boarding volume, the response of the metro alighting volume in HLG is relatively larger. The impulse response is positive during the first half hour. Figure 5 presents the interregional impulse responses of MA (metro alighting), MB (metro boarding), and BA (bus alighting) to MB in the morning rush hour, suggesting spatial spillover effects of the metro boarding volume in each region.

Figure 4: panels (a) to (i).

Figure 5: panels (a) to (f).
4.3.3. Empirical Results of Impulse Response for Interregional Transit Modes in Morning Rush Hour
Figure 5 shows the SpVAR impulse responses of the transit system in ZGC and GM to the metro boarding flow in HLG at 7:00.
Figures 5(a) and 5(d) suggest that a positive shock to the metro boarding volume in HLG has a positive impact on the metro boarding volume in the two regions, and the responses reach their highest points, 2.12 and 1.44, at 8:00 and 8:15, respectively.
For the response of MA to MB, Figures 5(b) and 5(e) show that a positive shock to the HLG metro boarding volume in the morning rush hour has a positive impact on the metro alighting volume in both regions. This finding may be due to the fact that ZGC and GM are job areas, which attract a large number of commuters from residential zones such as HLG in the morning rush hour. The influence on the metro alighting volume in ZGC and GM reaches its maximum at 7:45 and 8:45, respectively. Notably, the travel times from HLG to ZGC and GM are around 45 min and 105 min, respectively, when the trip starts at 7:00 AM. These travel times match the timing of the peak responses in ZGC and GM (7:45 vs. 8:45) in the morning peak. This observation is convincing evidence of the spatial spillover effect among the three study zones. As such, transit operators could predict the traffic conditions in ZGC and GM in advance by referring to the impulse response to HLG and take effective measures to better cope with unexpected scenarios.
Figures 5(c) and 5(f) show that a positive shock to the HLG metro boarding volume has a small negative impact on the bus alighting volume in ZGC and little impact on that in GM. Given that ZGC and HLG are not far apart, bus and metro compete with each other, and some bus users may shift to the metro, which results in a decrease in bus ridership.
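The timing argument above is simple clock arithmetic and can be checked with a short sketch. The calendar date is an arbitrary placeholder; only the 7:00 departure and the 45 min and 105 min travel times come from the text.

```python
from datetime import datetime, timedelta


def arrival_time(departure: datetime, travel_minutes: int) -> datetime:
    """Expected arrival time for a trip that starts at `departure`."""
    return departure + timedelta(minutes=travel_minutes)


departure = datetime(2020, 1, 6, 7, 0)        # placeholder date, 7:00 departure from HLG
arrival_zgc = arrival_time(departure, 45)     # HLG to ZGC
arrival_gm = arrival_time(departure, 105)     # HLG to GM
peak_gap = arrival_gm - arrival_zgc           # difference between the two arrival times
```

The two arrival times coincide with the reported response peaks in ZGC and GM, and the 60-minute gap between them equals the difference in travel times.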
Figure 5 shows that the metro boarding flow in region HLG has a strong spatial spillover effect on other modes in ZGC and GM.
4.3.4. Empirical Results of Impulse Response for Interregional Transit Modes in Evening Rush Hour
Figures 6 and 7 present the interregional impulse responses of MA, MB, and BA to MB in the evening rush hour. Generally, the spatial spillover effect in the evening rush hour is not as significant as that observed in the morning.

Figure 6: panels (a) to (c).

Figure 7: panels (a) to (c).
Figures 6 and 7 depict the response of the metro boarding, metro alighting, and bus alighting volumes in HLG to a change in the metro boarding volume in GM and ZGC. In terms of the response of MA to MB, the impulse responses are negative at the beginning and then turn positive, reaching their highest points at 18:15 and 18:45 in ZGC and GM, respectively. This observation is similar to the case of the morning peak. Specifically, after a short increase in the metro boarding volume in the first hour, the bus alighting volume increases in GM and decreases in ZGC, and this effect lasts for a long time. This finding is also consistent with the character of the study zones, since commuters tend to return home in the evening rather than start trips from home.
5. Conclusions
This study proposes an enhanced SpVAR model to explore the time-varying relationships and capture the spatial spillover effects between bus passenger volume and metro passenger volume among different regions. The proposed model takes spatial weights and the impulse response into account, so it can be used to study the propagation mechanism of interregional passenger flow under impacts such as emergencies. Three representative zones in Beijing, Huilongguan (HLG), Zhongguancun (ZGC), and Guomao (GM), are used to study and capture the spatial spillover effect and its mechanism in the multimodal transit system in Beijing. The statistical results confirm the existence of the spatial spillover effect in these Beijing commuter regions, in which changes in the transit system in one region have significant impacts on other modes in the other two regions. These findings can be applied to predict unexpected events and to improve the decision-making process of traffic management departments in terms of warning, guiding, or limiting the passenger flow in case of a sudden large passenger flow burst. For example, public transit agencies would be advised to increase the dispatching frequency when the model shows a positive impulse response in bus boarding ridership. The metro company would be warned to take action to better manage large metro ridership in case of a positive impulse response in the metro system. The main contributions of the paper are as follows:
(1) The enhanced spatial VAR (SpVAR) model is proposed and, for the first time, applied to analyzing the spatial interactions among multimodal transit systems. Specifically, a spatial weight matrix is developed and embedded into the proposed model, and the model provides the impulse response that is used to analyze the impact of a change of passenger volume in one mode and one region on multiple transit modes in other, separate regions.
(2) This paper empirically analyzes the multimodal spatial spillover effects within three separate traffic districts during the morning and evening peaks as well as the off-peak periods, to explore the interactions in the multimodal transit system.
Although the proposed model is capable of analyzing the time-varying relationships within a multimodal system in different regions, some critical issues still need to be investigated: (1) the accuracy of the proposed model depends heavily on the size of the test data, so improving the estimation techniques for small datasets is one of the next research directions; (2) the proposed model does not include the features and elements of the physical links between regions. Since spatial weights have been shown to be significantly correlated with the physical links between regions, adding physical links or travel impedance to the model and verifying its effectiveness will be another main topic of future research.
The future research directions may include (a) the extension of the SpVAR model to leverage dynamic panel data models and capture prior knowledge of some parameters; (b) modeling the nonlinear relationships and interactions of the transit modes and TAZs; and (c) enhancing the SpVAR considering the error correction, time-varying patterns, and the factor augmentation.
Data Availability
All types of data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This paper was supported by National Key R&D Program of China (2016YFE0125000) and the National Natural Science Foundation of China (61903058 and 61773036).