Abstract

As a crucial part of the urban system, road networks play a key role in the evolution of the urban structure. Therefore, studying the structural characteristics of urban road networks is pivotal for improving the efficiency of traffic network nodes and for relieving traffic pressure. This paper applies an urban road network analysis method to measure the centrality of the multiscale road network in Shenzhen, China. Taxi GPS data from October 17 to October 23, 2017, were selected for analysis of spatial distribution characteristics. This paper also established a regression model of taxi pick-up and drop-off frequency and road network centrality for further analysis. Several observations were made. First, as the search radius increases, the closeness centrality indicator shifts from a multicentered distribution to a single-centered distribution, while the betweenness centrality indicator shifts from a patchy distribution to a distribution along the main roads. In addition, the straightness centrality indicator turns from a dispersed distribution to a point-axis distribution, concentrated in the southern part of the city. Second, the relationship between the centrality of the road network and the location of taxi pick-up and drop-off points varies with scale. The regression model attains its highest R² when the search radius is 3 km. Finally, betweenness centrality shows a clear positive correlation with taxi pick-up and drop-off frequency, whereas closeness centrality is not correlated with it. Straightness centrality has a negative correlation with the frequency of taxi pick-up and drop-off at the 3 km and 8 km scales.

1. Introduction

As urban road networks are crucial for any city, evaluating the road network structure may aid in improving the efficiency of urban traffic. In the field of urban traffic, it is a common practice to use indicators of road network centrality to locate important traffic hubs or road sections.

Such research focuses on urban structure from a network perspective and thus studies the relationship between roads and socioeconomic activities. For example, Crucitti et al. studied the network structure and distribution characteristics of 18 cities with the help of road network centrality [1]. Using the same method, Wang et al. examined the structure and centrality of the Chinese air network [2]. Moreover, Porta et al. explored the relationship between road network centrality and socioeconomic activities such as retail and service density from a locational perspective, and also examined the relationship between Barcelona’s road network centrality and diverse types of economic activities [3]. This research work claims that urban traffic should be studied as neighborhood centers rather than as boundaries in urban planning [4].

The aforementioned research uses road network centrality indicators to investigate the link between accessibility and structural characteristics of the transportation network [5, 6]. For example, Wang et al. examined the correlation between land use and indicators of transportation network centrality in Baton Rouge [7]. In addition, Zhou et al. established prospective multimodal transportation hubs in the Belt and Road Initiative network. In their paper, they constructed a multimethodological model based on network centrality and a gravity model [8]. Lastly, using the network centrality metrics, Noori et al. classified the urban street functions and assessed the structural importance of streets [9].

Measures of network centrality are useful predictors of a number of interesting urban phenomena. There are numerous studies on the relationship between road network centrality and observed traffic flow. Jiang and Liu found that vehicle flow correlated better with the morphological property of streets than with axial lines [10]. Akbarzadeh et al. found that traffic volume correlates with node centrality. Nodes with high weighted degree and betweenness should be given higher priority to enhance connectivity and resilience in urban street systems [11]. Lai et al. applied taxi big data to construct a travel flow network to investigate the spatial interaction between different urban functional areas in Shenzhen, China [12]. In addition, there are studies that use network centrality to predict traffic flow. Yang et al. use urban network analysis (UNA) to objectively and accurately study the street vibrancy of a community and to predict the walking behavior of residents across age groups [13]. The proposed method combines quantitative and visualization analyses and predicts residents’ walking activities within a community based on the community living density and the distribution of roads and public space. Gao et al. predicted urban traffic flow based on the road network centrality indicator model for Qingdao, China [14]. Min et al. use topological distance in the road network to model the spatial relationship of traffic flow for traffic forecasting [15]. However, such distance-based models ignore the spatial heterogeneity of road traffic interactions. As the traffic flow on the road network is very heterogeneous in space, the spread of traffic between roads must be anisotropic [16, 17]. Therefore, it is necessary to measure traffic interactions between roads by considering their heterogeneity.

In summary, many studies have examined the relationship between road network centrality and social activities, facility allocation, and travel behaviors. In addition, there are studies that use network centrality analysis to predict traffic flow. However, the changing patterns of road network centrality and the relationship between road network centrality and residents’ activities at multiple scales have not yet been studied in detail. Measuring the centrality of a road network at multiple scales, that is, under different search radii, helps to understand the hierarchical structure of global and local road networks, as well as their spatial heterogeneity. Thus, this paper applies multiscale urban road network centrality indicators (closeness centrality, betweenness centrality, and straightness centrality) to investigate the changing spatial distribution pattern of road network centrality, taking into account street heterogeneity. The paper also explores the relationships between network structures at different scales and residential trips with regard to the street environment. This is analyzed in conjunction with taxi pick-up and drop-off patterns. The results of this study may aid the configuration and optimization of the road network structure, as well as urban traffic management and planning.

2. Study Area and Dataset

2.1. Study Area

Shenzhen is situated in the southern part of Guangdong Province, immediately north of the Hong Kong Special Administrative Region (HKSAR), and forms the Pearl River Delta (PRD) with Guangzhou, Hong Kong, Macau, and several other cities. Due to its geographical advantages and political environment, Shenzhen has rapidly developed from a small fishing village to a regional central city over the past 40 years. The city has a total area of about 1991 km² and had a population of about 12,528,300 at the end of 2017. Shenzhen has ten districts, namely, Luohu District, Futian District, Nanshan District, Yantian District, Baoan District, Longhua District, Longgang District, Pingshan District, Guangming District, and Dapeng District (Figure 1). It is considered a typical immigrant city, as the population size continues to grow. As a result, residents’ travel activities put intense pressure on city traffic.

2.2. Dataset
2.2.1. Data Sources

This paper uses basic city information data and taxi GPS data collected in the week of October 17 to October 23, 2017. The basic city information data include Shenzhen road network data, subway and bus stop data (2017), building census data (2016), and POI data (2018). More specifically, the road network data were obtained from the government website, while the building census data come from the Shenzhen Information Center. The POI data comprise 583,620 points of interest obtained from the Baidu search engine.

Taxi GPS data must be preprocessed. Processing steps include data format conversion, anomaly cleaning, and origin-destination (OD) point extraction. First, the data in CSV format were converted into shapefile format (spatial point data) and the necessary attribute fields were saved. Second, anomalous points (outside the spatial range, outside the temporal range, or with missing attributes) were deleted. Finally, the track point data were sorted by taxi ID number and time. After that, the points at which the operation status changes from “empty” to “heavy” were marked as boarding points and labeled as “1.” Conversely, the points at which the operation status changes from “heavy” to “empty” were marked as drop-off points and labeled as “0.” Taxis that had no change in their operating status or had a travel time of less than 1 min were deleted. After processing, 3,503,396 trip records for 17,342 taxis in Shenzhen, China, were obtained. Among them were 2,421,156 pick-up and drop-off points on work days and 1,082,240 pick-up and drop-off points on weekends. These data include taxi ID (Car ID), longitude and latitude coordinates of the pick-up point (O-lon, O-lat), pick-up time (O-time), longitude and latitude coordinates of the drop-off point (D-lon, D-lat), and drop-off time (D-time) (Table 1).
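The OD extraction step described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the record layout, the status codes (0 for “empty,” 1 for “heavy”), and the function name `extract_trips` are assumptions, and the 1 min rule is interpreted here as dropping trips shorter than one minute.

```python
from datetime import datetime, timedelta

EMPTY, HEAVY = 0, 1  # assumed status codes: 0 = "empty", 1 = "heavy" (occupied)


def extract_trips(records, min_duration=timedelta(minutes=1)):
    """Pair each empty->heavy transition with the next heavy->empty one.

    records: iterable of (car_id, timestamp, lon, lat, status) tuples
    (a hypothetical layout). Returns dicts keyed by the Table 1 field names.
    """
    trips = []
    prev = None    # previous track point in sorted order
    pickup = None  # pending boarding point, if a trip is in progress
    for rec in sorted(records, key=lambda r: (r[0], r[1])):
        car_id, ts, lon, lat, status = rec
        if prev is None or prev[0] != car_id:
            pickup = None  # first point of a taxi: no transition observed yet
        elif prev[4] == EMPTY and status == HEAVY:
            pickup = rec   # status changed from "empty" to "heavy": boarding
        elif prev[4] == HEAVY and status == EMPTY and pickup is not None:
            # status changed from "heavy" to "empty": drop-off
            if ts - pickup[1] >= min_duration:
                trips.append({
                    "Car ID": car_id,
                    "O-lon": pickup[2], "O-lat": pickup[3], "O-time": pickup[1],
                    "D-lon": lon, "D-lat": lat, "D-time": ts,
                })
            pickup = None
        prev = rec
    return trips
```

Trips whose boarding point was not observed (a taxi whose first record is already “heavy”) are skipped in this sketch, which is a design choice rather than something stated in the text.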

2.2.2. Spatial Units

At the micro level, human activities, landscape features, and urban structure can be observed at the street level. This paper adopts street segments as spatial units of urban network measurement. Most commonly, the extent of a transit-oriented development (TOD) is defined according to how far people are willing to walk to take public transportation. In Asian cities, 500 m is considered an acceptable walking distance (Sung et al. [18]). Thus, this paper takes the allocation space of each segment as the area within 500 m Euclidean distance of that segment. Furthermore, Thiessen polygons are applied as statistical units to calculate environmental indicators. ArcGIS is used to extract the road center lines of the Shenzhen road network and to establish a network dataset. There are a total of 35,590 street segments and 24,833 intersections in the network dataset (Figure 1). Finally, the paper uses a map matching algorithm to calculate the distance between points (pick-up and drop-off) and street segments in the network. The closest street segment is selected as a match for each point.
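The nearest-segment matching described above can be illustrated as follows. This is a minimal sketch under stated assumptions: coordinates are taken to be in a projected planar system (meters), each street segment is simplified to a single straight line between two endpoints, and the function names are hypothetical. It is a brute-force search; a spatial index would be needed at the scale of the actual dataset.

```python
import math


def point_segment_distance(p, a, b):
    """Planar distance from point p to the straight segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Position of the projection of p on the line, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def match_to_nearest_segment(point, segments):
    """segments: dict mapping segment id -> (endpoint_a, endpoint_b).

    Returns (segment_id, distance) for the closest segment.
    """
    return min(
        ((sid, point_segment_distance(point, a, b))
         for sid, (a, b) in segments.items()),
        key=lambda pair: pair[1],
    )
```

Counting the matched points per segment then gives the pick-up and drop-off frequency of each street.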

3. Methodology

3.1. Indicators

Table 2 shows the framework of indicators. In terms of the road network centrality and street surrounding environment attributes, a total of 9 indicators were selected to measure the spatial environment of the streets. Furthermore, residents’ travel behavior is measured by the frequency of taxi pick-ups and drop-offs on the street.

Three indicators of road network centrality (closeness centrality, betweenness centrality, and straightness centrality) assess street network accessibility, transit function, and convenience of direct access [6, 19, 20]. As for the indicators of the surrounding environment, they include distance from the metro station, distance from the bus station, number of metro and bus stations nearby, frequency of pick-up and drop-off in surrounding streets, POI density, and development intensity.

Closeness centrality (Cc) represents the inverse of the sum of the shortest path distances from a node to all other nodes. In other words, it is the relative accessibility of the node in the network. Within an urban transportation network, the closer a node is to the network center, the higher its closeness centrality. This can be defined in the following formula:

$$C_c(i) = \frac{1}{\sum_{j \neq i} d_{ij}}$$

In the formula above, $d_{ij}$ represents the shortest distance from node i to node j.
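As an illustration of the definition above (the inverse of the sum of shortest path distances), the following sketch computes closeness for one node with an optional search radius. It is not the UNA implementation; the adjacency-dictionary graph format and the function names are assumptions, and restricting the sum to nodes within the radius is one plausible reading of how the local scales are handled.

```python
import heapq


def shortest_distances(graph, source, radius=float("inf")):
    """Dijkstra search limited to nodes within `radius` of `source`.

    graph: dict mapping node -> {neighbor: edge_length}, lengths > 0.
    Returns {node: network distance} for every node reached.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, length in graph[u].items():
            nd = d + length
            if nd <= radius and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def closeness(graph, node, radius=float("inf")):
    """Inverse of the summed network distance to all reachable nodes."""
    dist = shortest_distances(graph, node, radius)
    total = sum(d for other, d in dist.items() if other != node)
    return 1.0 / total if total > 0 else 0.0
```

For a path A-B-C with edge lengths 1 and 2, node A has a global closeness of 1/(1+3) = 0.25, while a radius of 1 leaves only B in range and gives 1.0.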

Betweenness centrality (Cb) represents how many of the shortest paths between all pairs of nodes pass through a given node, that is, it represents the transit and connectivity function of nodes within the network. Within the urban transportation network, the betweenness centrality increases as the traffic carrying capacity of the streets increases. Cb can be defined as follows:

$$C_b(k) = \sum_{i \neq j \neq k} \frac{\delta_{ij}(k)}{\delta_{ij}}$$

where $\delta_{ij}$ stands for the total number of shortest paths from node i to node j, and $\delta_{ij}(k)$ represents the number of these shortest paths that pass through node k.
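A compact way to compute this quantity on a weighted network is Brandes' accumulation scheme, sketched below for the global (unrestricted) case only. The graph format and function name are assumptions, edge lengths are assumed strictly positive, node labels must be mutually comparable (for heap tie-breaking), and ties between path lengths are detected by exact equality, which is adequate for an illustration but fragile with real floating-point lengths.

```python
import heapq


def betweenness(graph):
    """Shortest-path betweenness for an undirected weighted graph.

    graph: dict mapping node -> {neighbor: edge_length}, lengths > 0.
    Each unordered pair of endpoints is counted once.
    """
    cb = {v: 0.0 for v in graph}
    for s in graph:
        dist = {s: 0.0}
        sigma = {v: 0.0 for v in graph}  # number of shortest paths from s
        sigma[s] = 1.0
        preds = {v: [] for v in graph}   # predecessors on shortest paths
        order, done = [], set()
        heap = [(0.0, s)]
        while heap:
            d, v = heapq.heappop(heap)
            if v in done:
                continue  # stale heap entry
            done.add(v)
            order.append(v)
            for w, length in graph[v].items():
                nd = d + length
                if w not in dist or nd < dist[w]:
                    dist[w] = nd
                    sigma[w] = sigma[v]
                    preds[w] = [v]
                    heapq.heappush(heap, (nd, w))
                elif nd == dist[w]:
                    sigma[w] += sigma[v]  # another equally short path
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                cb[w] += delta[w]
    # Every pair was visited from both endpoints, so halve the totals.
    return {v: c / 2.0 for v, c in cb.items()}
```

On a path A-B-C only B lies between another pair, so it scores 1; on a four-node ring with equal edge lengths, each pair of opposite nodes has two shortest paths, and every node scores 0.5.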

Lastly, straightness centrality (Cs) measures how far the shortest path between two nodes deviates from the straight line connecting them. Simply put, the less the paths deviate, the better the Cs will be. If one can travel from one node to any other in the network along a straight shortest path, then that node is considered to have the best straightness and the highest traffic efficiency. This indicator evaluates the structure of the urban transportation network and is defined as follows:

$$C_s(i) = \sum_{j \neq i} \frac{d_{ij}^{\mathrm{Eucl}}}{d_{ij}}$$

In the formula above, $d_{ij}^{\mathrm{Eucl}}$ represents the Euclidean distance between node i and node j, and $d_{ij}$ the shortest network distance between them.
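The ratio of straight-line distance to network distance can be sketched as follows. This is an illustrative implementation, not the UNA tool: the graph and coordinate formats and the function names are assumptions, the result is left as an unnormalized sum (some formulations divide by the number of reachable nodes), and coordinates are assumed to be planar.

```python
import heapq
import math


def network_distances(graph, source, radius=float("inf")):
    """Dijkstra distances from `source`, limited to the search radius."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, length in graph[u].items():
            nd = d + length
            if nd <= radius and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def straightness(graph, coords, node, radius=float("inf")):
    """Sum of Euclidean/network distance ratios to all reachable nodes.

    coords: dict mapping node -> (x, y) in a planar coordinate system.
    """
    x0, y0 = coords[node]
    total = 0.0
    for other, d in network_distances(graph, node, radius).items():
        if other == node:
            continue
        x1, y1 = coords[other]
        total += math.hypot(x1 - x0, y1 - y0) / d
    return total
```

Each term lies between 0 and 1 and equals 1 only when the network path is perfectly straight, so a node on an L-shaped route scores lower than one on a straight avenue.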

As the traffic flow on the road network is very heterogeneous in space, the spread of traffic between roads must be anisotropic. Therefore, considering the heterogeneity of the streets is necessary to model the traffic flow, which in this study is characterized by indicators of the street surrounding environment. The surrounding environment indicators include the remaining six items, whose distribution characteristics are illustrated by the maps of Shenzhen in Figures 2–7. For example, there are a significant number of taxi trips around metro stations (Figure 2). Thus, the indicator distance from the metro station has an important impact on taxi trips. Second, the bus stations are evenly distributed (Figure 3). Figure 3 shows that the average distance of streets to bus stations is 166 m, with each street having an average of 6.82 metro or bus stations within 100 m. Furthermore, more than 50% of streets are less than 100 m away from the closest bus station, while 76% of streets are less than 200 m away from the closest bus station. Only 1,659 streets (4.66%) are more than 500 m from the closest bus or metro station. Third, there are more bus stops around secondary and branch roads in the city than on the main roads. This is represented in Figure 4, which illustrates the indicator number of metro and bus stations nearby.

The traffic states of urban roads are often influenced by their neighboring roads [21]. Different terms, such as spatial dependence, traffic relationship, and spatial correlation, are used in the literature to express such a relationship between neighboring roads. The indicator named frequency of pick-ups and drop-offs in the surrounding streets is the sum of the frequencies of taxi pick-ups and drop-offs on the streets surrounding the given street (Figure 5). This figure shows that behavior in surrounding streets affects traveling behavior on the street in question, exhibiting spatial autocorrelation. The indicator POI density reflects the vitality of the street. It is defined as the ratio of the number of POIs in the statistical unit to the length of the street (Figure 6). Lastly, the indicator development intensity represents the ratio of the total building area in the statistical unit to the length of the road. This indicator has been observed to have a significant impact on taxi trips. Namely, rapidly developing areas have more taxi trips due to the higher travel demand generated by high population density (Figure 7).

3.2. Model
3.2.1. Multiple Centrality Analysis

The analysis in this paper is based on the multiple centrality analysis (MCA) model. MCA is a spatial analysis of centralities in urban networks, in which streets are the links or “edges” and intersections are the “nodes.” It is conducted with the urban network analysis (UNA) tool, which is used here to measure the centrality of the road network in Shenzhen. UNA is an urban network analysis tool developed by the Singapore University of Technology, designed in collaboration with MIT. It is based on the ArcGIS software platform and operates on a spatial network, so that all distances are routed along the network and therefore account for network structure. Because UNA builds spatial connections based on the traffic network, its measurements are closer to people’s perception of the real living environment.

In our study, we chose the UNA tool mainly because it integrates the MCA model, which helps us calculate and visualize multiscale network centrality. In addition, it has been used widely in recent years. Some existing studies have applied UNA tools to simulate the best location and scale of retail centers by predicting the passenger flow of planned retail centers in cities [22], or to predict the location and layout of stations around communities by simulating travel behavior [23, 24].

3.2.2. Linear Regression Model

In the study of urban areas, geographic data influence spatial interactions. Linear regression analysis quantifies the intensity of correlation between a dependent variable and multiple independent variables. In this study, a multiscale linear regression model is based on the frequency of taxi pick-up and drop-off, as well as on the nine aforementioned indicators. In addition, SPSS software performs regression analysis to investigate the centrality of multiscale road networks and the statistical characteristics of residents’ travel behaviors.
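For readers who want to see what such a regression computes, the sketch below fits an ordinary least squares model with an intercept and reports R². It is not the SPSS procedure used in the study and does not perform stepwise selection; the function names are hypothetical, and the normal equations are solved with plain Gaussian elimination to keep the example dependency-free.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least squares fit of y on the predictors in `rows`.

    rows: one list of predictor values per street.
    Returns (coefficients with the intercept first, R squared).
    """
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return beta, 1.0 - ss_res / ss_tot
```

In the setting of this paper, each row would hold the nine indicators of one street at one search radius and `y` the pick-up and drop-off frequency, with one model fitted per radius.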

4. Empirical Study

4.1. Descriptive Statistics
4.1.1. Temporal Characteristics of Taxi Travel

Analysis of 24-hour taxi travel time reveals a decreasing trend from 0:00 to 5:00. The lowest frequency is between 5:00 and 6:00. Furthermore, there is a significant increasing trend observed from 6:00 to 9:00. In sum, taxi travels appear to have three typical peaks occurring in the morning (8:00 to 10:00), afternoon (12:00 to 15:00), and evening (19:00 to 22:00). These peaks are illustrated in Figure 8. The average travel time of a single taxi trip was measured to be 10.5 minutes.

Furthermore, our analysis shows that the overall travel frequency on weekends is higher than on work days. During daytime hours on work days, the number of trips increases continuously from 7:00 to 16:00, with a “low peak” occurring at 18:00. The work days’ evening peak appears later than on weekends (Figure 9).

4.1.2. Spatial Distribution Characteristics of Taxi Travel

The frequency of taxi pick-ups and drop-offs reflects residents’ demand for taxis. By analyzing their spatial distribution pattern over a week, this paper found that the activities occurring in street units have significant heterogeneity (Figure 10). The high frequency of activities is concentrated mainly in large transportation terminals, urban centers, major public service facilities, leisure and entertainment sites, and tourist attractions. As Figure 10 illustrates, the taxi pick-up and drop-off hotspots are primarily located in Luohu center, Futian center, and Nanshan center. They also exhibit a clear decreasing distribution from south to north. Taxi hotspot areas include neighborhoods located around major public service facilities (Shenzhen Hospital of Traditional Chinese Medicine, Shenzhen Second People’s Hospital, Shenzhen Library, Concert Hall, and Museum) and important tourist attractions (Window of the World and Dameisha Baths), as well as large residential areas. In addition, the airport, terminals, high-speed rail stations, train stations, bus stations, and other major transportation hubs also exhibit an increased level of taxi activities.

4.2. Road Network Centrality

The spatial heterogeneity of multiscale road network centrality helps to identify how road network centrality responds to scale. It also helps to determine the scale at which centrality best matches the spatial distribution of the dynamic flow of taxis. We use actual trajectory distances instead of the Euclidean distances between origin and destination as the length of the trip because people are sensitive to the relatively expensive price of taxi trips, which correlates with the actual trajectory distance. According to the analysis of the taxi travel distribution in Shenzhen for a week, 50%, 85%, and 95% of the travel distances were within 2844 m, 7945 m, and 14537 m, respectively. This paper therefore selects four scales for the centrality analysis: radii of 3 km, 8 km, and 15 km, which approximate these three thresholds, and the global scale.
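The step from trip-distance percentiles to search radii can be sketched as follows. The nearest-rank percentile definition and the rounding to whole kilometers are assumptions made for illustration; the text does not state which percentile method or rounding rule was used.

```python
import math


def percentile_nearest_rank(values, q):
    """Smallest value such that at least q percent of the data are <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def radius_km(distance_m):
    """Round a distance threshold in meters to the nearest whole kilometer."""
    return round(distance_m / 1000)


# The three thresholds reported in the text map to the chosen radii.
radii = [radius_km(d) for d in (2844, 7945, 14537)]  # [3, 8, 15]
```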

4.2.1. Road Global Centrality

Figure 11 illustrates the distribution of the three global road centrality indicators in Shenzhen. The global closeness centrality is highest around the city center and decreases with distance from the center (Figure 11(a)). Therefore, high value areas are concentrated primarily within the central axis of Shenzhen. The distribution centers of the global closeness centrality roughly match the center of the street network. In other words, these streets contribute significantly more to the accessibility of the traffic network than other streets do.

Areas with high betweenness centrality values are distributed mainly on urban trunk roads (Binhe Avenue, Shennan Avenue, and Beihuan Avenue) and highways (Nanping Highways, Shen-Hai Highways, and Nanguang Highways). This distribution is illustrated in Figure 11(b). In urban planning, betweenness centrality analysis can help to increase the efficiency of urban road networks and infrastructure.

Lastly, the areas that have high values of straightness centrality are primarily located in Luohu District, Futian District, Nanshan District, and Baoan District. The straightness centrality value is significantly higher in the south of Shenzhen than in the north. Furthermore, it is higher in streets that run in the east-west direction than in the north-south direction (Figure 11(c)). Areas with a high value of straightness centrality have high traffic efficiency, thus benefiting residents’ travel.

4.2.2. Road Local Centrality

The centrality value in this study was calculated by examining all nodes in the network within a given radius. In part, the results also reflect the spatial form of the city, which is gradually emerging as a polycentric pattern. Shenzhen is a city experiencing both population growth and economic development, in which the development of construction land is discontinuous due to ecological control lines. The shorter the selected radius, the fewer nodes can be reached from a given node, which makes the structural characteristics of the local road network clearer. Conversely, as the search radius increases, the scope of the local road network also gradually increases, revealing the spatial heterogeneity of the road structure.

This study selected a series of search radii including 3 km, 8 km, and 15 km. They are used to measure the local network centrality of Shenzhen (Figure 12). As the selected search radius increases, areas with high closeness centrality values transform from a scattered multicentered distribution to single-centered distribution. Furthermore, the spatial pattern of betweenness centrality varies across different search radii. In other words, as the selected search radius increases, areas with high betweenness centrality values first represent a decentralized distribution. After some time, they accumulate on either side of the main roads along the east-west direction. Lastly, there are also a number of spatial pattern changes regarding the straightness centrality. Namely, straightness centrality moves away from decentralized distribution towards accumulation within the southern central area of the city. This accumulation is predominantly concentrated in districts (Luohu, Futian, Nanshan, and Baoan) through main city roads (Binhai Avenue, Shennan Avenue, Beihuan Avenue). Here, Beihuan Avenue exhibits the highest straightness centrality values.

4.3. Regression Analysis

Regression analysis is crucial for understanding the relationship between multiscale road network centrality and the spatial distribution characteristics of taxi pick-ups and drop-offs. In this study, the regression analysis takes the taxi pick-up and drop-off frequency during the week (from October 17 to October 23, 2017) as the dependent variable, and the road network centrality and the surrounding environment as the independent variables. According to the analysis of the distribution of taxi travel in Shenzhen for one week, 50%, 85%, and 95% of the travel distances were within 2844 m, 7945 m, and 14537 m, respectively, so the analysis is applied at four search radii (3 km, 8 km, 15 km, and global).

The selection of independent variables in multiple linear regression is very important. In order to establish the best equation, independent variables with a strong influence on the dependent variable should be included in the regression equation as far as possible, while independent variables with a weak influence should be excluded. We analyzed the correlation matrix of the multiscale (3 km, 8 km, 15 km, and global) independent variable indicators in Table 3. Most of the correlation coefficients of the indicators are under 0.6. It is worth noting that the absolute values of the correlation coefficients of two pairs of indicators are between 0.6 and 0.8. The correlation coefficients of Cs (8 km) and DMS and of Cs (15 km) and DMS are −0.641 and −0.707, respectively, so these two pairs are more strongly correlated than the others. We then used stepwise regression for further indicator selection, so that none of the independent variables outside the model is statistically significant, while all variables within the model are statistically significant (Table 4).
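The correlation screening described above amounts to computing pairwise Pearson coefficients and flagging pairs above a threshold. The sketch below illustrates this with hypothetical function names; the 0.6 threshold mirrors the value mentioned in the text, and the column names in any real use would be the indicator abbreviations of Table 3.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_correlated_pairs(columns, threshold=0.6):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    columns: dict mapping indicator name -> list of values per street.
    """
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged
```

Flagged pairs are candidates for closer inspection rather than automatic removal; in the paper the final choice is left to stepwise regression and the VIF check.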

The R² values at the four scales are 0.584 (3 km), 0.579 (8 km), 0.577 (15 km), and 0.573 (global), respectively (Table 4). All the values of the significance test are 0.000. The VIF values of all the variables are below 3, indicating no excessive multicollinearity [25]. A VIF of 5 or above is problematic for a regression model because it might render other significant variables redundant [26]. The results show that the regression models explain the frequency of taxi pick-up and drop-off with a good fit, and that the explanatory variables in the models explain taxi travel behavior well. According to the analysis of taxi trajectory distances, a majority of the taxi trips in Shenzhen are short (half of the trips are shorter than 3 km and 85% of the trips are shorter than 8 km), which is different from Shanghai [27]. The R² of the model varies across scales: from the 3 km search radius to the Shenzhen global scale, R² gradually decreases as the search radius increases. Therefore, it is useful to study the typical travel distance for a specific city as well as to develop a multiscale travel behavior model.
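For reference, the two diagnostics used in this paragraph follow the standard textbook definitions sketched below; the notation ($y_n$, $\hat{y}_n$, $\bar{y}$, $R_j^2$) is introduced here for illustration and does not appear in the original text. $R_j^2$ denotes the coefficient of determination obtained when the j-th independent variable is regressed on all the others.

```latex
R^2 = 1 - \frac{\sum_{n} (y_n - \hat{y}_n)^2}{\sum_{n} (y_n - \bar{y})^2},
\qquad
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

Under these definitions, a VIF below 3 corresponds to $R_j^2 < 2/3$, and the threshold of 5 corresponds to $R_j^2 = 0.8$. An R² of 0.584 means that the model at the 3 km scale accounts for 58.4% of the variance in pick-up and drop-off frequency.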

The regression analysis shows that both the road network centrality (which varies by scale) and the surrounding environment correlate differently with taxi pick-ups and drop-offs. The significant moderating effects of the model are presented in Table 4. The significance of the models is analyzed at three confidence levels of 99%, 95%, and 90%. The regression results show that the frequency of taxi trips at each scale is significantly positively correlated with the betweenness centrality. In other words, the stronger the transit and connectivity function of nodes within the network, the higher the frequency of taxi pick-up and drop-off. In comparison, closeness centrality does not correlate with taxi trips at any scale. On the other hand, straightness centrality has a negative correlation with the frequency of taxi pick-up and drop-off at the 3 km and 8 km scales.

The indicators of the street surroundings can reflect the travel demand. The further a street is from the metro and bus stations, the more demand there is for taxis. Areas with a high density of public transport stations tend to be more concentrated, and the demand for taxi trips there is also higher. The street surrounding environment indicators (FSS, PD, and DI) are positively related to taxi pick-up and drop-off frequency, which is largely consistent with previous research [21, 28].

5. Conclusion

As centrality is crucial for understanding the structural characteristics of dense traffic networks, the indicators of road network centrality are of vital importance in urban planning and transportation management. This paper has analyzed the temporal and spatial distribution characteristics of taxi pick-ups and drop-offs in Shenzhen, China. Taking streets as units of study, road network analysis examined the centrality of multiscale road networks. After that, the paper modeled the frequency of taxi pick-ups and drop-offs with multiscale network centrality. Several conclusions were obtained from these analyses.

Firstly, peak time is correlated with commuters’ characteristics. Residents’ average travel time for one trip is 10.5 min, indicating that a considerable portion of the trips is made for commuting. In addition, the differences in taxi travel time between work days and weekends show residents’ demand for leisure and entertainment activities during the weekend. According to the analysis of the taxi travel distribution in Shenzhen for a week, 50%, 85%, and 95% of the travel distances were within 2844 m, 7945 m, and 14537 m, respectively. Taxi travel in Shenzhen is spatially concentrated around large transportation terminals, urban centers, major public service facilities, leisure and entertainment sites, and tourist attractions. Specifically, taxi pick-ups and drop-offs are concentrated in Luohu center, Futian center, and Nanshan center. They also show a clear decreasing distribution from south to north.

Secondly, as the search radius increases, the three road network centrality indicators exhibit varying degrees of concentration. The closeness centrality first exhibits a multicentered distribution and then gradually transforms into a single-centered distribution. Areas of high betweenness centrality initially show a decentralized distribution and later accumulate along the east-west main roads. Lastly, the straightness centrality is transformed from a decentralized distribution to an accumulation in the southern central part of the city.

Lastly, the regression analysis shows that both the road network centrality (which varies by scale) and the surrounding environment correlate differently with taxi pick-ups and drop-offs. The frequency of taxi trips is significantly positively correlated with betweenness centrality. Closeness centrality does not correlate with taxi trips at any scale. On the other hand, straightness centrality has a negative correlation with the frequency of taxi pick-up and drop-off at the 3 km and 8 km scales. As the search radius increases, R² gradually decreases.

Future research on urban network centrality should continue to explore its relationship with taxi trajectories as well as other residents’ travel behavior. This type of research will help to make effective suggestions for the road network optimization and the allocation of relevant facilities.

Data Availability

This paper uses basic city information data and taxi GPS data gathered in the week from October 17 to October 23, 2017. The basic city information data include Shenzhen road network data, subway and bus stop data (2017), building census data (2016), and POI data (2018). More specifically, the road network data were obtained from the government website, while the building census data come from the Shenzhen Information Center. The POI data comprise 583,620 points of interest obtained from the search engine Baidu. The data used in this research are available upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant nos. 41830645 and 41571145) and the Shenzhen Science and Technology Innovation Commission (GXWD20201230155427003-20200822000944001).