Abstract

Intelligent sensing and communication technology in the airport grid information system provides a multidimensional big data set for analyzing flight delays. Air traffic control, weather, and other determinants recorded in these data cause initial flight delays. Because adjacent flights are correlated in time, an initial delay propagates to subsequent flights, a pattern that can be discovered by mining the sensing data and that forms the phenomenon of flight delay diffusion. Different determinants produce different regional forms of delay diffusion and, in serious cases, “disaster area” delays across the whole regional grid information structure. To analyze the spatial impact of each factor on flight delay and explore the regional distribution of delay determinants, this paper applies spatial regression models and determines the key explanatory variables through statistical processing of aviation system data. A case study characterizes spatial airport delay in China in terms of aircraft movements. After processing the intelligent sensing and communication data, the results show a spatial effect between airports in both delays and their determinants. High-delay clusters occur principally in the Beijing-Tianjin-Hebei and Yangtze River Delta urban agglomerations. Direct flights, weather, new flight routes, and take-off and landing capacity have the most critical impact on spatial airport delays. Using Internet of Things technology to perceive, analyze, and integrate multiple sources of airport delay information, combined with spatial analysis models, makes it possible to mine delay characteristics accurately and supports digital and intelligent flight delay management.

1. Introduction

Intelligent flight sensing and communication systems yield multidimensional data about aircraft flight trajectories, weather, airport operations, and air capacity. Systematic analysis of the delay information in the airport grid based on these data is the key to reducing the direct losses of passengers and carriers and to promoting the economic development of the civil aviation region. Delay analysis on a spatial grid combines the idea of Internet of Things (IoT) network technology with the current research trend by integrating passenger mobile data, aircraft operation information, meteorological distribution information, and airport status information. Based on an analysis of flight information, Chen et al. calculated the indirect economic impact of flight delays on the Chinese economy and concluded that the total indirect influence in 2013 was 355.71 billion RMB, which stresses the importance of controlling flight delays [1]. Beyond China, the Federal Aviation Administration (FAA) has reported that the increase in flight delays places tremendous pressure on the US air travel system, costing airlines, passengers, and society billions of dollars annually. In 2007, the economic losses borne by airlines amounted to 8.3 billion dollars, including increased staff, fuel, and maintenance costs [2, 3]. Likewise, Air Traffic Flow Management (ATFM) estimates put the total cost of delays in Europe (including all causes and reaction costs) at 1.15 billion euros in 2011, and the average cost per delayed flight had already reached 1,660 euros [4]. Flight delays therefore have a significant impact on the global economy.

Beyond the enormous economic cost, the direct consequence of flight delays is reduced on-time performance, which is also a widespread concern in the civil aviation industry. In 2017, 2.895 million of the 4.039 million flights operated by China’s passenger airlines were on time, an average on-time rate of 71.67%. The primary reasons for flight delay include air carrier problems, extreme weather, and air traffic control [5, 6]. In addition, spatial correlation can lead to the propagation of airport delays. Simulations of aircraft operations using machine learning show that a delay in one flight of an aircraft can affect its subsequent flights, which ultimately propagates delays between airport pairs [7, 8]. When the sensor communication information among airports, routes, and aircraft is integrated, airport delays form an extensive delay grid information system. How to accurately analyze the spatial relationship between multiple airports in the delay grid and study their determinants in depth has become a significant and challenging issue for civil aviation delay.

At present, the combination of new technologies such as the Internet of Things, big data, artificial intelligence, and 5G communication with airport operation management has begun to receive attention [9–11]. The intelligent analysis of delay management and control is one of the research topics driven by this new scientific and technological revolution and industrial reform. Analyzing and integrating the various information and element resources of flight delay can help realize digital and intelligent decision-making on delay prevention, control, and mitigation, which represents the extensive application and deep integration of new-generation technologies such as the Internet of Things, big data, and artificial intelligence in delay analysis. In addition to intelligent data mining, an effective analysis model is also key to studying the distribution characteristics of delay in the airport grid.

Regarding the correlation between the delay information system and the spatial distribution of the airport grid, Hansen and Hsiao used an econometric model to examine the daily mean take-off delays of 32 airports in the United States along the time dimension, statistically analyzing trend effects including aircraft queues, flight schedules, and meteorological conditions [12]. They found that increases in total flights and operation demand aggravate airport delays in the airport grid distribution, and that the delay effect of destination and route weather depends on the number of flights. Zou et al. conducted a comprehensive empirical analysis of the relationship between flight delays and flight frequencies in the US air transport system [13]. The results showed that flight frequency had a positive impact on flight delays. Duran-Fernandez and Santos found four critical variables that can explain delays in European airports (market concentration, coordination, hub airports, and hub airlines) [14]. In Europe, although the flight delay at hub airports was higher than at nonhub airports, the flight delay of hub airlines was lower than that of nonhub airlines, which indicated that the hub-and-spoke system in Europe was not comprehensive and that control over airport take-offs and landings was incomplete. Lall first attempted to use a count regression model to investigate delays and delay determinants among the three airports in New York City [15]. The Poisson regression model and the least-squares regression model were used to analyze the factors influencing New York airport delay, and severe weather was found to have the most significant impact on expected delays.

Since the relevant studies are interdisciplinary, scholars have used various parametric and nonparametric methods in their research. However, studies of airport delays from the spatial grid perspective are scarce, fragmentary, and unsystematic. Therefore, the contributions of this paper include (1) exploring the spatial grid pattern of flight delay at the city level; (2) evaluating the comprehensive spatial autocorrelation of delays between airport grids; and (3) quantitatively identifying the geographical distribution characteristics of each delay determinant and calculating its degree of impact by processing the flight sensing and communication data. This paper uses spatial regression models to analyze the correlation and determinants of delays among multiple airports, and the results can provide a reference for focusing delay prevention and control on different regions in air traffic management.

The data and the multivariable statistical methods are described in Section 2, which also presents the relevant explanatory variables. Section 3 presents the methodology and the modeling approach. Section 4 reports the results and discusses policy suggestions, and Section 5 concludes.

2. Multivariables Determination by Big Data Mining

Internet of Things and big data applications need to be implemented on the basis of the various flight delay activities and operations. Figure 1 shows the elements of airport grid operation and their correlation. According to these elements, we propose a framework of delay elements in the airport grid based on Internet of Things technology, which is divided into three categories (operation control, aviation meteorology, and collaborative interaction). In Figure 2, the three categories include flight plans, aircraft track, aircraft performance, operation rules, flow control data, meteorological data, airport collaboration, company collaboration, passenger collaboration, and other determinants.

The intelligent sensing and communication data processed for the airport delay IoT structure came from the Civil Aviation Administration of China, the Statistics Bureau, and the Beijing Capital International Airport database, together with route conditions and the corresponding weather information system data. However, the weather database is recorded hourly, and the flight database is not always consistent with the other items. Therefore, in data processing, the database is divided into a route type (including real-time flight route monitoring, spatial positioning and tracking, passenger movement, and other data) and a weather type. According to the 2017 Civil Aviation Development Statistics Bulletin issued by the Civil Aviation Administration of China, there are ten major airlines, namely, Guide Air, Air China, China Eastern Airlines, Hainan, Shenzhen, Sichuan, Xiamen, Shandong, Shanghai, and Tianjin Airlines. Their executed flight volume accounted for 78% of the total flight volume, but their average on-time performance was 66.9%, which was below the average rate of all flights in China. The top 100 airports cover almost all major airlines and routes in the Chinese delay system. Therefore, the data are drawn from the top 100 airports by traffic volume and cover January 1, 2017, to December 31, 2017, including airport flight conditions, aircraft type, delay time, passenger load factor, and weather at the corresponding time. The collected data are as follows.

2.1. Dependent Variable from IoT Structure of Airport Delay

In the delay information system, the dependent variable is the average delay time of flights delayed by 15 minutes or more relative to the scheduled arrival/departure times. The Civil Aviation Administration of China measures the delays of airlines by the delays of their flights. According to the 2016 draft “Statistical Measures for the Regular Flight of Civil Aviation (Consultation Draft),” an on-time flight is a flight [16] that arrives no later than 15 minutes (inclusive) after the scheduled arrival time. The standard turn time is set based on the airport passenger throughput published by the Civil Aviation Authority for previous years and stipulates the maximum time from leaving the parking stand to take-off. Table 1 contains the standard airport turn time. Therefore, during the sample period of the study, the delay time of each flight at each airport is calculated by formulas (1) and (2):

$$D^{a}_{ij} = T^{a}_{ij} - \hat{T}^{a}_{ij} \tag{1}$$

$$D^{d}_{ij} = T^{d}_{ij} - \hat{T}^{d}_{ij} - T^{s}_{i} \tag{2}$$

The total average delay time is as follows:

$$\bar{D}_{i} = \frac{\sum_{j=1}^{N^{a}_{i}} D^{a}_{ij} + \sum_{j=1}^{N^{d}_{i}} D^{d}_{ij}}{N^{a}_{i} + N^{d}_{i}} \tag{3}$$

where $D^{a}_{ij}$ indicates the arrival delay time of flight j in airport i, $T^{a}_{ij}$ is the actual arrival time, $\hat{T}^{a}_{ij}$ is the estimated arrival time, $D^{d}_{ij}$ is the departure delay time of flight j in airport i, $T^{d}_{ij}$ is the actual departure time, $\hat{T}^{d}_{ij}$ is the estimated departure time, and $T^{s}_{i}$ is the standard turn time. $\bar{D}_{i}$ is the average delay time of airport i, $N^{a}_{i}$ is the total number of flights arriving, and $N^{d}_{i}$ is the total number of flights departing.
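As a minimal sketch of how the per-flight and per-airport delays described above could be computed, the following Python fragment assumes that an arrival delay is the actual minus the scheduled arrival time, that a departure delay is additionally reduced by the airport's standard turn time, and that the airport average is taken over all arriving and departing flights. The record layout, the function names, and the turn times are hypothetical and are not taken from Table 1; the 15-minute threshold is deliberately left out.

```python
from collections import defaultdict


def flight_delay(kind, actual, scheduled, turn_time):
    """Delay of one flight in minutes (assumed definitions, see lead-in)."""
    if kind == "arr":
        return actual - scheduled
    if kind == "dep":
        # Departure delay is measured net of the standard turn time.
        return actual - scheduled - turn_time
    raise ValueError(f"unknown flight kind: {kind!r}")


def average_delay_by_airport(flights, turn_times):
    """Mean delay per airport over all arriving and departing flights."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for airport, kind, actual, scheduled in flights:
        totals[airport] += flight_delay(kind, actual, scheduled, turn_times[airport])
        counts[airport] += 1
    return {airport: totals[airport] / counts[airport] for airport in totals}


# Hypothetical records: (airport, kind, actual minute, scheduled minute).
flights = [
    ("PEK", "arr", 620, 600),
    ("PEK", "dep", 700, 650),
    ("PEK", "dep", 800, 740),
    ("SHA", "arr", 610, 600),
    ("SHA", "dep", 675, 640),
]
turn_times = {"PEK": 30, "SHA": 25}  # illustrative values only

avg = average_delay_by_airport(flights, turn_times)
```

Times are kept as minutes since midnight to avoid date handling; a real pipeline would parse timestamps and deal with flights that cross midnight.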

2.2. Independent Variable from IoT Structure of Airport Delay

The concept of the IoT is to connect any object to the network. Objects exchange and communicate information through information dissemination media to achieve intelligent identification, positioning, tracking, supervision, and other functions. Bringing the idea of the IoT into the analysis of flight delay helps mine delay-related data from a systems perspective and use them as independent variables. The independent variables include the average passenger throughput for each shift and the average take-off and landing times for all aircraft at the airport each day, which come from the flight database of the Civil Aviation Administration and the intelligent airport sensing and communication datasets and are summarized by shifts and hours.

Aircraft positioning data show that, because the minimum spacing required between aircraft pairs is inconsistent, aircraft take-offs and landings can affect airport capacity and cause flight delays, especially when instrument conditions are in effect [17–19]. Aircraft types also affect airport delays, especially in heavy-duty operations, where mixed take-off (landing) has the most significant adverse impact.

In terms of airspace, considering capacity and airspace constraints, the number of flights can reflect the congestion of the airspace. Du et al. established a delay causality grid (DCG) based on the Granger causality test and determined the airports associated with the delay propagation links of the airports [20]. Figure 3 shows the directed grid, which is obtained by building DCGs and counting the number of flights between two airports to reflect the number of congested airspace routes at each airport.

In terms of airlines, flight and passenger tracking data are used, and, considering the actual average capacity, the number of direct flights from each airport and the number of new direct flights to and from the airport are taken as independent variables. The increased demand (obtained from the movement trajectories in passengers’ mobile information) has made airports and airlines busier, especially the hub airports. The average airport capacity directly affects air traffic congestion and thus the take-off and landing times of aircraft, so it is necessary to consider the number of direct flights from each airport, since direct flights are the decisive factor that can directly delay the next flight. At the same time, the continued rapid growth of new air routes has complicated the crowded airspace structure.

The weather system is an indispensable factor in the “IoT network” of airport delays. Adverse weather conditions can disrupt airport functions and cause delays in almost all operation phases. In addition, adverse weather events reduce airport visibility, resulting in large-scale airport delays. Previous studies analyzed visibility in detail and collected various weather elements that affect airport visibility [20–22]. In the delay information system, based on the daily weather conditions reported by each airport meteorological bureau (such as the visibility affecting aircraft take-off and landing), we sorted the weather conditions for airborne flight and ground take-off and determined the factors that would affect delay. All subweather variables are integrated into a total weather variable. The counting method is given by formula (4), using the indicator variables defined below.

According to the meteorological radar big data from the airport grid IoT structure, the conditions for weather selection are as follows [13, 25]:
(i) If a severe thunderstorm is reported within 50 miles of the airport, the indicator variable takes a value of 1; otherwise, it is zero.
(ii) If there are moderate or heavy thunderstorms on the route, the value is 1; otherwise, it is zero.
(iii) If the airport has heavy snow (24-hour snowfall between 5.0 and 10 mm) or a blizzard (24-hour snowfall above 10 mm), the value is 1; otherwise, it is zero.
(iv) If there is heavy rain at the airport (rainfall of more than 16 mm per hour, continuous rainfall of more than 30 mm in 12 hours, or rainfall of more than 50 mm in 24 hours), the value is 1; otherwise, it is zero.
(v) If strong winds occur at the airport (wind force above level 4), the value is 1; otherwise, it is zero.
(vi) If there is haze weather at the airport (greater than 80%), the value is 1; otherwise, it is zero.
(vii) If the airport cloud level is lower than the lowest decision height (10 meters) of the instrument landing level, the value is 1; otherwise, it is zero.
(viii) If there is a sandstorm at the airport (less than 1 km), the value is 1; otherwise, it is zero.
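The indicator rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the total weather variable is assumed to be a simple count of the indicators that are switched on, the "less than 1 km" condition for sandstorms is read as a visibility threshold, the 80% haze threshold is applied to an unspecified percentage, and all field and function names are hypothetical.

```python
def weather_indicators(obs):
    """Map one airport-day observation (a dict) to eight 0/1 indicators."""
    storm_dist = obs.get("severe_storm_distance_miles")  # None if no storm reported
    inf = float("inf")
    return {
        "airport_thunderstorm": int(storm_dist is not None and storm_dist <= 50),
        "route_thunderstorm": int(obs.get("route_storm_level") in ("moderate", "heavy")),
        # Heavy snow (5.0-10 mm) or blizzard (>10 mm) in 24 hours.
        "heavy_snow": int(obs.get("snow_24h_mm", 0) >= 5.0),
        "heavy_rain": int(
            obs.get("rain_1h_mm", 0) > 16
            or obs.get("rain_12h_mm", 0) > 30
            or obs.get("rain_24h_mm", 0) > 50
        ),
        "strong_wind": int(obs.get("wind_level", 0) > 4),
        "haze": int(obs.get("haze_percent", 0) > 80),  # measured quantity assumed
        "low_cloud": int(obs.get("cloud_base_m", inf) < obs.get("decision_height_m", 10)),
        "sandstorm": int(obs.get("sandstorm_visibility_km", inf) < 1),  # assumed visibility
    }


def total_weather(obs):
    """Assumed aggregation: the number of indicators equal to 1."""
    return sum(weather_indicators(obs).values())


stormy_day = {"severe_storm_distance_miles": 30, "rain_1h_mm": 20, "wind_level": 3}
clear_day = {}
```

Missing fields default to "condition not met", so an empty observation yields a total of zero. If the study weights the sub-variables differently, only `total_weather` would need to change.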

Based on the above analysis of the relationship between airport delays and related factors, nine variables relevant to airport delay were obtained by collating and summarizing the intelligent sensing and communication data (as shown in Table 2).

3. Methodology

Spatial effects and autocorrelation tests must first be carried out on the critical variable data before modeling in the delay information system. If a spatial effect exists, a spatial regression model is then constructed for estimation.

3.1. Spatial Correlation between Airport IoT and Other Independent Variables

Before studying the spatial correlation between airport IoT and other independent variables, it is necessary to determine the spatial correlation of delay within the airport grid. To detect the spatial relationship between delays, a Moran’s I test is carried out among multiple airport pairs on the average airport delay in each spatial unit to test the similarity, difference, or independence of airport delays across China.

Moran’s I ranges from −1 to 1. A value greater than 0 indicates positive spatial correlation, meaning that a high (or low) delay airport is adjacent to a high (or low) delay airport. A value less than 0 indicates negative spatial correlation, meaning that a high-delay airport is adjacent to a low-delay airport. When the index equals 0, there is no spatial relationship between airports. The higher the spatial correlation between airports, the greater the absolute value of the index. Equations (5)–(7) are as follows:

$$I = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}{S^{2}\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \tag{5}$$

$$S^{2} = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^{2} \tag{6}$$

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{7}$$

where I represents Moran’s I, n is the number of airports in the geographic space, $x_i$ and $x_j$ are the delay values of airport i and airport j, respectively, $w_{ij}$ is an element of the space weight matrix, and $\bar{x}$ is the average of all observations for an attribute feature, x, in the n study areas.
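The global statistic can be illustrated with a short pure-Python sketch. It uses the common form of Moran's I, namely n divided by the sum of all weights, times the weighted cross-product of deviations over the sum of squared deviations. The function name and the toy four-airport chain are assumptions for illustration; they are not the study's data or its software workflow.

```python
def morans_i(x, w):
    """Global Moran's I for values x and an n-by-n weight matrix w."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    s0 = sum(w[i][j] for i in range(n) for j in range(n))  # sum of all weights
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    squares = sum(d * d for d in dev)
    return (n / s0) * (cross / squares)


# Four hypothetical airports on a line; neighbours share a weight of 1.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [10, 10, 0, 0]    # similar delays sit next to each other
alternating = [10, 0, 10, 0]  # high and low delays alternate

i_clustered = morans_i(clustered, chain)
i_alternating = morans_i(alternating, chain)
```

The clustered pattern gives a positive index and the alternating pattern a negative one, which matches the interpretation of the sign given above. Significance testing, for example by permutation, is not included.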

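In the airport grid, the local accumulation of delay is measured by Moran scatter plots and LISA aggregation plots, which reflect the degree of association between an airport and its neighboring airports. The local statistic underlying the Moran scatter plot is calculated as follows:

$$I_i = \frac{x_i-\bar{x}}{S^{2}}\sum_{j\neq i} w_{ij}\,(x_j-\bar{x}) \tag{8}$$

where $x_i$ is the delay value of airport i, $\bar{x}$ and $S^{2}$ are the mean and variance of the delay values over all airports, and $w_{ij}$ is an element of the space weight matrix.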

The LISA aggregation graph visualizes delay aggregation directly on the map. It clearly and intuitively shows the spatial distribution of delays in the study area.
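A rough sketch of how airports could be assigned to the four quadrants of a Moran scatter plot is given below. It assumes the usual local Moran statistic (an airport's deviation from the mean multiplied by the weighted deviation of its neighbours, scaled by the variance) and labels each airport HH, HL, LH, or LL from the signs of those two terms. The function name and the data are hypothetical, and the significance filtering that a LISA cluster map normally applies is omitted.

```python
def lisa_quadrants(x, w):
    """Return (quadrant label, local Moran's I) for each airport."""
    n = len(x)
    mean = sum(x) / n
    z = [xi - mean for xi in x]
    variance = sum(zi * zi for zi in z) / n
    result = []
    for i in range(n):
        # Spatial lag: weighted deviation of the neighbours of airport i.
        lag = sum(w[i][j] * z[j] for j in range(n) if j != i)
        local_i = z[i] * lag / variance
        own = "H" if z[i] > 0 else "L"
        neighbours = "H" if lag > 0 else "L"
        result.append((own + neighbours, local_i))
    return result


# Row-standardized weights for four hypothetical airports on a line.
w = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
clustered_labels = [q for q, _ in lisa_quadrants([30, 28, 5, 4], w)]
mixed_labels = [q for q, _ in lisa_quadrants([30, 4, 28, 5], w)]
```

Airports labelled HH or LL have a positive local statistic (similar neighbours), while HL and LH have a negative one (dissimilar neighbours).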

3.2. Spatial Weight Matrix by Route Tracking Position Data

Unlike regions in a spatial contiguity weight matrix, airports located from aircraft route-tracking position data cannot be classified directly as contiguous or not. Distance, however, has a direct impact on the delays between the departure and arrival airports [23]. Therefore, the distance matrix can be selected as the spatial weight matrix. The formula is as follows:

$$w_{ij} = \begin{cases} 1/d_{ij}, & i \neq j \\ 0, & i = j \end{cases} \tag{9}$$

where $d_{ij}$ is the distance between airport i and airport j. After the reciprocal distance processing, the geographic distance matrix is established and standardized as in formula (10) and formula (11):

$$W = \left(w_{ij}\right)_{N\times N} \tag{10}$$

$$w^{*}_{ij} = \frac{w_{ij}}{\sum_{j=1}^{N} w_{ij}} \tag{11}$$

The space distance matrix of airport delay is defined over a set of airport pairs (that is, a two-dimensional array). Therefore, given N airports in the Euclidean space, the distance matrix is a symmetric N × N matrix with nonnegative real numbers as elements. In the spatial delay analysis, the greater the distance is, the smaller the impact of airports on the delay will be [24].
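A minimal sketch of one way to build such a weight matrix is shown below. It assumes planar coordinates with Euclidean distance, reciprocal-distance weights with a zero diagonal, and row standardization so that each row sums to one. The coordinates and function names are hypothetical, and airports are assumed to have distinct locations so that no distance is zero.

```python
import math


def inverse_distance_weights(coords):
    """Reciprocal Euclidean distance weights with a zero diagonal."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / math.dist(coords[i], coords[j])
    return w


def row_standardize(w):
    """Scale every row so that its weights sum to one."""
    standardized = []
    for row in w:
        total = sum(row)
        standardized.append([v / total if total else 0.0 for v in row])
    return standardized


coords = [(0, 0), (3, 4), (6, 8)]  # three hypothetical airports
w_raw = inverse_distance_weights(coords)
w_std = row_standardize(w_raw)
```

The raw matrix is symmetric, but the row-standardized one generally is not: the middle airport splits its weight evenly between its two neighbours, while the first airport puts two thirds of its weight on the nearer one. With real airport locations, a great-circle distance would be a natural substitute for `math.dist`.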

3.3. Modeling

To analyze the correlation of the key factors in the airport delay Internet of Things, we established a spatial regression model. Spatial econometric regression models take many forms [26], of which spatial lag models and spatial error models are the most commonly used. When airport delays have spatial grid effects, it is necessary to establish a delay model that includes spatial relationships. The spatial lag model or the spatial error model can then be applied based on spatial autocorrelation and spatial heterogeneity.

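The flight delay impact function can be obtained from the analysis of the determinants of airport delays. The following equation is the basic quantitative regression model:

$$y_{i} = \beta_{0} + \sum_{k=1}^{8}\beta_{k}x_{ki} + \varepsilon_{i} \tag{12}$$

where $y_i$ is the average delay time of airport i and $x_{ki}$ is the k-th explanatory variable of airport i.

After the spatial correlation is determined, the analysis of the delay characteristics must consider, in addition to OLS estimation, a spatial regression model that accounts for spatial effects. Therefore, a spatial weight matrix is added to the traditional regression model. The spatial Durbin model can be transformed into a spatial lag model (SLM) or a spatial error model (SEM) by imposing constraints on its coefficients, so only the spatial Durbin model needs to be established, and the model can be obtained as the following equation:

$$Y = \rho WY + X\beta + WX\theta + \mu + \nu + \varepsilon, \qquad \varepsilon = \lambda W\varepsilon + u \tag{13}$$

The above model can also be written as the following equation:

$$y_{it} = \rho\sum_{j=1}^{100} w_{ij}y_{jt} + \sum_{k=1}^{8}\beta_{k}x_{kit} + \sum_{k=1}^{8}\theta_{k}\sum_{j=1}^{100} w_{ij}x_{kjt} + \mu_{i} + \nu_{t} + \varepsilon_{it}, \qquad \varepsilon_{it} = \lambda\sum_{j=1}^{100} w_{ij}\varepsilon_{jt} + u_{it} \tag{14}$$

where $y_{it}$ is the explained variable, $\sum_{j} w_{ij}y_{jt}$ is the spatial lag term of the explained variable, $\rho$ is the spatial autoregressive coefficient, $x_{kit}$ are the explanatory variables, $\sum_{j} w_{ij}x_{kjt}$ are the spatial lag terms of the explanatory variables, $W = (w_{ij})$ is the 100 × 100-order spatial distance matrix, $\theta_k$ is the coefficient of the spatial lag term of the explanatory variable, $\beta_k$ is the regression coefficient, k = 1, 2, 3, 4, 5, 6, 7, 8, $\mu_i$ is the spatial effect, $\nu_t$ is the time effect, $\varepsilon_{it}$ is the random error term, $\lambda$ is the spatial autocorrelation coefficient of the error term, and $u_{it}$ is an independently and identically distributed error term.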
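To make the structure of the spatial terms concrete, the sketch below evaluates the systematic part of a spatial Durbin specification for given parameter values: a spatial lag of the dependent variable scaled by an autoregressive coefficient (called `rho` here), the usual regression term with coefficients `beta`, and spatially lagged explanatory variables with coefficients `theta`. Setting every `theta` to zero leaves a spatial lag model. All names and numbers are hypothetical, fixed effects and the error process are ignored, and no estimation is performed.

```python
def spatial_lag(w, values):
    """Weighted average of neighbours' values: one entry of W v per airport."""
    return [sum(wij * vj for wij, vj in zip(row, values)) for row in w]


def sdm_systematic_part(y, X, w, rho, beta, theta, intercept=0.0):
    """intercept + rho * W y + X beta + W X theta, evaluated per airport."""
    n, k = len(y), len(beta)
    wy = spatial_lag(w, y)
    # Spatial lag of each explanatory variable, computed column by column.
    wx = [spatial_lag(w, [X[i][c] for i in range(n)]) for c in range(k)]
    fitted = []
    for i in range(n):
        own_effect = sum(beta[c] * X[i][c] for c in range(k))
        neighbour_effect = sum(theta[c] * wx[c][i] for c in range(k))
        fitted.append(intercept + rho * wy[i] + own_effect + neighbour_effect)
    return fitted


# Three hypothetical airports on a line with row-standardized weights.
w = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
y = [10.0, 20.0, 30.0]     # observed average delays
X = [[1.0], [2.0], [3.0]]  # one explanatory variable

slm_part = sdm_systematic_part(y, X, w, rho=0.5, beta=[2.0], theta=[0.0])
sdm_part = sdm_systematic_part(y, X, w, rho=0.5, beta=[2.0], theta=[1.0])
```

Because the dependent variable appears on both sides, these values are conditional on the neighbours' observed delays; estimating the coefficients requires maximum likelihood or a similar method, typically through a spatial econometrics package.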

4. Results

According to the mined delay IoT data combined with the spatial model, the results in Table 3 show that the goodness of fit R² of the basic model is 43.6%, the adjusted R² is 40%, and the F value is 8.8. The model passes the significance test at the 1% level. DFC and PTD are significant at the 0.01 level and NHA and WEA at the 0.05 level, while AAC, PTD, ATL, and ACR pass the significance test at the 10% level.

As for the spatial correlation of the different factors among the airports, four levels describe the association: high, higher, lower, and low. Applying regression analysis to all relevant determinants in the airport IoT network with respect to the average delay time shows that the spatial distribution of airport delay determinants follows the pattern of “high delay in the east and south” and “low delay in the north and west.” Only a few airports have a different distribution.

Regarding the airport delay IoT grid distribution, Figure 4(a) shows that the high-high agglomeration airports in North China are PEK, NAY, and TSN. The Yangtze River Delta region is another high-high agglomeration area for flight delays, mainly including SHA, PVG, NKG, CZX, HGH, and NGB. The high-low clusters are mainly distributed between CTU, CAN, KMG, CKG, XMN, SYX and LJG, LZH, WXN, TCZ, while HTN, HLH, YIN, HET, and other airports are low-low clusters.

Figures 4(b)–4(i) show the distribution of the delay factors from AAC to WEA based on the intelligent sensing and wireless communication data analysis. The delay aggregation graphs show that the degree of influence of each determinant differs across regions, except in the hardest-hit areas of Beijing, Shanghai, and Guangzhou. It is also worth noting that critical delays have occurred in Xizang. Combined with the correlation analysis of delay factors, the results indicate that the influencing factors do not play a decisive role in the Xizang delays. Therefore, besides the influencing factors, the delay in Xizang may also be caused by airport operation failures.

For the determinants in the IoT structure of airport delay, the AAC aggregation map expresses the distribution of delays related to airport flow and capacity. Traffic-related delay is concentrated at Shanghai Pudong Airport and Shanghai Hongqiao Airport, which form a high-high agglomeration area in the AAC variable. Beijing Capital International Airport, Guangzhou Baiyun Airport, and Shanghai’s two airports form high-high clusters (meaning that delays are serious both at the airport and at its nearby airports) in the DFC, NDF, and PTD variables. Apart from the metropolises and first-tier cities, airports worthy of attention include Yunnan Changshui Airport, Inner Mongolia Baita Airport, and Sanya Phoenix Airport. As airports of tourist cities with high demand throughout the year, they show a high degree of agglomeration in NHA, ACR, WEA, and PTD.

Moran’s I for airport delay, calculated with the GeoDa software, is 0.67, indicating that airport delay is not randomly distributed in space but exhibits spatial correlation. Figure 5 is a scatter plot of Moran’s I. This shows that the spatial big data mined from each flight path in the airport IoT are spatially correlated in the airport delay grid.

Table 4 shows the regression results for OLS, SLM, and SEM. Comparing the basic model with the spatial models, the Log L values of SLM and SEM are larger and their AIC and SC values are smaller than those of OLS, so the spatial models are better than the basic model and fit better than OLS. Therefore, the traditional regression model may have specific limitations in analyzing delay, which also implies the necessity of the spatial regression model. Comparing SLM and SEM, the SLM has a larger Log L value, a more significant LR value, and smaller AIC and SC values, and therefore a better estimation effect. The result shows that the delay between China’s airports has a strong proximity effect, while the spatial heterogeneity of delays (errors) is relatively weak. In the SLM, airport delay has significant spatial effects, with a spatial correlation coefficient ρ = 0.9863, indicating that the delays of the 100 airports have an extreme spatial dependence under the proximity effect in the airport grid.
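The comparison logic described above (a larger log-likelihood and smaller information criteria indicate a better model) can be sketched as follows, using the standard definitions AIC = 2k − 2 ln L and SC = k ln n − 2 ln L, where k is the number of estimated parameters and n the number of observations. The log-likelihood values and parameter counts below are invented for illustration and are not the values reported in Table 4.

```python
import math


def aic(log_l, k):
    """Akaike information criterion."""
    return 2 * k - 2 * log_l


def sc(log_l, k, n):
    """Schwarz (Bayesian) information criterion."""
    return k * math.log(n) - 2 * log_l


def best_by_aic(models):
    """Name of the model with the smallest AIC; models maps name -> (log L, k)."""
    return min(models, key=lambda name: aic(*models[name]))


# Hypothetical fits for n = 100 airports: (log-likelihood, parameter count).
n_airports = 100
models = {
    "OLS": (-420.0, 9),
    "SLM": (-395.0, 10),
    "SEM": (-401.0, 10),
}

scores = {
    name: (aic(log_l, k), sc(log_l, k, n_airports))
    for name, (log_l, k) in models.items()
}
preferred = best_by_aic(models)
```

With these illustrative inputs the spatial lag model has the smallest AIC and SC, mirroring the kind of ranking the text describes. The spatial models carry one more parameter than OLS here because of the additional spatial coefficient.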

Based on the statistical analysis of the relevant intelligent flight sensing and communication data and the degree of correlation of each determinant with airport delay in the spatial dimension, some policy suggestions can be made to reduce airport delay. Because airports differ in scale, systems, and natural conditions, policy should consider the diversity of environmental conditions among airports, especially in the current stage of rapid IoT development. The number of direct flights and new flights has a more significant impact on airport delays. However, in recent years, the demand for and planning of new airports and flight routes have been growing severalfold. Overall optimization can fundamentally reduce airport flight delays and control the economic loss while meeting the transportation demand.

Take-off and landing conditions in the delay IoT system are also important factors affecting flight delay. Airport congestion mainly comes from aircraft operation flow and flight capacity. Taking spatial distance and the delay impact of regional influencing factors into account when adjusting and dispersing the take-off and landing queue structure may bring unexpected benefits in solving delay issues. In the airport IoT network, flight delay analysis also needs to consider delay transmission. Congestion of airspace capacity also affects the operational structure of the airport. On the other hand, although passenger throughput and the allocation of heavy aircraft affect the delay of an airport, they are negatively related to delay in the spatial dimension. Therefore, when considering the allocation of passenger flow and aircraft types, airlines only need to consider the impact on the delay of a single airport.

Although flight delay is unavoidable and delay recovery is uncontrollable under special weather conditions, the weather information system is a decisive factor with particular regional characteristics. According to the agglomeration distribution in this case, Sanya, Haikou, and Shenzhen are cities that are greatly affected by the weather. At the same time, because of the spatial distance between the cities, their marine climate characteristics, and the interaction of weather among airports, there is a strong correlation between the three airports in weather-induced delays. Weather conditions should also be considered when an airport schedules routes. However, overall, compared with other determinants, the areas strongly affected by weather factors are few and concentrated, so it is relatively easy to reduce delay in the high-concentration areas caused by weather factors.

5. Conclusion

The Internet of Things, big data, artificial intelligence, 5G, and other new technologies are still improving continuously. With these technologies, flight delay analysis has formed a framework of delay mining technology systems in smart airport grids featuring advanced technology, open data fusion, security, and reliability. After the data set of airports, aircraft, flights, and passengers was collected through IoT technology, this paper first used spatial autocorrelation to process the multivariable delay determinants (using intelligent flight location and communication data) and analyzed the spatial distribution characteristics of the airport delay grid in China. Basic regression models and geographically weighted regression models were then used to study the driving factors and regional differences in the airport delay information system. The results demonstrated the validity of the spatial econometric regression model. The spatial aggregation of China’s airport delays was characterized by high-value and low-value clusters, while DFC, NDF, and WEA were positively correlated with the delay time. At the same time, the results of the geographically weighted regression model revealed that spatial differences between multiple factors lead to diverse effects on civil aviation delays. The spatial regression model makes it possible to understand the various determinants of the airport system in different regions more systematically and intuitively. This method can also be applied to other relevant studies, for example, the time-space analysis of delay factors, or the findings can be applied to the study of delay propagation in systemic grid structures and to delay assessment.

The spatial analysis of delay during aircraft operation is an effective measure to ensure the operation of flights. The results on delay distribution in the airport grid are directly related to the adjustment and control of flight operations. Based on the analysis of multiple determinants, big data and multiattribute data mined with Internet of Things application development techniques are used for spatial assessment modeling. From the perspective of flight trajectory, weather, and passenger demand learning, it is significant to explore the spatial distribution level, delay diffusion level, and determinants of delay during flight execution. Furthermore, the results of spatial delay analysis can help airport IoT technology reallocate flight operation support equipment and facilities according to the distribution of abnormal areas involved in a flight delay.

Data Availability

The related data used to support the findings of this study were supplied by Xiushan Jiang under license and so cannot be made freely available. Requests for access to these data should be made to Xiushan Jiang, xshjiang@bjtu.edu.cn.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by the Funds of the National Natural Science Foundation of China (U2034208), the Key Technologies of Digital management and Optimization of Freight Train Marshalling Plan (N2021X021), the Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Ministry of Transport, Beijing Jiaotong University, China.