Abstract
Urban regional risk assessment is a complex nonlinear problem characterized by insufficient information, randomness, and uncertainty. To assess overall urban risk accurately, a regional risk assessment model for urban public safety was proposed based on information diffusion theory. Entropy theory was employed to optimize the information diffusion model and reduce the uncertainty. A framework of the urban regional risk assessment model based on information diffusion and entropy was constructed. Finally, a case study of Hangzhou city in China was presented to demonstrate the performance of the proposed method. Results showed that the proposed method could successfully estimate the urban regional risk of Hangzhou city. The risk levels and probabilities of different hazard indicators were basically consistent with reality. The hazards associated with industrial and mining accidents (IMA) and road traffic accidents (RTA) were extremely serious. More than 80 deaths from IMA would occur almost every 3 years, and more than 400 deaths from RTA would occur almost every 2.6 years. Moreover, centralized intervals of the risk level associated with the five hazards were found, within which urban risks were more likely to occur and vulnerability was higher. The proposed method can provide guidance for the government’s urban safety management and policy-making.
1. Introduction
Urban safety is a complex system issue involving all components of society and its citizens [1–3]. At present, urbanization continues to accelerate in China, and the population, functions, and scale of cities have continually expanded [4]. From 1978 to 2021, the urban permanent population increased from 170 million to 901 million, and the urbanization rate rose from 17.9% to 63.89%. Both the number of cities and the urban population are growing rapidly in China. Meanwhile, urban elements, including population, buildings, gas, transportation, and production parks, have become highly concentrated with the rapid development of urbanization [5–7]. The urban operation system has become increasingly complex. When an urban hazard occurs, it readily triggers a chain of derivative damage that can escalate into a public crisis. Therefore, it is imperative to conduct urban regional risk assessment and strengthen process safety management.
Many studies have been conducted to explore the possible regional risks caused by a single hazard in the fields of fire protection, construction, chemical industry, and environment [1, 8–11]. By the superposition of single risks, the regional risk involving multiple hazards had been simply described [12]. For instance, Chen et al. [13] constructed the disaster risk evaluation index system in accordance with Chinese reality and presented the urban risk ranking and risk map of 31 provinces. Zhang et al. [14] explored the relationship between urban spatial risk and the distribution of infected COVID-19 populations. Altindal et al. [12] carried out the seismic risk assessment of earthquake-prone old urban centers. Hoyos and Hernández [15] pointed out that hazard-consistent record selection was extremely important in the derivation of vulnerability models to use at a local scale, for sites with contributions from different tectonic regimes. Yang et al. [16] proposed a multiscale environmental accident risk assessment model to comprehensively evaluate the characteristics and impacts of environmental risk at different scales. As a huge carrier of society development, regional security has gradually evolved from qualitative description to quantitative analysis, deepened from certainty to uncertainty, and developed from random uncertainty to fuzzy uncertainty [11, 17–19]. For example, Zhou et al. [20] claimed that uncertainties existed in slope stability analysis and proposed a probabilistic method for landslide prediction. Zhao et al. [21] developed R scripts to implement the K-means algorithm and gap statistics validity index for clustering regional risk. Yang et al. [22] used a multilevel risk characterization method to show the evolutionary process of risk and to provide a scientific basis for the management of the urban agglomeration ecological risk. Xu et al. [23] developed a land-use-based urban growth scenario for the temporally and spatially explicit simulation of future urban growth in terms of buildings, roads, and electrical facilities while considering dynamic information. However, these publications mainly focus on a certain “point” or “face” of urban risks, such as drought, flood, and earthquake, and analyze a risk superposition under limited conditions. Few comprehensive studies on the overall regional risk have been conducted.
As a complex nonlinear fuzzy system, urban regional risk involves many factors and varies rapidly in reality [24–27]. Due to spatiotemporal limitations, data collection and feedback are commonly difficult. In the process of urban regional risk assessment, information asymmetry and insufficiency are often encountered [28–30]. This is commonly called a small sample size problem. Information diffusion theory is a fuzzy mathematical method for centralized processing of incomplete information systems [31, 32]. With the aid of fuzzy mathematics, a single sample point can be processed to establish an information diffusion model. The relationship among variables is constructed via the diffusion function, and then, the incomplete information is appropriately expanded to compensate for information insufficiency in small sample size problems. Recently, information diffusion has been gradually introduced into specific disaster risk assessments such as flood, meteorology, fire, and earthquake [33–35]. For instance, Bai et al. [36] employed a genetic algorithm to improve the information diffusion algorithm and applied it to the interpolation and forecasting of monthly river discharge time series. Hao et al. [37] claimed that the incompleteness, lack of clarity, and uncertainty of the data must be addressed in risk assessment and applied the information diffusion algorithm to the probabilistic analysis of grassland biological disaster risk. Sun et al. [38] proposed a diffusive foot-and-mouth disease model with nonlocal infections.
However, the information diffusion method has rarely been applied to overall regional risk assessment. Moreover, it should be pointed out that the above studies mainly computed the diffusion coefficient by using a simplified model derived from molecular diffusion theory. A reasonable diffusion coefficient is crucial to system risk assessment. As mentioned earlier, uncertainty is a significant factor in urban regional risk assessment, and it also affects the determination of the diffusion coefficient. The optimization of the diffusion coefficient should therefore be considered when implementing the information diffusion model.
To compensate for the aforementioned drawbacks, this study proposes a regional risk assessment method for urban safety by introducing information diffusion theory. Furthermore, entropy theory is combined with the information diffusion model to reduce the uncertainty. The remainder of the paper is organized as follows: Section 2 gives a brief introduction to information diffusion theory; Section 3 adopts entropy to optimize the diffusion coefficient; Section 4 illustrates the overall procedure of the regional risk assessment model for urban public safety; Section 5 presents a case study to validate the proposed method; finally, Section 6 provides the conclusions and suggestions for future work.
2. Information Diffusion Theory
Information diffusion theory is a fuzzy mathematical method for centralized processing of incomplete information systems [31]. In this method, each information sample point is supposed to have a tendency to develop into multiple information points in the process of transition from the original incomplete system to completeness [19]. Accordingly, the incomplete system can be expanded by mathematical methods to make up for insufficient information. At the same time, the explicit calculation of the membership function can be avoided, and the information carried by the original data can be preserved to the greatest extent possible. Therefore, even under the condition of incomplete information, this method can predict the relationship between variables through a certain diffusion function.
To understand the principle of information diffusion, the following definitions should be first reviewed [37].
Definition 1. For a nonlinear relationship, any sample with a size smaller than the population is regarded as incomplete.
Definition 2. For a known sample set X, let W be the object’s true relation. X is called a correct data set for W if and only if there exists a model $\gamma$ through which X is processed to obtain an estimate $\hat{W} = \gamma(X)$ such that $\hat{W} = W$.
Based on the abovementioned two definitions, the information diffusion principle can be defined. Let X be a given sample, V be a subset of the universe U, and $\mu$ be a mapping from X × V to [0, 1]. $\mu(x, v)$ is called a kind of information diffusion of X on V, μ is called a diffusion function, and V is called a monitor space. If and only if X is incomplete, there must exist a diffusion function μ (x) which can diffuse the quantitative information obtained at point x to v.
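Based on the molecular diffusion theory, Professor Huang [19] proposed a one-dimensional normal diffusion function, as shown in the following equation:
$$f_i(u_j) = \frac{1}{h\sqrt{2\pi}} \exp\left[-\frac{(x_i - u_j)^2}{2h^2}\right], \tag{1}$$
where h is the diffusion coefficient, which governs the domain of information diffusion; $x_i$ is an element of the sample set X including n samples, i = 1, 2, … , n; $u_j$ is an element of the universe U including m elements, j = 1, 2, … , m. Based on the “average distance model” and the “two-point proximity principle,” h can be calculated by the following equation:
$$h = \begin{cases} 0.8146\,(b - a), & n = 5, \\ 0.5690\,(b - a), & n = 6, \\ 0.4560\,(b - a), & n = 7, \\ 0.3860\,(b - a), & n = 8, \\ 0.3362\,(b - a), & n = 9, \\ 0.2986\,(b - a), & n = 10, \\ 2.6851\,(b - a)/(n - 1), & n \ge 11, \end{cases} \tag{2}$$
where $a = \min_{1 \le i \le n} x_i$; $b = \max_{1 \le i \le n} x_i$.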
From (2), it can be concluded that the value of h is mainly determined by the minimum a, the maximum b, and sample size n.
To equalize the numerical status of each set, the diffusion function is commonly normalized, as shown in the following equation:
$$C_i = \sum_{j=1}^{m} f_i(u_j). \tag{3}$$
Then, the normalized information distribution formula can be defined as follows:
$$\mu_{x_i}(u_j) = \frac{f_i(u_j)}{C_i}. \tag{4}$$
Via the abovementioned information diffusion, the single-valued sample point xi is successfully transformed into a fuzzy subset with the membership function $\mu_{x_i}(u_j)$.
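Next, let $p(u_j)$ be the estimate of the probability associated with sample point xi at $u_j$. $p(u_j)$ can be expressed as follows:
$$p(u_j) = \frac{q(u_j)}{Q}, \qquad q(u_j) = \sum_{i=1}^{n} \mu_{x_i}(u_j), \qquad Q = \sum_{j=1}^{m} q(u_j). \tag{5}$$
Then, the exceeding probability $P(u_j)$, which is the estimate used to assess disaster risk, can be computed by the following equation:
$$P(u_j) = \sum_{k=j}^{m} p(u_k). \tag{6}$$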
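To make the diffusion of a single observation concrete, the following is a minimal sketch in Python, assuming the one-dimensional normal kernel described above. The function name `diffuse_sample`, the grid, the observation 42, and the coefficient h = 10 are hypothetical values chosen only for illustration; they are not taken from the case study.

```python
import math


def diffuse_sample(x, universe, h):
    """Spread one observation x over the monitoring points in `universe`.

    Returns the normalized memberships of x at every point, which sum to 1.
    """
    # Normal diffusion kernel evaluated at every monitoring point.
    kernel = [
        math.exp(-((x - u) ** 2) / (2.0 * h ** 2)) / (h * math.sqrt(2.0 * math.pi))
        for u in universe
    ]
    # Normalization: divide by the row sum so each sample carries unit weight.
    total = sum(kernel)
    return [value / total for value in kernel]


# Hypothetical illustration: one observation of 42 on a grid with step 10.
universe = list(range(0, 101, 10))
memberships = diffuse_sample(42.0, universe, h=10.0)
peak_level = universe[memberships.index(max(memberships))]
```

In this sketch the single value 42 no longer belongs to one level only: most of its weight falls on the neighbouring levels 40 and 50, and the rest tails off with distance. Summing such rows over all samples and rescaling gives the probability estimate, and a reverse cumulative sum of that estimate gives the exceeding probability.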
3. Optimization of Diffusion Coefficient Based on Entropy
From (1) and (2), it is clear that the diffusion coefficient h significantly affects the expansion of the incomplete sample X. However, the traditional empirical calculation method, namely, (2), carries considerable uncertainty and lacks a sufficient theoretical basis. Information entropy is capable of measuring the uncertainty and the randomness of a system. It is mainly expressed through the probability distribution to quantitatively describe the information capacity of the system. The larger the entropy, the higher the uncertainty of the system. In other words, the more information we know about a system, the less uncertain it is.
According to Shannon’s theorem [39], the information entropy function can be expressed as follows:
$$H = -\sum_{i=1}^{n} p_i \ln p_i, \tag{7}$$
where $p_i$ is the probability of the i-th event.
With the aid of the maximum entropy principle, the maximum entropy H of the one-dimensional normal information diffusion function can be obtained from the following equation:
$$H = \ln\left(\sigma\sqrt{2\pi e}\right). \tag{8}$$
For a given sample, each random sampling event can be considered as an event with equal probability. Then, the entropy reaches a maximum, as expressed in the following equation:
$$H = \ln n. \tag{9}$$
According to (8) and (9), σ can be expressed as follows:
$$\sigma = \frac{n}{\sqrt{2\pi e}}. \tag{10}$$
Because h = σ · Δn, where the average width $\Delta_n = (b - a)/(n - 1)$, the diffusion coefficient h can be further modified as follows:
$$h = \frac{n\,(b - a)}{(n - 1)\sqrt{2\pi e}}. \tag{11}$$
4. Urban Regional Risk Assessment Model Based on Information Diffusion
Urban regional risk assessment is a complicated issue commonly with incomplete information and numerous uncertainties [2, 28]. Fortunately, the information diffusion and information entropy are precisely the way to solve such problems. Therefore, in this study, the two methods are combined to assess regional risk in relation to urban public safety.
In the process of urban regional risk assessment, information asymmetry and insufficiency are often encountered. Therefore, it is difficult to raise the urban risk assessment from the “point” level of various hazards to the “face” of the region. As a set-valued fuzzy mathematical method, information diffusion theory is commonly used for risk assessment of small sample systems. In view of fuzzy set theory, the probability distribution can be regarded as a mapping from events to probability values. Accordingly, the single sample point can be processed by fuzzy mathematics to establish an information diffusion model. The relationship among variables is constructed via the diffusion function, and then, the incomplete information is appropriately expanded to compensate for information insufficiency in small sample size problems. Thus, the probability is employed as a risk measure to evaluate the risk level or vulnerability of the hazard. Generally, information diffusion can be implemented in two ways: (1) multisource information is distributed at different control points; (2) multiple information universes are expanded to obtain the fuzzy relationships of the system. In this study, the former is adopted to construct a risk assessment model of urban hazards. The samples of urban risk indicators are regarded as incomplete sample sets. By establishing the universe of each single-valued sample of hazard, the information distributions at different risk levels are computed by using information diffusion theory. In this case, an urban regional risk assessment model based on information diffusion and maximum entropy can be established, as shown in Figure 1.

Step 1. Determine the regional risk assessment index system. Let U = {u1, u2, … , um} and X = {x1, x2, … , xn} be the universe and sample set of urban risk indicators, respectively.
Step 2. Compute the diffusion coefficient h with the aid of entropy. According to the collected sample set in Step 1, the minimum a, the maximum b, and sample size n of each indicator X are determined. Then, the entropy H can be derived from equation (9). Via equations (10) and (11), h can be computed.
Step 3. Construct the normalized information distribution formula μxi (uj). Substitute h into equation (1), and then, a one-dimensional normal diffusion function can be established. With the aid of equations (3) and (4), μxi (uj) can be defined, which can transform the single-valued sample point xi into a fuzzy subset.
Step 4. Assess the regional risk for urban public safety. Estimate the probability p (uj) and exceeding probability P (uj) associated with all risk indicators via equations (5) and (6).
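The four steps above can be sketched end to end as follows. This is an illustrative outline rather than the authors' MATLAB implementation: it assumes that the entropy-modified coefficient takes the form h = n(b − a)/((n − 1)√(2πe)), which is consistent with the value he = 27.51 reported for X1 in Section 5. The function names `entropy_coefficient` and `assess_risk` and the sample values are hypothetical.

```python
import math


def entropy_coefficient(samples):
    """Step 2: entropy-based diffusion coefficient (assumed form, see lead-in)."""
    n = len(samples)
    spread = max(samples) - min(samples)
    if n < 2 or spread == 0:
        raise ValueError("need at least two distinct observations")
    sigma = n / math.sqrt(2.0 * math.pi * math.e)
    return sigma * spread / (n - 1)


def assess_risk(samples, universe):
    """Steps 2 to 4: return (h, probability per level, exceeding probability)."""
    h = entropy_coefficient(samples)
    accumulated = [0.0] * len(universe)
    for x in samples:
        # Step 3: normal kernel; its constant factor cancels in the normalization.
        kernel = [math.exp(-((x - u) ** 2) / (2.0 * h * h)) for u in universe]
        row_sum = sum(kernel)
        for j, value in enumerate(kernel):
            accumulated[j] += value / row_sum
    # Step 4: rescale to a probability distribution over the risk levels.
    grand_total = sum(accumulated)
    probability = [value / grand_total for value in accumulated]
    # Exceeding probability: cumulative sum taken from the highest level down.
    exceeding = []
    running = 0.0
    for value in reversed(probability):
        running += value
        exceeding.append(running)
    exceeding.reverse()
    return h, probability, exceeding


# Step 1 (hypothetical data): eleven yearly observations and a grid with step 5.
samples = [12, 18, 25, 31, 40, 22, 15, 28, 35, 20, 26]
universe = list(range(0, 61, 5))
h, probability, exceeding = assess_risk(samples, universe)
```

By construction, the probabilities sum to 1, the exceeding probability at the lowest level equals 1, and it can only decrease as the risk level rises.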
5. Case Study
5.1. Database
Urban public safety risk assessment is a large and complex task that should cover the elements of urban security as fully as possible. To validate the performance of the proposed method, the statistical data of fatal urban accidents in Hangzhou city, China, were selected as test samples [40]. Hangzhou is an international tourist city and a famous national historic and cultural city. It has a total area of 16850 square kilometers and a permanent population of 12.204 million. In 2022, Hangzhou achieved a GDP of 1875.3 billion yuan. Recently, more and more international conferences and events have been held there, such as the G20 summit and the Asian Games. It is becoming a megalopolis of China and faces public safety risks far beyond those of small and medium-sized cities. Due to the rapid growth of the economy, city size, population, and traffic density, the urban safety risks in Hangzhou are becoming increasingly severe. As commonly used indicators in China, the death numbers of industrial and mining accidents (IMA), road traffic accidents (RTA), water traffic accidents (WTA), fishing vessel transportation and fishing accidents (FVTFA), and mortality per hundred million GDP (MHM-GDP) from 2005 to 2021 were studied to assess the safety production risk in the Hangzhou region. The original data for each sample are shown in Table 1.
5.2. Results
Let X1, X2, X3, X4, and X5 represent IMA, RTA, WTA, FVTFA, and MHM-GDP, respectively. Then, the sample set of urban regional risk indicators associated with Hangzhou city was first established. The sample size of each indicator was 17. The discrete domain was set as U = {U1, U2, U3, U4, U5}. All the minimum values of U1, U2, U3, U4, and U5 were set as 0. Due to the differences in loss among the five hazards, the intervals of risk levels Δ1, Δ2, Δ3, Δ4, and Δ5 should be reasonably selected to reflect the actual situation of safety production in Hangzhou. Owing to the relatively high death tolls of IMA and RTA, Δ1 = 10 and Δ2 = 50 were set to divide their risk levels. Δ3 and Δ4 were set as the minimum unit, namely, 1, due to the relatively low death numbers of WTA and FVTFA. Δ5 = 0.02 was selected according to the 17-year record of MHM-GDP. Then, U was accordingly constructed as follows:
U1 = {0, 10, 20, … , 250}, Δ1 = 10, 26 levels in total
U2 = {0, 50, 100, … , 2000}, Δ2 = 50, 41 levels in total
U3 = {0, 1, 2, … , 20}, Δ3 = 1, 21 levels in total
U4 = {0, 1, 2, 3, 4, 5}, Δ4 = 1, 6 levels in total
U5 = {0, 0.02, 0.04, … , 0.7}, Δ5 = 0.02, 36 levels in total
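As a quick check of the level counts stated above, the five discrete universes can be generated from their upper bounds and intervals. The sketch below is illustrative only; the helper name `build_universe` is hypothetical, and the bounds and intervals are the ones given in the text.

```python
def build_universe(upper, step):
    """Return the risk levels 0, step, 2*step, ..., upper."""
    count = round(upper / step) + 1
    # Multiply the index by the step instead of accumulating it, and round the
    # result, so that fractional steps such as 0.02 do not drift.
    return [round(k * step, 10) for k in range(count)]


universes = {
    "IMA": build_universe(250, 10),
    "RTA": build_universe(2000, 50),
    "WTA": build_universe(20, 1),
    "FVTFA": build_universe(5, 1),
    "MHM-GDP": build_universe(0.7, 0.02),
}
level_counts = {name: len(levels) for name, levels in universes.items()}
```

The resulting counts are 26, 41, 21, 6, and 36 levels, matching the sizes listed for U1 to U5.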
By using (11), the entropy-modified diffusion coefficients he associated with the five indicators were computed. For comparison, the corresponding traditional values h0 were also calculated by using (2). For instance, it was known from Table 1 that a1 = 30, b1 = 137, and n = 17. Via (9), H = 2.83 could be computed. Then, he = 27.51 and h0 = 17.96 associated with X1 could be obtained by (11) and (2), respectively. Similarly, the diffusion coefficients of X2, X3, X4, and X5 could be calculated, as shown in Table 2.
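The figures quoted for X1 can be reproduced with a few lines of arithmetic. The sketch below assumes that the entropy of n equally likely events is ln n, that σ = n/√(2πe), that the average width is (b − a)/(n − 1), and that the traditional coefficient for n ≥ 11 is 2.6851(b − a)/(n − 1); these forms are inferred from the surrounding definitions and agree with the reported values. The variable names are hypothetical.

```python
import math

# Minimum, maximum, and sample size of X1 (IMA) as reported from Table 1.
a, b, n = 30, 137, 17

average_width = (b - a) / (n - 1)

# Entropy of n equally likely sampling events.
entropy = math.log(n)

# Standard deviation of the normal kernel whose maximum entropy equals ln n.
sigma = n / math.sqrt(2.0 * math.pi * math.e)

# Entropy-modified coefficient and traditional empirical coefficient (n >= 11).
h_entropy = sigma * average_width
h_empirical = 2.6851 * average_width

summary = (round(entropy, 2), round(h_entropy, 2), round(h_empirical, 2))
```

Rounded to two decimals, the three quantities are 2.83, 27.51, and 17.96, so the entropy-modified coefficient spreads each sample over a range roughly 1.5 times wider than the traditional one for this indicator.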
Based on the proposed urban risk assessment model, the established sample X could be appropriately expanded to the discrete domain U. With the aid of MATLAB, the fuzzy membership functions corresponding to each indicator could be obtained as follows:
According to sample X1, the probability and exceeding probability of IMA at different risk levels were calculated by using (5) and (6), as shown in Table 3. Similarly, corresponding results of RTA, WTA, FVTFA, and MHM-GDP could also be determined according to the samples X2, X3, X4, and X5, as shown in Tables 4–7.
5.3. Discussion
In the past 20 years, the regional safety situation of Hangzhou city has improved significantly. This can be clearly observed in Table 1. The overall risk level of Hangzhou city showed a downward trend over time. However, the proportions of different hazards differed significantly from each other, as illustrated in Figure 2. Of the five indicators, the risk related to IMA was medium but showed a fluctuating trend. The risk related to RTA was high but showed a downward trend. Obviously, IMA and RTA were still the areas with high incidences of casualties and property losses, accounting for 11.15% and 87.64% of deaths, respectively. In contrast, the risks related to WTA and FVTFA were lower and showed a declining trend. Hazard prevention for these two was remarkably effective: their annual deaths had fallen to single digits or to zero. They were expected to reach the ultimate goal of being risk-free through targeted policies and technical safety measures in the future.

It was generally recognized that the diffusion coefficient was crucial to the performance of information diffusion. In this study, the entropy theory was employed to modify the traditional diffusion coefficient. Let He and H0 be the entropy values of the information diffusion estimation under he and h0, respectively. Table 8 shows the results of He and H0 derived from the two strategies. It was obvious that for any indicator X, H0 < He. This indicated that the probability of the information diffusion results computed by he was greater than that calculated by h0. The entropy-modified coefficient was more reliable and reasonable.
The proposed information diffusion method combined with entropy could successfully estimate the urban regional risk. It could be seen from Tables 3–7 that the exceeding probability P (u) associated with different hazards decreased as the risk level increased. However, the corresponding risk probability p (u) increased first and then decreased, indicating that there was a concentration area of risk levels for each hazard. Figure 3 shows the risk probability distribution curves of the five hazards obtained by using MATLAB 7.0. For comparison, the estimation results derived from the traditional diffusion coefficient were also given. It was clear that the two sets of regional risk assessment results differed from each other. For IMA and FVTFA, the highest risk levels associated with he and h0 were consistent, although there was a slight deviation in the probability p. For RTA, WTA, and MHM-GDP, both the highest risk levels and their probabilities p were different. This again demonstrated the importance of a correct diffusion coefficient. An improperly selected diffusion coefficient would lead to a large deviation in the estimated urban regional risk, which would further affect the formulation of policies and measures for urban risk prevention and control. Therefore, it was necessary to optimize the diffusion coefficient in this study.

Figure 3: Risk probability distribution curves of the five hazards: (a) IMA; (b) RTA; (c) WTA; (d) FVTFA; (e) MHM-GDP.
As shown in Figure 3(a), the risk level of 100 deaths associated with IMA in Hangzhou city had the highest probability, namely, 9.23%. As the risk level further exceeded 100, the corresponding probabilities showed a significant downward tendency. The risk level of more than 80 deaths occurred with high probability (P > 64.64%), which happened about every 1.5 years. This also meant that more than 80 deaths would occur almost every 3 years in the future. In contrast, the probability of more than 160 deaths was small (P < 14.35%), occurring about every 7 years. The peak value of the death number associated with RTA was 700 (p = 5.47%), as shown in Figure 3(b). The risk level of more than 400 deaths occurred with high probability (P > 76.06%), which happened about every 1.3 years. By comparison, the probability of more than 1000 deaths was small (P < 17.09%), and this level was only reached in 2005. Figure 3(c) demonstrated that the peak value of the death number associated with WTA was 2 (p = 12.18%). The risk level of more than 1 death occurred with high probability (P > 89.42%), about every 1.1 years. However, the probability of more than 8 deaths was small (P < 19.02%), occurring about every 5.3 years. FVTFA showed the lowest risk because the probability of zero deaths reached 70.59%, as displayed in Figure 3(d). More than 1 death per year occurred about every 2–4 years. As for MHM-GDP, the peak value was 0.08 (p = 6.91%). Figure 3(e) demonstrated that a level of more than 0.32 per year occurred with low probability (P < 14.37%), happening about every 7 years. From 2014, MHM-GDP was less than 0.08 and declined sharply. Overall, compared to the original data of urban hazards in Table 1, the abovementioned conclusions drawn by the proposed information diffusion method were basically consistent with reality.
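The recurrence intervals quoted above are consistent with reading the exceeding probability as an annual probability and taking its reciprocal. The short sketch below makes that arithmetic explicit under this assumption; the function name `return_period_years` is hypothetical, and because the reported probabilities are bounds (for example, P > 64.64%), the resulting periods are approximate.

```python
def return_period_years(exceeding_probability):
    """Approximate recurrence interval, assuming an annual exceeding probability."""
    if not 0.0 < exceeding_probability <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return 1.0 / exceeding_probability


# Exceeding probabilities as reported in the discussion of Figure 3.
reported = {
    "IMA, more than 80 deaths": 0.6464,
    "IMA, more than 160 deaths": 0.1435,
    "RTA, more than 400 deaths": 0.7606,
    "WTA, more than 1 death": 0.8942,
    "WTA, more than 8 deaths": 0.1902,
    "MHM-GDP, more than 0.32": 0.1437,
}
periods = {
    label: round(return_period_years(p), 1) for label, p in reported.items()
}
```

Under this reading, the six levels recur about every 1.5, 7.0, 1.3, 1.1, 5.3, and 7.0 years, respectively, in line with the intervals stated in the text.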
Based on the abovementioned analysis, it can also be found that a centralized distribution interval of the risk level frequency associated with each urban hazard existed, represented by S in this study. The corresponding results of 5 urban hazards are shown in Table 9. Obviously, urban risks in S were more likely to happen and had higher vulnerability. Similar to the abovementioned analysis results, IMA and RTA were the main urban hazards with relatively higher S values. It also illustrated that urban regional risks were inevitable during the rapid development of society. However, effective countermeasures could be adopted to not only reduce the likelihood of hazards but also prevent dangerous events. Therefore, based on the risk situation and risk development trend results, appropriate measures can be taken to reduce the risk. It is suggested that Hangzhou should strengthen the safety supervision of the IMA and RTA in the future.
Overall, the information diffusion method was easily carried out and capable of dealing with incomplete information events with high accuracy. It can provide guidance for the government’s urban safety management and policy-making. According to different hazards and risk levels, safety management measures can be formulated based on the actual state by resolving the estimated probabilities, so as to continuously improve the level of urban safety management.
6. Conclusion
In this study, information diffusion theory was introduced to assess regional risk for urban public safety. Meanwhile, the entropy theory was utilized to modify the diffusion coefficient to reduce the uncertainty. A framework of the urban regional risk assessment model based on information diffusion and entropy was established. The regional risk of urban public safety in Hangzhou city was studied by using the proposed method. The main conclusions can be drawn as follows.
(1) The diffusion coefficient was crucial to the performance of information diffusion. The information diffusion results derived from the entropy-modified diffusion coefficient showed less uncertainty and randomness than those of the traditional method. This could reduce the estimation bias of urban regional risk and contribute to the formulation of policies and measures for risk prevention and control. With the aid of the modified method, the urban regional risk of Hangzhou city in China was successfully estimated.
(2) In Hangzhou city, the peak risk levels of IMA, RTA, WTA, FVTFA, and MHM-GDP were 100 deaths, 700 deaths, 2 deaths, 0 deaths, and 0.08, respectively, which were basically consistent with reality. By comparison, the hazards associated with IMA and RTA were extremely serious. More than 80 deaths from IMA would occur almost every 3 years, and more than 400 deaths from RTA would occur almost every 2.6 years.
(3) Centralized intervals of the risk level associated with the five hazards in Hangzhou city could be found. Urban risks in such intervals were more likely to happen and had higher vulnerability, occurring almost every 1-2 years. Effective countermeasures could be formulated based on the actual state by using the estimated probabilities, so as to continuously improve the level of urban safety management.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Authors’ Contributions
Xinlong Zhou conceptualized the study, developed the methodology, provided the software, validated the study, investigated the study, curated the data, visualized the study, prepared the original draft, reviewed and edited the manuscript, and acquired the funding. Xinhui Ning supervised the study, validated the study, formally analysed the study, investigated the study, and acquired the funding. Dongzhu Jiang conceptualized the study and reviewed and edited the manuscript. Peipei Gao curated the data, visualized the study, and reviewed and edited the manuscript.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant no. 52108315), the Natural Science Foundation of Hubei Province of China (Grant no. 2021CFB286), the Youth Science and Technology Research Program of Hubei Education Department (Grant no. Q20211404), and the Research Fund for the Doctoral Program of Hubei University of Technology (Grant no. BSQD2020052).