Abstract
The accuracy of seismic demand models in seismic vulnerability analysis of structures or components mainly depends on the seismic intensity measures (IMs) and engineering demand parameters (EDPs). This paper proposes a novel method to obtain the optimal seismic demand model for the seismic vulnerability analysis of bridges. The method obtains the IM-EDP combination by matching all IMs and EDPs within a wide range one by one, considering the contribution of multiple IM parameters to the seismic response of the structure and avoiding the blindness of EDP selection. The IM is determined by calculating Pearson correlation coefficient and partial correlation coefficient, controlling the correlation between EDP and IM (or IMs) to a minimum to reduce the multicollinearity within the vector IMs and avoid ill-conditioned models. The optimal seismic demand model is obtained by inspecting the scatter plot and residual plot of suboptimal seismic demand models determined from all combinations by guaranteeing efficiency and sufficiency. The efficiency of seismic demand models is guaranteed by controlling the root mean square error (RMSE) and the coefficient of determination (R2). The sufficiency of models is guaranteed by controlling the slope of fitted line. A continuous rigid frame bridge with double thin-walled piers is used as a case study and a dynamic time-history analysis is performed to obtain the seismic vulnerability of bridge with the proposed method. The results show that the proposed method is feasible and ideally suited for optimizing seismic demand model.
1. Introduction
As an important part of performance-based earthquake engineering, the purpose of seismic vulnerability analysis is to predict the damage degree of structures under varying levels of earthquake intensity, whereas the results of seismic vulnerability analysis are heavily influenced by the seismic demand model of the structure [1]. Seismic demand models (SDMs) describe the functional relationship between seismic intensity measures (IMs), which are a measure of the earthquake’s intensity, and structural engineering demand parameters (EDPs), which are a measure of the magnitude of the structural response under earthquakes. On-site investigations of structural damage due to earthquakes show that, for a given earthquake, the structures in the vicinity of epicenter are not necessarily more damaged than the structures of the same type further away from the epicenter. Additionally, the structural damage caused by small earthquakes is possibly more serious than that caused by large earthquakes [2, 3]. This is because earthquake damage depends not only on the amount of energy released in the earthquakes, but also on other factors such as the hypocentral distance, the soil type of the site, and structural characteristics [4]. Therefore, the applicability of SDMs in seismic vulnerability analysis should be determined rigorously through optimization within a sufficiently wide range of IMs and EDPs.
In most current seismic standards, a single scalar value of IM, usually in the form of Peak Ground Acceleration (PGA) or Peak Ground Velocity (PGV), is specified for seismic design. These IMs were chosen considering the short vibration periods of common low-height structures, and the various simplifications afforded by some assumptions. However, some of these assumptions are too restrictive for many contemporary and modern structures [5]. It is therefore not advisable to use the IMs recommended in various technical specifications without any verification when performing seismic vulnerability analysis on modern structures [6].
The effect of IMs on structural seismic performance has attracted increasing attention of researchers around the world in the last few decades [7–10]. Several indicators for evaluating the applicability and adequacy of IMs have been proposed [11–13]. Accordingly, some IMs have been found to work better than others in specific situations [14, 15]. For example, Akkar and Özen [16] proposed that, for the inelastic deformation of the structure, PGV had more accuracy and predictive ability than PGA or PGV/PGA. On the other hand, the acceleration response spectrum parameters related to the natural vibration period of the structure have better predictive ability for long-period structures [17–19].
Since a single scalar IM is often inadequate to completely describe the damage of structures or components, a vectorial IM was proposed instead of scalar IM for structural seismic demand models [20, 21]. However, the existing research about vector IMs has some deficiencies in terms of selecting parameters (IMs and EDPs). Firstly, only the overall correlation between the vector IMs and the EDPs seems to have been emphasized, ignoring the correlation between the multiple IMs contained in a vector IM (i.e., multicollinearity). The inclusion of higher multicollinearity within one vector may result in erroneous seismic demand models by the multiple regression analysis [22]. Secondly, the selection of EDPs relies excessively on previous research experience. Most research of seismic vulnerability analysis for bridges appears to be based on previous work and subjective judgement to determine only one EDP for components. Typically, the displacement or displacement ductility ratio of pier top and the curvature or the curvature ductility ratio of pier bottom were used as the EDPs for the piers, the shear strain was used as the EDP for the bearings, and the displacement was used as the EDP for the abutments [23–25]. Buyco and Jayaram et al. [26, 27] found that the selection of EDP had a significant influence on the seismic vulnerability analysis results. Furthermore, they also found that the selected EDPs affect the efficiency and sufficiency of IMs.
Considering the complexity of the site type and structural damage mechanism, it is necessary to carry out accurate and rigorous analysis to determine the relative optimal parameters for seismic demand models. Towards that end, this paper proposes a method of seismic demand analysis for bridge structures. The crux of the method lies in the identification of the relative optimal parameters from multiple IMs and EDPs simultaneously for constructing the optimal IM-EDP combination, so as to obtain the optimal seismic demand model. The model contains 1 EDP and 1 IM (scalar IM) when used in vulnerability curve analysis, whereas 1 EDP and 2 IMs (vector IM) are used for vulnerability surface analysis. Taking a continuous frame bridge with double thin-wall piers as an example, the optimal seismic demand model of the bridge is obtained by using the method and seismic vulnerability analysis is performed.
2. Seismic Demand Analysis Method
For the seismic vulnerability analysis of bridges, it is necessary to take dominating components as the research object, construct the seismic demand model of these components, and then obtain the seismic vulnerability of bridges through some combination method. The basic methodologies for obtaining seismic vulnerability curves and seismic vulnerability surfaces are consistent with each other. Therefore, in the following, the process of obtaining the optimal seismic demand model for seismic vulnerability surfaces by the proposed method is illustrated with a bridge pier as the structure of interest.
2.1. Preliminary Selection of IMs and EDPs
A sufficient quantity of representative IMs and EDPs must be selected in the beginning of seismic demand analysis. The selected IMs should cover multiple types such as acceleration, velocity, and displacement type IMs as well as structure-related IMs, while the selected EDPs should include damage indexes at both the component and the material levels. Here, we assume that m IMs and n EDPs were preliminarily selected according to the above principles with m ≥ 3 and n ≥ 2.
2.2. Correlation Analysis
In an ideal seismic demand model, EDPs should have a sufficiently high correlation with IM parameters [28]. At the same time, it is necessary to avoid high multicollinearity within vector IM for obtaining an accurate demand relation. In order to ensure high correlation between vector IMs and EDPs while simultaneously reducing the multicollinearity within vector IMs, a simple correlation analysis was performed for IMs and EDPs, whereas a partial correlation analysis was performed for the vector IMs, as outlined next.
2.2.1. Simple Correlation Analysis
Simple correlation analysis is used to determine the level of linear correlation between two variables. In this study, the Pearson correlation coefficient, one of the most commonly used coefficients, is used as a measure of the correlation between each IM and EDPs. The Pearson correlation coefficient of matrices X and Y is defined bywhere X and Y are N-by-1 matrices, N being the number of ground motion records. and represent the sample values comprising matrices X and Y, respectively. The values of range from −1 to 1. When , the variables X and Y are completely correlated, whereas indicates that no linear correlation exists between X and Y. IMs and EDPs were calculated m × n times. According to (1), m × n Pearson correlation coefficients can be obtained. Then, sequencing the IMs according to the correlation with every EDP, n sequences are obtained. The process is shown in Figure 1.

Finally, removing the last k IMs in each sequence that have no significant correlation with all EDPs, m-k IMs are left (it is generally believed that there is a significant correlation when ).
2.2.2. Partial Correlation Analysis
All IMs are calculated based on identical ground motion records, which results in a correlation between different IMs. Therefore, there could be high multicollinearity in the vector IMs. In order to find weakly correlated IMs to construct vector IM, a partial correlation analysis was performed for calculating the correlation between two discretionary IMs, excluding the influence of other IM parameters. Assuming eliminating the influence of , the partial correlation coefficient between IM1 and IM2 is defined bywhere is the i-th row and j-th column element of the inverse ∑-1 of the covariance matrix ∑ composed of m-k variables. The values of range from 0 to 1. When , variables X1 and X2 are said to be completely correlated, whereas indicates that the variables are not linearly correlated. The partial correlation coefficients between m-k number of IMs were calculated according to (2) to obtain partial correlation coefficients.
To take just one of the n sequences as an example, the correlation between IMs and the EDP from strong to weak is assumed as IM1>IM2>...>IMm-k. We begin by comparing the partial correlation coefficients successively between IM2, ... IMm-k and IM1, and eliminating the IMs that are most closely related to IM1. The remaining IMs along with IM1 constitute the vector IM successively. If IM2, ... IMm-k are all significantly related to IM1, then IM1 is discarded and the partial correlation coefficients between IM3, ... IMm-k and IM2 are compared in turn. The above steps are repeated for the entire range of the variables in the vector IM.
Suppose that a total of t vector IM-EDP combinations are finally selected at the end of the above procedure. The procedure ensures that these t combinations have little multicollinearity within any vector IM.
2.3. Multiple Linear Regression Analysis
Assume that a vector IM consists of IM1 and IM2, and ln (IM) obeys a linear relation with ln (EDP). The relationship between the vector IM and EDP can then be represented by a multiple linear regression model (in (3)), wherein RMSE (root mean square error) and R2 (coefficient of determination) are calculated by (4) and (5), respectively.
Here, is the logarithmic regression expectation of EDPs, is the logarithmic sample value of EDPs, b0, b1, and b2 are computing coefficients, and is the logarithmic sample value of IMs. RMSE is a measure of how concentrated the data is around the line of best fit, which reflects the accuracy in the calculation of the linear fitting of sample data. The RMSE value is always greater than 0, and the smaller its value, the more accurate the calculation. The value of R2 indicates the proportion of the total variation of the EDPs that can be interpreted by the IMs, which is a measure of the quality of a regression equation.
Multiple linear regression analyses of t number of vector IM-EDP combinations were performed according to (3), and t seismic demand models were obtained, from which t values of RMSE and R2s were subsequently obtained using (4) and (5), respectively.
2.4. Efficiency and Sufficiency
In order to evaluate the goodness of fit of the t seismic demand models, three indexes (RMSE, R2, and bi) were used in the following two standards to select the models with ideal efficiency and sufficiency simultaneously.
2.4.1. Standard 1: Judge Efficiency
The t seismic demand models were ranked according to the values of RMSEs and R2s, respectively. In the first sequence, the models are arranged with the RMSEs in ascending order, whereas in the second sequence, the models are arranged with R2s in the descending order. The upper and lower limits of RMSE and R2 are, respectively, defined, and then the models with RMSE greater than the specified upper limit along with the models with R2s less than the specified lower limit are removed. The models that remain have an R2 value larger than the specified lower limit, as well as an RMSE value smaller than the specified upper limit, and hence these models are characterized by better efficiency. Let us assume that, out of the t models, only p seismic demand models which ranked the highest in the efficiency criterion are selected.
2.4.2. Standard 2: Judge Sufficiency
The remaining p seismic demand models are then sorted in the descending order of slopes b1 and b2. Two separate limit values are defined for b1 and b2. The models for which both b1 and b2 are greater than the specified limits are discarded. The remaining models with larger b1 and b2 values have better sufficiency.
2.5. Determine the Optimal Seismic Demand Model
No index can guarantee an absolute validity of a linear regression equation but can only serve as an auxiliary tool to increase the efficiency in the workflow. Some contingencies could invalidate RMSE, R2, and b, resulting in the wrong model (Figure 2). Furthermore, the hypothesis that there exists a linear relationship between the logarithms of IM and EDP may not hold true for all parameters. It is therefore important to ascertain that the scatter plots and residual plots meet certain tolerance requirements. The ideal seismic demand model should be such that the points in the scatter plot are evenly distributed along the regression line. The points in the residual plot must not show heteroscedasticity; that is, the distribution of each point is scattered without any discernible variation or pattern. Once these requirements are met, the optimal seismic demand model is determined, which has larger R2, b values and a smaller RMSE value.

(a)

(b)

(c)

(d)
The flowchart of the above five steps is shown in Figure 3.

3. Seismic Vulnerability Analysis of Bridge Components
The seismic vulnerability of bridge components is the probability that the component response exceeds the specified seismic capacity of the component under earthquakes. The seismic vulnerability surface based on vector IM is given bywhere is the failure probability of the bridge components, is the unary standard normal cumulative distribution function given by , and is the seismic capacity of the bridge components corresponding to the damage level l, whereas and are the logarithmic standard deviations of the seismic capacity and seismic demand of components, respectively.
4. Case Analysis
4.1. Bridge Model
4.1.1. Engineering Background
A four-span prestressed concrete continuous rigid frame bridge (66 + 2 × 120 + 66 m) is used for illustration. The width of the bridge deck is 27.5 m. The superstructure of the bridge adopts a double-box single-chamber section. The beam depth is 2.86 m at the midspan and support positions, and 7.26 m at the pier top. The concrete strength of beam is 55 MPa. The piers are of a double thin-walled form (7.25 × 1.3 m) with 50 MPa concrete and 335 MPa steel reinforcement. The size of the platforms is 27 × 12.6 × 5 m (length × width × height), with a concrete strength of 35 MPa. The foundation system of piers is comprised of 12 cylindrical piles with the diameter of 2.2 m under each platform using 35 MPa concrete. The abutments are of the U-shaped form, and four basin rubber bearings are arranged on each abutment. The width of the expansion joint between the beam and abutment is 0.24 m. The dimensions of the bridge are shown in Figure 4.

According to Chinese seismic design code, the seismic fortification intensity of the bridge is VI degree, the designed PGA is 0.05 g, the seismic characteristic period value is 0.35 s, and the site soil type is medium hardness soil.
4.1.2. Finite Element Model
The finite element model of bridge was constructed in the OpenSEES program [29]. The superstructure and platform were simulated with elastic beam-column elements. The section of piers was modeled with fiber section, and the concrete and steel of piers were simulated with Concrete02 and Steel02, respectively. The elements of piers were simulated with force-based nonlinear beam-column elements. The beam was rigidly connected with the top of piers. The piles at the pier foundations were idealized as equivalent zero-length soil springs to consider the interaction between piles and soil. The bearings were simulated with the Bouc-Wen hysteresis elements available in the OpenSEES program [30]. Considering the influence of the abutment piles and the backfill on the abutment stiffness, zero-length elements were used to simulate the abutments. Using gap material and truss elements, the collision behavior of expansion joint was modeled. The finite element model of the bridge is shown in Figure 5.

4.1.3. Model Parameters
The yield displacement and yield force of bearings were set to 0.004 m and 112 kN, respectively [31]. The horizontal stiffness of bearings was taken to be 26353 kN/m. The abutment’s stiffness was calculated in accordance with the Caltrans seismic design criteria [32]. The longitudinal and lateral stiffness of abutments were calculated to be 1.2 × 105 kN/m and 1.036 × 105 kN/m, respectively. The collision stiffness of the expansion joint was taken as 1/3 of the minimum line stiffness of the beam, and the yield force was taken as the yield force of the abutment [33]. The collision stiffness and yield force were 1 × 106 kN/m and 3.05 × 107 kN, respectively. The pile’s stiffness was calculated according to Chinese code [34]. The equivalent stiffness of piles is shown in Table 1.
4.2. Ground Motion Records
145 ground motion records were selected from the NGA-West2 strong earthquake database of the Pacific Earthquake Engineering Research Center (PEER) (Table 2). The acceleration response spectrum with 5% damping ratio and the parameter distribution of records are shown in Figures 6 and 7, respectively.(1)Shear wave velocity of 30 m soil layer (Vs30): 200 m/s < Vs30 < 540 m/s,(2)Magnitude (Mw): 5 < Mw < 8,(3)5%–95% energy duration (Td95): 0 < Td95 < 50,(4)Source distance (Rjb): 0 < Rjb < 250.


4.3. Seismic Demand Analysis
Dynamic time-history analysis for the rigid frame bridge with double thin-walled piers was performed with selected ground motion records on the OpenSEES program. The 3# pier was chosen as an example to illustrate the application of the method.
4.3.1. Preliminary Selection of IMs and EDPs
According to previous research, four engineering demand parameters of piers were used for the seismic demand analysis of 3# pier [35–38]. The displacement of the pier-top (u), the curvature of the pier-bottom (κb), and pier-top (κt) and the strain in the longitudinal steel bar (ε) of the piers were selected. In order to make the coverage of IMs as wide as possible, twenty-four IMs were selected based on a closely related research [5,10]. The selected IMs are listed in Table 3.
4.3.2. Calculation of Correlation Coefficients
(1) Pearson Correlation Coefficients. The linear correlation between each IM and EDPs was obtained by calculating the Pearson correlation coefficients between them using Equation. (1). The results of Pearson correlation coefficient are shown in Figure 8. Figure 8(a) shows sequences of IMs arranged in order of their correlation with the different EDPs, wherein the red portion indicates correlation coefficients greater than 0.9, blue portion indicates correlation coefficients between 0.8 and 0.9, and the black portion indicates correlation coefficients less than 0.8. Figure 8(b) shows the linear correlation between 24 IMs and 4 EDPs. In general, u has a good correlation with most IMs while ε has the worst correlation with most IMs. Compared with the frequently used PGA, the performance of PGV is more satisfying. Here, using 0.8 as the limit and not considering ε, the IMs that have correlation coefficients with EDPs greater than 0.8 in each sequence were selected as the objects for further analysis. In the end, 15 IMs with the highest correlations were selected among the 24. These are shown with bold dotted lines in Figure 8(b).T1 is natural vibration periods of the first order and T2 is natural vibration periods of the second order.

(a)

(b)
(2) Partial Correlation Coefficients. In order to select the IMs that have the least correlation with each other to form the vector IMs, the partial correlation coefficients between different IMs were calculated using (2). The results of the partial correlation analysis are listed in Table 4, wherein the three sequences from the last step are placed on the top-left. In the table, this is shown with different colors according to the values of the partial correlation coefficients. Here, considering an upper limit of 0.1, only those IMs with a partial correlation coefficient less than 0.1 were selected as candidates. Among these IMs, those with the highest correlations with EDPs in each sequence were identified to form vector IMs. The specific operation involved adding the rankings of any two IMs within the same sequence. A smaller sum indicated a stronger correlation between the vector IM and the EDP, signifying a more satisfactory result. In this particular case study, the vector IMs with a sum less than 7 were selected for combination with EDPs and a total of 14 combinations of vector IM-EDP were obtained (Table 5).
4.3.3. Efficiency and Sufficiency
First, a binary linear regression analysis was performed on the 14 combinations to obtain 14 binary linear regression equations (i.e., seismic demand models). Each model contains two slopes, b1 and b2. Then, the corresponding RMSE and R2 were calculated using (4) and (5), respectively.
The RMSE, R2, and slopes b1 and b2 are shown in Figure 9. A combination with a large R2 and a small RMSE indicates better efficiency. On the other hand, larger values of b1 and b2 in a combination signify better sufficiency. Here, the limits of R2, RMSE, and b are set to 0.8, 0.5, and 0.5, respectively.

According to Figure 9, C11, C12, C13, and C32 are highly effective (R2 values are greater than 0.8, RMSE values are less than 0.5); however, one of the combinations has one of the b values much greater than the other, which indicates that one of the IMs in this combination did not significantly affect the structural damage. C21, C22, C23, and C24 have large RMSE values. The b1 and b2 values of C31, C33, C41, C42, and C43 are all less than 0.5. Therefore, C14 was chosen as the candidate of the optimal earthquake demand model, and further investigation was conducted.
4.3.4. Determination of the Optimal Seismic Demand Model
The purpose of this step is to determine the optimal seismic demand model through visual inspection of the scatter plot and the residual plot of C14 (Figure 10). In the two-dimensional and three-dimensional scatter plot of C14 model, all the data points lie sufficiently close to the regression line. This shows that there is a prominent linear correlation between the EDP (u) and IMs (VSI and SA (T2)). The b1 and b2 values are greater than 0.5 in Figure 10(a), while in Figure 10(b), b1 and b2 values are 0.91 and 1.2, respectively. This shows that IM has an adequately large impact on the EDP. The points in Figure 10(c) represent residual values and the two curves represent the 95% prediction interval of the residual values. Wherever the curves exceed the red dashed lines at a point, this indicates that the point is an outlier. It can be noted that although there are 7 outliers, the residual values are evenly distributed on both sides of the central axis, which indicates the seismic demand model is reliable. Therefore, the C14 model was selected as the optimal seismic demand model.

(a)

(b)

(c)
To summarize the implementation of the proposed method to select the optimal seismic demand model, first, IM and EDP were matched. Then the vector IM and EDP combined by the two IMs were matched again, and the vector IM-EDP combinations with better performance were selected. Then, the efficiency and sufficiency of the seismic demand models obtained from these combinations were inspected to select the best seismic demand model. Finally, the scatter plot and residual plot of the selected model were examined for confirming it as the optimal seismic demand model. For the bridge example in this article, the displacement of pier-top (u) is the best EDP, whereas the combination of PGA and u recommended by the Chinese standard does not perform well as the vector IM.
In the optimal seismic demand model obtained, both VSI and SA (T2) have the high correlation with u (Figure 10(b)), and the vector IM comprised of VSI and SA (T2) still matches u well (Figure 10(a)). In addition, VSI and SA (T2) have very little correlation (Table 4) among themselves. This shows that the past seismic demand model including a single IM is bound to miss some important information.
Finally, the seismic vulnerability surface of the 3# pier was obtained by substituting the seismic demand model of C14 into (6) (see Figure 11). The damage state of bridge piers was divided into 4 types: slight damage, moderate damage, serious damage, and collapse damage, corresponding to u = 0.3 m, 0.4 m, 0.8 m, and 1.4 m, respectively.

(a)

(b)

(c)

(d)
It can be seen from Figure 11 that the seismic vulnerability surface reflects the influence of two IM parameters on the probability of bridge component failure, which contains more information than the corresponding seismic vulnerability curve. In the example, VSI and SA (T2) have the same degree of influence on the seismic vulnerability of bridge piers. If SA (T2) calculated from a certain ground motion record is large while VSI is small, the bridge piers may not be severely damaged under the action of the earthquake.
5. Conclusions
This paper presented a method to select the optimal seismic demand model for bridge. The method obtained the optimal seismic demand model containing one EDP and two IMs (vector IM), considering the contribution of multiple IM parameters to the seismic response of the structure. Both IM and EDP parameters were obtained through optimization within a wide range, thus avoiding the blindness of parameter selection in the seismic demand model. Compared to the current analysis method, the proposed method has the following advantages:(1)The proposed method accounts for the fact that a single IM cannot completely reflect the damage of bridge components. When bridge structures enter the plastic stage under strong earthquakes, the dynamic behavior of the structure changes, often resulting in longer natural vibration periods. A single IM may not be adequate for describing damage in the different damage regimes of the structure. The proposed method can solve this problem by determining multiple IMs through a rigorous optimization of parameters.(2)The proposed method is universally applicable, but the optimal seismic demand model calculated in the present study has the best applicability to the analyzed bridge. Because the earthquake response mechanism of different types of bridges at different sites may not be consistent, empirically adopting IMs and EDPs based on previous experience may cause significantly erroneous results. The proposed method solves this problem by optimizing multiple EDPs and IMs within the appropriate scope from the calculation results of the analyzed bridge to obtain the optimal seismic demand model.(3)Compared to existing methods, the constructed vector IMs guarantee a stronger correlation between IMs and EDPs and reduce the multicollinearity within the vector IMs. By calculating Pearson correlation coefficient and partial correlation coefficient, the correlation between EDP and IM (or IMs) is controlled to a minimum to avoid ill-conditioned models.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
The authors gratefully acknowledge the support of the Natural Science Foundation of China (Grant no. 51808376) and Jiangxi Youth Science Foundation (Grant no. 20192BAB216032). The authors also acknowledge Dr. Jing Liu who made valuable suggestions to increase the technical quality of the paper. The authors also express their thanks to Pacific Earthquake Engineering Research Center (PEER) for opening the NGA-West2 database for providing ground motion records.