Abstract

Flood disaster is one of the natural disasters that cause the most serious economic losses, the most casualties, and the greatest social impact. Flood frequency analysis is very important for reducing flood disasters. In this paper, based on the flood data of the Manas River and the Box–Cox and Johnson normal transformations, the nonparametric statistical method for flood frequency analysis is studied in order to analyze its adaptability to rivers in the arid region of north-western China. The goodness-of-fit results are divided into two parts: high flood discharge and low flood discharge. For each of the two evaluation indexes, the best-fitting method is credited with an advantage, and the number of advantages of the three methods in each part is counted. The analysis shows that, for the flood peak discharge frequency of rivers in the arid region of north-western China, the frequency curve of Johnson transformation fits the empirical data best, with 6 advantages at high flood discharge and 4 at low flood discharge. For the flood volume frequency of rivers in the arid region of north-western China, Box–Cox transformation fits the empirical data well on the high flood discharge part of the frequency curve, with 12 advantages; Johnson transformation fits the empirical data better on the low flood discharge part of the frequency curve, with 12 advantages. Therefore, using the P-III distribution and the normal transformation together is a way to improve the precision of flood frequency analysis.

1. Introduction

In recent years, due to the impact of global climate change and human activities, extreme hydrologic disasters have increased greatly in most countries and regions, resulting in heavy economic and ecological losses and even loss of life and property [1]. As an extreme hydrologic event, flood disaster is one of the natural disasters that cause the most serious economic losses, the most casualties, and the greatest social impact [2]. As one of the main methods for accurately estimating the design value of hydrologic variables, flood frequency analysis is very important for reducing flood disasters. At present, flood frequency analysis is widely used in the field of hydrologic design [3]. The flood frequency analysis methods currently used in China fall mainly into two categories: the parametric statistical method and the nonparametric statistical method. The parametric statistical method presumes a form for the flood frequency distribution, estimates the parameters of the population distribution from the samples, and obtains the design value at the specified frequency from the population distribution [4]. When the assumed distribution form is reasonable, the parametric statistical method can extract more information from the samples and the calculation results are more accurate. However, if the assumption is unreasonable, the calculation accuracy is reduced. The nonparametric statistical method avoids the error caused by the assumed distribution form and calculates the design value at the specified frequency directly from the measured flood samples. The nonparametric statistical method is more flexible and robust than the parametric statistical method. It is also a current research hotspot and provides another approach to hydrologic frequency analysis [5–10].
The normal transformation transforms the original skewed distribution of the sample into the normal distribution and then uses the inverse transformation to calculate the design value of hydrologic variables (normal quantile) at the specified frequency. This process does not involve a parameter calculation method and takes the normal distribution as an intermediate medium, so in theory it belongs to the nonparametric statistical method [11–13]. Relevant research has shown that, after the normal transformation of a single variable, the mapping from the original skewed distribution to the normal distribution is a one-to-one, monotonically increasing relationship, and the series obtained by the normal transformation retains the sample information of the original skewed distribution more completely [14, 15]. Chen and Song [16] also pointed out that the design values obtained by the normal transformation fit the measured series well. So, the normal transformation can be used in hydrologic frequency analysis.

The normal transformation has seldom been applied to the calculation of flood frequency in arid regions. Johnson transformation is mostly used in quality management statistics [17] and in processing nonnormal statistical problems [18]. Box–Cox transformation is often used to reduce skewness and heteroscedasticity in linear regression, and it is more suitable for hydrology than Johnson transformation, as in the research of Liang and Dai [11], Li et al. [19], and others. The Manas River is a typical river in the arid region of north-western China, where extreme hydrologic events often occur. In this paper, based on the flood data of the Manas River and the Box–Cox and Johnson normal transformations, the nonparametric statistical method for flood frequency analysis is studied in order to analyze its adaptability to rivers in the arid region of north-western China.

2. Materials and Methods

2.1. Box–Cox Transformation

Box–Cox transformation is a normal transformation model proposed by Box and Cox in 1964. The model is [20–25]

$$y_i = \begin{cases} \dfrac{x_i^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\ \ln x_i, & \lambda = 0. \end{cases} \tag{1}$$

The inverse transformation of the model is

$$x_i = \begin{cases} (\lambda y_i + 1)^{1/\lambda}, & \lambda \neq 0, \\ \exp(y_i), & \lambda = 0, \end{cases} \tag{2}$$

where $x_i$ is the original series to be transformed; $y_i$ is the output series after transformation; and λ is the transformation parameter, λ ∈ [−5, +5]. The best λ is the value in [−5, +5] that minimizes the standard deviation of the Z series defined by equation (3) [26]. When λ = 0, the transformation is the logarithmic transformation; λ = −1 gives the reciprocal transformation, and λ = 0.5 gives the square root transformation.

$$Z = \begin{cases} \dfrac{X^{\lambda} - 1}{\lambda\, G^{\lambda - 1}}, & \lambda \neq 0, \\ G \ln X, & \lambda = 0, \end{cases} \tag{3}$$

where $G$ is the geometric mean of the original series, X is the original data series to be transformed, and Z is the output series after transformation.
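The transformation, its inverse, and the selection of λ described above can be sketched in a few lines. This is a minimal illustration, assuming the standard Box–Cox form and the geometric-mean-normalized series Z; the grid step of 0.1, the function names, and the sample values are hypothetical and not taken from the paper.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of a list of positive values."""
    if lam == 0:
        return [math.log(v) for v in x]
    return [(v ** lam - 1.0) / lam for v in x]

def inv_box_cox(y, lam):
    """Inverse Box-Cox transform."""
    if lam == 0:
        return [math.exp(v) for v in y]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in y]

def normalized_std(x, lam):
    """Sample standard deviation of the geometric-mean-normalized series Z."""
    g = math.exp(sum(math.log(v) for v in x) / len(x))  # geometric mean
    if lam == 0:
        z = [g * math.log(v) for v in x]
    else:
        z = [(v ** lam - 1.0) / (lam * g ** (lam - 1.0)) for v in x]
    mean = sum(z) / len(z)
    return math.sqrt(sum((v - mean) ** 2 for v in z) / (len(z) - 1))

def best_lambda(x, step=0.1):
    """Grid search over [-5, 5] for the lambda with the smallest std of Z."""
    n_steps = int(round(10.0 / step))
    grid = [round(-5.0 + i * step, 10) for i in range(n_steps + 1)]
    return min(grid, key=lambda lam: normalized_std(x, lam))

# Hypothetical annual flood peaks (m3/s), for illustration only.
peaks = [210.0, 255.0, 298.0, 334.0, 371.0, 420.0, 515.0, 758.0]
lam = best_lambda(peaks)
transformed = box_cox(peaks, lam)
print("best lambda:", lam)
```

A grid search is used here only because it is easy to read; any one-dimensional minimizer over the same interval would serve the same purpose.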

Box–Cox normal transformation requires that each item of the series to be transformed be greater than 0, that is, xi > 0. Each item of a hydrologic series is greater than 0, so the series meets the transformation requirement.

2.2. Johnson Transformation

Johnson transformation is a normal transformation model based on three distribution curves proposed by Johnson in 1949. The model is shown in Table 1 [27–30].

In Table 1, x is the original series to be transformed; y is the output series after transformation; ε and γ are position control parameters; and λ and η are scale parameters, which are generally positive. In the Johnson normal transformation, the parameters to be estimated are calculated by the method proposed by Hill et al. [31] and Chou et al. [32–34].
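For orientation, the three Johnson families (SB, SL, and SU) can be sketched with their forward and inverse transformations. The forms below are the ones commonly given in the literature and are an assumption here, since Table 1 is not reproduced in this text; the parameter values used at the end are hypothetical.

```python
import math

def johnson_su(x, gamma, eta, eps, lam):
    """S_U (unbounded): y = gamma + eta * asinh((x - eps) / lam)."""
    return gamma + eta * math.asinh((x - eps) / lam)

def johnson_su_inv(y, gamma, eta, eps, lam):
    return eps + lam * math.sinh((y - gamma) / eta)

def johnson_sl(x, gamma, eta, eps):
    """S_L (lognormal): y = gamma + eta * ln(x - eps), requires x > eps."""
    return gamma + eta * math.log(x - eps)

def johnson_sl_inv(y, gamma, eta, eps):
    return eps + math.exp((y - gamma) / eta)

def johnson_sb(x, gamma, eta, eps, lam):
    """S_B (bounded): requires eps < x < eps + lam."""
    return gamma + eta * math.log((x - eps) / (lam + eps - x))

def johnson_sb_inv(y, gamma, eta, eps, lam):
    t = math.exp((y - gamma) / eta)
    return eps + lam * t / (1.0 + t)

# Hypothetical parameters, chosen only to exercise the functions.
x0 = 400.0
y0 = johnson_su(x0, gamma=0.5, eta=1.2, eps=200.0, lam=100.0)
print(y0, johnson_su_inv(y0, gamma=0.5, eta=1.2, eps=200.0, lam=100.0))
```

Each inverse maps a normal quantile back to the original scale, which is the step used later to obtain design values.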

2.3. Study Area

The Manas River is located at the northern foot of the middle section of the Tianshan and on the southern edge of the Junggar Basin. It is the largest river on the northern slope of the Tianshan and originates in the Erenhabirga Mountains on that slope. It is about 324 km long, and its drainage area is about 5156 km² [35]. Kenswat Hydrologic Station, built in 1955, is located on the middle reaches of the Manas River and controls the flow of the river (Figure 1). The hydrologic data have been compiled and reviewed by the Hydrology and Water Resources Bureau and have reliable accuracy [36].

2.4. Data Acquisition and Processing

In this paper, the measured flood data of Kenswat Hydrologic Station, which has the longest observation record on the Manas River, are used as the measured series. Kenswat reservoir is one kilometer upstream of Kenswat Hydrologic Station. Construction of the reservoir started on August 7, 2009, and impoundment officially began on December 6, 2014. Since the impoundment of the reservoir, the consistency of the hydrologic data of Kenswat Hydrologic Station has been disrupted. Therefore, the data period selected in this paper is from 1955 to 2014.

According to the data analysis of the station, the average annual flood peak discharge of the Manas River is 356 m³/s. The measured maximum flood peak discharge is 1095 m³/s (August 2, 1999), the second largest is 758 m³/s (July 28, 1966), and the third largest is 735 m³/s (July 18, 1996). The annual maximum sampling method is used to select flood peak discharge samples, the unified sample method is adopted, and the Weibull equation is used to determine the empirical frequency. Because the Manas River is a small river, 1 d, 3 d, and 5 d are used as the flood volume calculation periods when selecting flood volume samples.
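The empirical frequency step can be illustrated as follows. The sketch assumes the usual Weibull plotting position, P = m/(n + 1), where m is the rank of an item in descending order and n is the sample size; the function name and the sample values are hypothetical.

```python
def weibull_frequency(sample):
    """Return (value, empirical exceedance frequency) pairs, largest value first."""
    ordered = sorted(sample, reverse=True)
    n = len(ordered)
    # Rank m starts at 1 for the largest item, so P = m / (n + 1).
    return [(value, rank / (n + 1)) for rank, value in enumerate(ordered, start=1)]

# Hypothetical annual maxima (m3/s), for illustration only.
for value, freq in weibull_frequency([300.0, 500.0, 400.0, 200.0]):
    print(f"{value:7.1f}  P = {freq:.2f}")
```

With this formula no item is assigned a frequency of exactly 0 or 1, which keeps the largest and smallest observations plottable on a probability scale.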

3. Results and Discussions

3.1. Detection and Correction of the Mutation
3.1.1. Detection of the Mutation

(1) Flood Peak Discharge Series. To ensure the accuracy of the mutation detection, three methods are applied at the same time, and the mutation point is finally determined by comprehensive analysis.

According to calculation and analysis, the skipping mutation point given by the Lee–Heghinian test is 1995, and the skipping mutation point given by the ordered clustering test and the sliding T test is 1993. Combining other literature with the test results, the mutation point of the flood peak discharge series is determined to be 1993, and the series is divided into two subseries at the mutation point. The subseries before the mutation point shows a downward trend, and its average value is 334.08 m³/s. The subseries after the mutation point also shows a downward trend, and its average value is 442.87 m³/s. The average values of the two subseries are quite different, showing a skipping change. According to calculation and analysis, the main mutation type of the flood peak discharge series is identified as skipping mutation, and the results of the mutation detection are shown in Figures 2(a) and 2(b).
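One of the three detection methods, the ordered clustering test, can be sketched briefly. The sketch assumes the common formulation in which the candidate split point is the one that minimizes the sum of squared deviations within the two resulting subseries; the function name and the sample series are hypothetical, and the paper's own implementation may differ.

```python
def ordered_clustering_point(series):
    """Return tau, the number of items in the first subseries, that minimizes
    the total within-subseries sum of squared deviations."""
    def ssd(segment):
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    n = len(series)
    # tau runs from 1 to n - 1 so that both subseries are non-empty.
    return min(range(1, n), key=lambda tau: ssd(series[:tau]) + ssd(series[tau:]))

# Hypothetical series with an upward jump after the fourth item.
demo = [10.0, 11.0, 9.0, 10.0, 20.0, 21.0, 19.0, 20.0]
tau = ordered_clustering_point(demo)
print("split after item", tau)
```

In practice the split found this way is only a candidate, which is why it is compared with the results of the other tests before a mutation point is accepted.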

(2) Flood Volume Series. The method of detecting the mutation of the flood volume series is the same as that of the flood peak discharge. The test results are shown in Figures 2(c)–2(h).

It can be seen from Figures 2(c)–2(h) that the mutation points of the maximum 1 d, 3 d, and 5 d flood volume series are all in 1993, which is consistent with the test results for the flood peak series. So, it can be determined that the main mutation type of the Manas River flood series is skipping mutation and that the mutation point is in 1993.

3.1.2. Correction of the Mutation

(1) Flood Peak Discharge Series. In this paper, the decomposition synthesis theory proposed by Xie Ping is adopted as the method of correcting the skipping mutation, and the calculation process follows previous research [37–39].

According to the decomposition synthesis theory proposed by Xie Ping, the skipping mutation is corrected by equation (4), which relates the item of the series after correcting the mutation to xk, the item of the series before correcting the mutation; j is the year of mutation (here, 1993).

Based on the subseries before the mutation point, the subseries after the mutation point is corrected by equation (4). The corrected synthetic series is tested again, and no mutation point is found, so the corrected series meets the premise of consistency. The corrected flood peak discharge series is shown in Figure 3(a).
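The idea of correcting the later subseries against the earlier one can be illustrated with a simple mean shift. This is only a sketch under a stated assumption: equation (4) is not reproduced in this text, so the code assumes the simplest form of jump correction, in which the difference between the two subseries means is subtracted from every item after the mutation point. The authors' equation may differ, and the function name and sample values are hypothetical.

```python
def correct_jump(series, j):
    """Shift the items from index j onward so that the later subseries has the
    same mean as the earlier one (assumed mean-shift correction)."""
    before, after = series[:j], series[j:]
    shift = sum(after) / len(after) - sum(before) / len(before)
    return before + [v - shift for v in after]

# Hypothetical series with a jump of 10 units after the third item.
corrected = correct_jump([1.0, 2.0, 3.0, 11.0, 12.0, 13.0], j=3)
print(corrected)
```

Under this assumption, the subseries means reported above (334.08 m³/s and 442.87 m³/s) would correspond to a shift of 108.79 m³/s.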

(2) Flood Volume Series. The method of correcting the mutation of the flood volume series is the same as that of the flood peak discharge. The corrected results are shown in Figures 3(b)–3(d).

3.2. Normal Detection and Transformation
3.2.1. Normal Detection

(1) Flood Peak Discharge Series. Normal detection is carried out on the flood peak discharge series after correcting the mutation. The detection methods are the nonparametric Shapiro–Wilk test (W test) and the normal P–P diagram method. The normal P–P test results are shown in Figure 4(a), and the W test results are shown in Table 2.
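The coordinates of a normal P–P diagram can be computed as in the sketch below, which pairs the empirical cumulative probability of each ordered item with the probability given by a normal distribution fitted to the sample. The plotting position (i − 0.5)/n is an assumption, since statistical packages differ on this detail, and the function name and data are hypothetical. The W test itself is not implemented here.

```python
from statistics import NormalDist, mean, stdev

def pp_points(sample):
    """Return (empirical probability, fitted normal probability) pairs."""
    ordered = sorted(sample)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    return [((i - 0.5) / n, fitted.cdf(v)) for i, v in enumerate(ordered, start=1)]

# Hypothetical sample; points close to the diagonal suggest approximate normality.
points = pp_points([1.0, 2.0, 3.0, 4.0, 5.0])
largest_gap = max(abs(emp - fit) for emp, fit in points)
print(points, largest_gap)
```

The further the points depart from the diagonal, the stronger the case for transforming the series before a normal distribution is used.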

According to Figure 4(a) and Table 2, the normality of the flood peak discharge series is not significant, so the flood peak discharge frequency cannot be calculated directly from the normal distribution. Therefore, the skewed series is transformed into a normal series by the Box–Cox transformation and the Johnson transformation, and the normal test results of the transformed flood peak discharge series are shown in Figures 5(a) and 5(b).

It can be seen from Figures 5(a) and 5(b) and the results of the normal transformation of the corrected flood peak discharge series that the effect of the Box–Cox normal transformation is poor and the effect of the Johnson normal transformation is better.

(2) Flood Volume Series. For the normal test of the flood volume series, the method is the same as that of the flood peak discharge.
(I) Maximum 1 d flood volume series. The normal test is performed on the maximum 1 d flood volume after correcting the mutation, and the test results are shown in Figure 4(b) and Table 2. The normal test results of the maximum 1 d flood volume series after normal transformation are shown in Figures 5(c) and 5(d).
(II) Maximum 3 d flood volume series. The normal test is performed on the maximum 3 d flood volume after correcting the mutation, and the test results are shown in Figure 4(c) and Table 2. The normal test results of the maximum 3 d flood volume series after normal transformation are shown in Figures 5(e) and 5(f).
(III) Maximum 5 d flood volume series. The normal test is performed on the maximum 5 d flood volume after correcting the mutation, and the test results are shown in Figure 4(d) and Table 2. The normal test results of the maximum 5 d flood volume series after normal transformation are shown in Figures 5(g) and 5(h).

3.2.2. Normal Transformation

(1) Flood Peak Discharge Series. After the Box–Cox normal transformation of the corrected flood peak discharge series, the new series passed the normal test. At the 95% confidence level, the P value of the W test is 0.06. The mean value of the new series is 13.78, and the standard deviation is 1.998.

By comparing SB, SL, and SU, the three types of Johnson normal transformation, for the corrected flood peak discharge series, we found that the SU transformation is the best. After the SU-type Johnson normal transformation of the corrected flood peak discharge series, the new series passed the normal test. At the 95% confidence level, the P value of the W test is 0.87. The mean value of the new series is 0.06259, and the standard deviation is 0.9161. The parameters of the SU-type Johnson normal transformation for the corrected flood peak series are as follows: γ = 0.831159, η = 1.12611, ε = 221.007, λ = 92.8704. So, the optimal equation of SU-type Johnson normal transformation is

The best inverse normal transformation is

(2) Flood Volume Series.
(I) Maximum 1 d flood volume series. For the maximum 1 d flood volume series, the best transformation parameter for the Box–Cox normal transformation is λ = 0, and the best normal transformation equations for the Johnson normal transformation are as follows.
(II) Maximum 3 d flood volume series. For the maximum 3 d flood volume series, the best transformation parameter for the Box–Cox normal transformation is λ = 0, and the best normal transformation equations for the Johnson normal transformation are as follows.
(III) Maximum 5 d flood volume series. For the maximum 5 d flood volume series, the best transformation parameter for the Box–Cox normal transformation is λ = 0, and the best normal transformation equations for the Johnson normal transformation are as follows.

3.3. Flood Frequency
3.3.1. Flood Peak Discharge Frequency Curve

Using the inverse transformation models of the Box–Cox and Johnson normal transformations given above, the design value (normal quantile) of the normal distribution at each specified frequency is used to deduce the design flood peak discharge at the corresponding frequency of the original distribution, and the flood peak discharge frequency curve is drawn. At the same time, based on the P-III distribution, the design flood peak discharge at each frequency is calculated by the Optimization Curve-Fitting Method [40, 41], and the P-III flood peak discharge frequency curve is drawn. The design flood peak discharges at the specified frequencies obtained by the two normal transformations are shown in Tables 3 and 4. The three flood peak discharge frequency curves are shown in Figure 6(a).
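The step from a normal quantile to a design value can be sketched for the Box–Cox case. The sketch assumes that the specified frequency is an exceedance frequency, so the quantile is taken at 1 − P, and that the transformed series is described by its mean and standard deviation; the function name and the numerical values are hypothetical and do not reproduce Tables 3 and 4.

```python
import math
from statistics import NormalDist

def design_value_box_cox(p_exceed, mu, sigma, lam):
    """Design value at exceedance frequency p_exceed, given the mean (mu) and
    standard deviation (sigma) of the Box-Cox transformed series."""
    y = NormalDist(mu, sigma).inv_cdf(1.0 - p_exceed)  # normal quantile
    if lam == 0:
        return math.exp(y)
    # For lam != 0 the inverse is only defined when lam * y + 1 > 0.
    return (lam * y + 1.0) ** (1.0 / lam)

# Hypothetical log-transformed series (lam = 0) with mean 5.8 and std 0.3.
for p in (0.01, 0.1, 0.5):
    print(p, round(design_value_box_cox(p, 5.8, 0.3, 0), 1))
```

The Johnson case follows the same pattern, with the Johnson inverse transformation applied to the standard normal quantile instead.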

It can be seen from Figure 6(a) that the flood peak discharge frequency curve deduced by Johnson normal transformation fits the measured value best. The fitting goodness of Box–Cox transformation and P-III distribution cannot be seen directly, so further quantitative calculation is needed.

3.3.2. Flood Volume Frequency Curve

Based on the Box–Cox normal transformation, the Johnson normal transformation, and the P-III distribution, the frequency curves of the maximum 1 d, 3 d, and 5 d flood volumes are deduced. The results are shown in Figures 6(b)–6(d).

3.4. Goodness-of-Fit Calculation

Using two evaluation indexes, the Mean Square Error (MSE) and the Residual Sum of Squares (RSS), the goodness of fit between the measured values and the corresponding design values of the different methods is calculated for high flood discharge (10%, 30%, and 50%) and low flood discharge (10% and 30%) [3, 42, 43]. The results are shown in Table 5.
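The two indexes can be written down directly. The sketch assumes the usual definitions, with RSS as the sum of squared differences between measured and design values and MSE as RSS divided by the number of points; the function names and the sample values are hypothetical.

```python
def rss(measured, design):
    """Residual sum of squares between measured and design values."""
    return sum((m - d) ** 2 for m, d in zip(measured, design))

def mse(measured, design):
    """Mean square error: RSS divided by the number of compared points."""
    return rss(measured, design) / len(measured)

# Hypothetical measured and design discharges (m3/s) at three frequencies.
measured = [758.0, 520.0, 340.0]
design = [740.0, 530.0, 335.0]
print(rss(measured, design), mse(measured, design))
```

Under these definitions, when every method is compared at the same number of points, MSE is RSS scaled by a constant, so the two indexes rank the methods in the same order.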

According to the RSS results for the flood peak frequency, the order of goodness of fit between the measured values and the design values is Johnson transformation > Box–Cox transformation > P-III distribution. The MSE results give the same order as the RSS results.

According to the RSS results for the flood volume frequency, for the maximum 1 d flood volume series, the order of goodness of fit between the measured values and the design values is Box–Cox transformation > Johnson transformation > P-III distribution at high flood discharge (10%, 30%, and 50%) and Johnson transformation > Box–Cox transformation > P-III distribution at low flood discharge (10% and 30%). The MSE results give the same order as the RSS results.

For the maximum 3 d flood volume series, the order of goodness of fit between the measured values and the design values is Box–Cox transformation > P-III distribution > Johnson transformation at high flood discharge (10%, 30%, and 50%) and Johnson transformation > Box–Cox transformation > P-III distribution at low flood discharge (10% and 30%). The MSE results give the same order as the RSS results.

For the maximum 5 d flood volume series, the order of goodness of fit between the measured values and the design values is Johnson transformation > Box–Cox transformation > P-III distribution at both high flood discharge (10%, 30%, and 50%) and low flood discharge (10% and 30%). The MSE results give the same order as the RSS results.

The goodness-of-fit results are divided into two parts: high flood discharge and low flood discharge. The number of advantages of the three methods in each part is counted. The results are shown in Table 6.

4. Conclusion

In this paper, 60 years of measured flood data from the Manas River in north-western China are used to study the adaptability of flood frequency analysis based on normal transformation to the Manas River, by comparison with the traditional P-III distribution. The conclusions are as follows:
(1) For the flood peak discharge frequency, Johnson transformation gives the best fitting accuracy between the frequency curve and the empirical data among the three methods of Johnson transformation, Box–Cox transformation, and P-III distribution. So, Johnson transformation is better and more adaptive than the traditional P-III distribution in the analysis of the flood peak discharge frequency of the Manas River.
(2) For the flood volume frequency, Johnson transformation has strong adaptability to stable hydrologic series, but Box–Cox transformation has strong adaptability to the unstable high flood discharge part of hydrologic series. The main reason is that Johnson transformation is a multiparameter transformation, whereas Box–Cox transformation is a single-parameter transformation. The multiparameter transformation is more stable than the single-parameter one, and so is its inverse transformation. Therefore, over the frequency curve as a whole, Johnson transformation fits the empirical data more accurately than Box–Cox transformation, but on the high flood discharge part of the frequency curve, the fitting accuracy of Box–Cox transformation is higher than that of Johnson transformation.
(3) The P-III distribution frequency analysis method has its unique advantages, and its calculation results are relatively stable. When the distribution of the original hydrologic series is consistent with the P-III distribution, more information can be obtained from the sample and the calculation results are more accurate, but when the distribution of the original hydrologic series is not consistent with the P-III distribution, larger errors result.
The results of this paper show that the normal transformation method has certain advantages and is reasonable for flood frequency analysis of rivers in the arid inland region of north-western China. Therefore, the precision of flood frequency analysis can be improved by using the P-III distribution and the normal transformation together.

Data Availability

The data of Kenswat Hydrologic Station, which has the longest observation record on the Manas River, were provided by Kenswat Hydrologic Station. The data period selected in this paper is from 1955 to 2014.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was financially supported by the Xinjiang Production and Construction Corps (nos. 2022DB024 and 2021AA003), whose support is gratefully acknowledged.