Abstract

In practice, data sets with extreme values arise in many fields such as engineering, lifetime analysis, business, and economics. Many probability distributions have been derived to increase model flexibility in the presence of such values. The current study derives a new probability model, the New Flexible Family (NFF) of distributions. The NFF is illustrated using the Weibull distribution as the baseline, which gives the New Flexible Weibull distribution (NFW). Various mathematical properties of NFW are discussed, including parameter estimation and entropy measures. Two real data sets with extreme values are analyzed and a simulation study is conducted to demonstrate the usefulness of NFW. Furthermore, NFW is compared with other existing probability distributions; numerically, the new mechanism for producing lifetime distributions fits the data sets with extreme values better than the competing models.

1. Introduction and Problem Statement

Because probability models are applied in many disciplines, researchers remain interested in increasing the precision and validity of predictions and forecasts made through probability functions in the presence of extreme values. For example, Ijaz et al. [1] produced a new family of lifetime distributions and discussed its various statistical properties. The probability function of [1] can model lifetime data more reliably than competing models. Other prominent families of distributions include the Beta-G family proposed by Eugene et al. [2] and Jones [3], the Gamma-G (type-1) family suggested by Zografos and Balakrishnan [4], the Mc-G family proposed by Alexander et al. [5], the Log-Gamma-G type-2 family described by Amini et al. [6], the Gamma-G (type-2) family studied by Ristić and Balakrishnan [7], the Gamma-G (type-3) family discussed by Torabi and Narges Montazeri [8], the Weibull-X family of distributions introduced by Alzaatreh et al. [9], and the exponentiated generalized class of Cordeiro et al. [10]. For a detailed discussion on the development of such new families of distributions, we refer to [11–19].

The existing probability distributions have some limitations when modeling lifetime data. First, many of them are unable to model a nonmonotonic hazard function; for example, the exponential distribution can only model a constant hazard rate function, while the gamma distribution can only model a monotonic failure rate function. Second, even when the existing probability functions can model the data, they may fit the real data poorly. In practice, many data sets follow a shape other than a constant or monotonic failure rate function. For instance, accident rates and the lifetimes of electronic devices follow a nonmonotonic hazard rate pattern [1].

This paper presents another contribution to the existing theory of probability functions that aims to overcome the limitations of the existing distributions and of some recently developed ones.

2. Methodology for the Proposed New Family of Distributions

Many probability distributions are introduced by defining a new generator, giving rise to families of probability distributions; for example, [1] introduced the Gull Power Alpha family of distributions, and for other families, we refer to [6, 9, 15, 20, 21] and [22–29]. In this paper, a new family (NFF) is presented by introducing a new scale parameter “a”. Its CDF is given by equation (1), in which the baseline distribution enters through its CDF.

The corresponding probability density function (PDF) of the NFF is given by equation (2).

3. Special Form of the NFF

This section demonstrates a specific form of the NFF that uses the CDF of the Weibull distribution as the baseline, called the New Flexible Weibull distribution (NFW). The cumulative distribution function (CDF) of the Weibull distribution [30] is given by equation (3), which involves a scale parameter and a shape parameter.

Substituting (3) into (1), the CDF and PDF of NFW are given by equations (4) and (5), respectively.

Figure 1 presents several shapes of the CDF and PDF for distinct sets of parameter values.

3.1. The Survival and Hazard Rate Function

The survival function of NFW is defined by S(x) = 1 − F(x), where F(x) is the CDF given in (4). Using (4), we get

After simplification, the final result is given as follows:

The hazard rate function of NFW is defined as the ratio of the PDF to the survival function, h(x) = f(x)/S(x).

Using (4) and (5), we obtain

Finally, the following result is obtained:

Figure 2 demonstrates the shape of the hazard rate function for different values of the parameters.
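The generic definitions used in this subsection can be sketched in code. The example below is only an illustration: because the closed forms (4) and (5) are not reproduced here, the standard Weibull distribution stands in for NFW, and its parameterization F(x) = 1 − exp(−b·x^c) is an assumption made for the sketch. The function names are hypothetical.

```python
import math

def weibull_cdf(x, b, c):
    """Stand-in baseline CDF, assumed form F(x) = 1 - exp(-b * x**c) for x > 0."""
    return 1.0 - math.exp(-b * x ** c) if x > 0 else 0.0

def weibull_pdf(x, b, c):
    """PDF matching weibull_cdf: f(x) = b * c * x**(c-1) * exp(-b * x**c)."""
    return b * c * x ** (c - 1) * math.exp(-b * x ** c) if x > 0 else 0.0

def survival(cdf, x, *params):
    """Survival function S(x) = 1 - F(x)."""
    return 1.0 - cdf(x, *params)

def hazard(pdf, cdf, x, *params):
    """Hazard rate h(x) = f(x) / S(x)."""
    return pdf(x, *params) / survival(cdf, x, *params)

# Evaluate the stand-in model on a few points (b = 0.5, c = 1.5 are arbitrary).
for x in (0.5, 1.0, 2.0):
    print(x, survival(weibull_cdf, x, 0.5, 1.5),
          hazard(weibull_pdf, weibull_cdf, x, 0.5, 1.5))
```

Replacing `weibull_cdf` and `weibull_pdf` with implementations of (4) and (5) would give the NFW survival and hazard functions without changing `survival` or `hazard`.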

3.2. The Quantile Function and Median

The quantile function of NFW is obtained by solving F(x) = q for x, where q is a realization of a standard uniform random variable.

Substituting (4), we obtain the result

Solving equation (12) for x gives the following result:

For the median, set q = 0.5 in equation (13).
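When a quantile function is not available in closed form, or as a check on a derived expression such as (13), the equation F(x) = q can be solved numerically. The sketch below uses bisection on a generic CDF; the Weibull stand-in and its parameterization F(x) = 1 − exp(−b·x^c) are assumptions for illustration, not the NFW expressions.

```python
import math

def weibull_cdf(x, b, c):
    """Stand-in CDF, assumed form F(x) = 1 - exp(-b * x**c) for x > 0."""
    return 1.0 - math.exp(-b * x ** c) if x > 0 else 0.0

def quantile(cdf, q, lo=0.0, hi=1.0, tol=1e-12, max_iter=200):
    """Solve cdf(x) = q by bisection, assuming cdf is continuous and increasing on x >= 0."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    # Double the upper bracket until it contains the quantile.
    while cdf(hi) < q:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

b, c = 0.5, 1.5  # arbitrary illustrative values
median = quantile(lambda x: weibull_cdf(x, b, c), 0.5)
# For this stand-in the closed form is known and serves as a check.
closed_form = (-math.log(1.0 - 0.5) / b) ** (1.0 / c)
print(median, closed_form)
```

Feeding uniform random numbers into `quantile` in place of a fixed q yields random variates from the chosen distribution, which is the inverse-transform idea behind simulating from a quantile function.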

4. The rth Moments

The rth moment about the origin is defined as

Recalling (5), we get

The simplified form of the above integral is

Applying a suitable change of variable to the integral in (16), we finally get

5. Order Statistics

Let X1, X2, X3, …, Xn be a random sample from NFW with corresponding order statistics; the PDF of the ith order statistic is given by

Using (4) and (5), the PDF of the smallest order statistic of NFW follows by setting i = 1, and that of the largest order statistic by setting i = n.
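For reference, the standard general form of the order-statistic density on which this section relies can be written as follows. Here f and F denote the PDF (5) and CDF (4), and the index notation i:n is an assumption of this sketch rather than the paper's own notation.

```latex
f_{i:n}(x) = \frac{n!}{(i-1)!\,(n-i)!}\, f(x)\,\bigl[F(x)\bigr]^{i-1}\bigl[1-F(x)\bigr]^{n-i},
\qquad i = 1, \dots, n,
\\[6pt]
f_{1:n}(x) = n\, f(x)\,\bigl[1-F(x)\bigr]^{n-1},
\qquad
f_{n:n}(x) = n\, f(x)\,\bigl[F(x)\bigr]^{n-1}.
```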

6. Parameter Estimation

Parameter estimation also plays a vital role in the efficiency of a probability model. Because the maximum likelihood estimation (MLE) method yields estimates with good large-sample properties, we use it to estimate the parameters of NFW. A comprehensive discussion of maximum likelihood estimation is given in [31–33]. The likelihood function of NFW is defined by

By taking the log of the above function, we get

To obtain the estimates of the parameters, the partial derivatives of the log-likelihood with respect to the parameters (a, b, c) are taken and set equal to zero:

Closed-form solutions for the parameters are not available because equations (23) to (25) cannot be solved explicitly. Numerical methods can still be used to obtain the ML estimates.
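A minimal sketch of such a numerical solution is shown below. It does not fit NFW, whose likelihood is not reproduced here; instead it fits the two-parameter Weibull baseline under the assumed density f(x) = b·c·x^(c−1)·exp(−b·x^c). For that stand-in, b can be eliminated analytically and the remaining score equation in c is solved by bisection. The sample is the first ten waiting times of Data Set 2, used purely for illustration.

```python
import math

def weibull_loglik(data, b, c):
    """Log-likelihood for the assumed density f(x) = b*c*x**(c-1)*exp(-b*x**c)."""
    n = len(data)
    return (n * math.log(b) + n * math.log(c)
            + (c - 1) * sum(math.log(x) for x in data)
            - b * sum(x ** c for x in data))

def profile_score(data, c):
    """Derivative in c of the log-likelihood after substituting b = n / sum(x**c)."""
    n = len(data)
    s0 = sum(x ** c for x in data)
    s1 = sum(x ** c * math.log(x) for x in data)
    return n / c + sum(math.log(x) for x in data) - n * s1 / s0

def weibull_mle(data, lo=1e-3, hi=50.0, tol=1e-10):
    """Find the root of the profile score by bisection (the score decreases in c)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if profile_score(data, mid) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    c_hat = 0.5 * (lo + hi)
    b_hat = len(data) / sum(x ** c_hat for x in data)
    return b_hat, c_hat

# First ten waiting times from Data Set 2 (illustration only).
sample = [0.8, 0.8, 1.3, 1.5, 1.8, 1.9, 1.9, 2.1, 2.6, 2.7]
b_hat, c_hat = weibull_mle(sample)
print(b_hat, c_hat, weibull_loglik(sample, b_hat, c_hat))
```

For a three-parameter model such as NFW, the same idea applies, but the score equations generally have to be solved jointly with a multivariate optimizer rather than by one-dimensional bisection.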

6.1. Asymptotic Confidence Bounds

The asymptotic confidence bounds for the parameters of NFW are derived from the asymptotic distribution of the ML estimators. To derive these bounds, we require the second-order partial derivatives of the log-likelihood, obtained by differentiating (23)–(25):
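For reference, the usual way such second derivatives enter the confidence bounds is sketched below under standard regularity conditions. The symbols θ, ℓ, I, and z are assumptions of this sketch: θ = (a, b, c) collects the parameters, ℓ is the log-likelihood, and z is the standard normal quantile.

```latex
I(\hat{\theta}) = -\left[ \frac{\partial^{2} \ell(\theta)}{\partial \theta_{i}\, \partial \theta_{j}} \right]_{\theta = \hat{\theta}},
\qquad
\hat{\theta}_{j} \pm z_{\alpha/2} \sqrt{\bigl[ I^{-1}(\hat{\theta}) \bigr]_{jj}},
\qquad j = 1, 2, 3 .
```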

7. Renyi Entropy

The Renyi entropy of NFW is defined by

Using (4), we get

A simplified form is

Applying a suitable change of variable in (29), we obtain

Finally, the result is
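For reference, the standard definition of the Rényi entropy from which a derivation of this kind starts can be written as below. The symbol ρ for the order of the entropy is an assumption of this sketch, and f denotes the PDF of the distribution.

```latex
I_{R}(\rho) = \frac{1}{1-\rho} \log \left( \int_{0}^{\infty} f(x)^{\rho} \, dx \right),
\qquad \rho > 0, \; \rho \neq 1 .
```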

8. Mode

The mode of NFW is obtained by solving the following equation:

Using equation (4), we have

After simplifying the above expression, the result may be written as

The above equation is implicit and can be solved numerically under suitable restrictions on the parameter values.

9. Skewness and Kurtosis

The general functions for calculating the skewness (S) and kurtosis (K) of the NFW are given as follows:

Table 1 depicts the skewness and kurtosis for different values of the parameters.
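The moment-based skewness and kurtosis can be computed numerically once the first four raw moments are available. The sketch below assumes the usual definitions in terms of central moments and approximates the raw moments with composite Simpson's rule. Since the NFW density (5) is not reproduced here, an exponential density with rate 1 stands in, for which the known values are skewness 2 and kurtosis 9. All function names are hypothetical.

```python
import math

def raw_moment(pdf, r, upper, steps=20000):
    """Approximate the integral of x**r * pdf(x) over [0, upper] by composite Simpson's rule.

    `steps` must be even, and `upper` should be large enough that the tail is negligible.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * x ** r * pdf(x)
    return total * h / 3.0

def skewness_kurtosis(m1, m2, m3, m4):
    """Convert raw moments m1..m4 into skewness and kurtosis via central moments."""
    var = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2

# Stand-in density: exponential with rate 1 (not the NFW density).
pdf = lambda x: math.exp(-x)
m1, m2, m3, m4 = (raw_moment(pdf, r, 60.0) for r in (1, 2, 3, 4))
S, K = skewness_kurtosis(m1, m2, m3, m4)
print(S, K)
```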

10. Special Cases

The following are submodels of NFW.

10.1. Case 1: Shape Parameter Equal to 1

By setting the Weibull shape parameter equal to 1 in (4) and (5), we obtain the new probability distribution with the baseline CDF of the exponential distribution, called the New Flexible Exponential distribution (NFE). The mathematical forms of the CDF and PDF are, respectively, given by

10.2. Case 2: Shape Parameter Equal to 2

By setting the Weibull shape parameter equal to 2 in (4) and (5), we obtain the new probability distribution with the baseline CDF of the Rayleigh distribution, called the New Flexible Rayleigh distribution (NFR). The mathematical forms of the CDF and PDF are, respectively, given by

11. Applications

To check the performance of the proposed model, two real data sets with extreme values are considered, one with a nonmonotonic and the other with a monotonic hazard rate shape. Several goodness of fit measures are used to assess the proposed model: the Akaike information criterion (AIC), the Hannan and Quinn information criterion (HQIC), the Anderson–Darling statistic (A), the Cramér–von Mises statistic, the consistent Akaike information criterion (CAIC), and the Bayesian information criterion (BIC). Mathematically, the information criteria are defined in terms of the maximized likelihood function evaluated at the MLE for the observed random sample, the sample size, and the number of parameters involved in the model.

Generally, the probability model with the smallest values of these criteria is judged to be the best fitted model.
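As a sketch of how the information criteria are computed, the function below uses forms commonly adopted in this literature: AIC = −2ℓ + 2k, BIC = −2ℓ + k·ln(n), CAIC = −2ℓ + 2kn/(n − k − 1), and HQIC = −2ℓ + 2k·ln(ln(n)), where ℓ is the maximized log-likelihood, k the number of parameters, and n the sample size. These forms are an assumption, since the paper's own equations are not reproduced here, and other definitions of CAIC exist. The log-likelihood value in the example is hypothetical and is not taken from Table 2 or Table 4.

```python
import math

def information_criteria(loglik, k, n):
    """Return AIC, BIC, CAIC and HQIC for a fitted model (assumed common forms).

    loglik: maximized log-likelihood; k: number of parameters; n: sample size.
    """
    aic = -2.0 * loglik + 2.0 * k
    bic = -2.0 * loglik + k * math.log(n)
    caic = -2.0 * loglik + 2.0 * k * n / (n - k - 1)
    hqic = -2.0 * loglik + 2.0 * k * math.log(math.log(n))
    return {"AIC": aic, "BIC": bic, "CAIC": caic, "HQIC": hqic}

# Hypothetical three-parameter fit to a sample of size 128.
criteria = information_criteria(loglik=-100.0, k=3, n=128)
for name, value in criteria.items():
    print(name, round(value, 4))
```

Because every criterion adds a penalty to −2ℓ, a model with more parameters is preferred only if its log-likelihood improves enough to offset the penalty.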

11.1. Data Set 1: Remission Time of Bladder Cancer Patients

The following data set represents the remission time of 128 bladder cancer patients taken from Aldeni and Famoye [20]. The values of the data set are given as follows:

0.080, 0.200, 0.400, 0.500, 0.510, 0.810, 0.900, 1.050, 1.190, 1.260, 1.350, 1.400, 1.460, 1.760, 2.020, 2.020, 2.070, 2.090, 2.230, 2.260, 2.460, 2.540, 2.620, 2.640, 2.690, 2.690, 2.750, 2.830, 2.870, 3.020, 3.250, 3.310, 3.360, 3.360, 3.480, 3.520, 3.570, 3.640, 3.700, 3.820, 3.880, 4.180, 4.230, 4.260, 4.330, 4.340, 4.400, 4.500, 4.510, 4.870, 4.980, 5.060, 5.090, 5.170, 5.320, 5.320, 5.340, 5.410, 5.410, 5.490, 5.620, 5.710, 5.850, 6.250, 6.540, 6.760, 6.930, 6.940, 6.970, 7.090, 7.260, 7.280, 7.320, 7.390, 7.590, 7.620, 7.630, 7.660, 7.870, 7.930, 8.260, 8.370, 8.530, 8.650, 8.660, 9.020, 9.220, 9.470, 9.740, 10.06, 10.34, 10.66, 10.75, 11.25, 11.64, 11.79, 11.98, 12.02, 12.03, 12.07, 12.63, 13.11, 13.29, 13.80, 14.24, 14.76, 14.77, 14.83, 15.96, 16.62, 17.12, 17.14, 17.36, 18.10, 19.13, 20.28, 21.73, 22.69, 23.63, 25.74, 25.82, 26.31, 32.15, 34.26, 36.66, 43.01, 46.12, and 79.05.

Figure 3 shows the empirical and theoretical plots of the PDF and CDF of NFW. Figure 4 shows the P-P and Q-Q plots, and Figure 5 shows the TTT plot, which indicates that the data follow a nonmonotonic hazard rate shape. Table 2 reports the MLEs, their standard errors, and the log-likelihood values. Table 3 reports the goodness of fit measures; NFW has the smallest values, which indicates that the proposed probability model performs better than the existing Rayleigh (R), exponential (E), Weibull (W), Weibull exponential (W.E), and Algoharai inverse flexible Weibull (AIFW) distributions.
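The points of a TTT plot can be computed directly from the data. The sketch below assumes the standard scaled total time on test transform, which for the ordered sample x(1) ≤ … ≤ x(n) plots i/n against [x(1) + … + x(i) + (n − i)·x(i)] divided by the sample total. The function name is hypothetical.

```python
def scaled_ttt(data):
    """Return the (i/n, scaled TTT value) points for a sample of positive lifetimes."""
    xs = sorted(data)
    n = len(xs)
    total = sum(xs)
    points = []
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x  # sum of the i smallest observations
        points.append((i / n, (running + (n - i) * x) / total))
    return points

# Tiny illustrative sample (not one of the data sets analyzed in the paper).
for u, t in scaled_ttt([3.0, 1.0, 2.0]):
    print(u, t)
```

A curve that lies above the diagonal and is concave suggests an increasing hazard rate, a convex curve below the diagonal suggests a decreasing one, and a curve that changes from one shape to the other points to a nonmonotonic hazard rate.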

11.2. Data Set 2: Bank Customer Data

The data set of Aldeni et al. [20] on the waiting times of 100 bank customers is considered, with the following values:

0.8, 0.8, 1.3, 1.5, 1.8, 1.9, 1.9, 2.1, 2.6, 2.7, 2.9, 3.1, 3.2, 3.3, 3.5, 3.6, 4, 4.1, 4.2, 4.2, 4.3, 4.3, 4.4, 4.4, 4.6, 4.7, 4.7, 4.8, 4.9, 4.9, 5.0, 5.3, 5.5, 5.7, 5.7, 6.1, 6.2, 6.2, 6.2, 6.3, 6.7, 6.9, 7.1, 7.1, 7.1, 7.1, 7.4, 7.6, 7.7, 8, 8.2, 8.6, 8.6, 8.6, 8.8, 8.8, 8.9, 8.9, 9.5, 9.6, 9.7, 9.8, 10.7, 10.9, 11.0, 11.0, 11.1, 11.2, 11.2, 11.5, 11.9, 12.4, 12.5, 12.9, 13.0, 13.1, 13.3, 13.6, 13.7, 13.9, 14.1, 15.4, 15.4, 17.3, 17.3, 18.1, 18.2, 18.4, 18.9, 19.0, 19.9, 20.6, 21.3, 21.4, 21.9, 23, 27, 31.6, 33.1, and 38.5.

Figure 6 shows the empirical and theoretical CDF and PDF for the bank customer data. In the graph, the red line fits the data more closely than the lines for the other distributions. Figure 7 shows the P-P and Q-Q plots, and Figure 8 presents the TTT plot, which indicates that the data follow a monotonic hazard rate shape. The MLEs are given in Table 4, together with the log-likelihood and standard errors. The values in Table 5 show that NFW performs better than the R, E, W, W.E, and AIFW distributions.

12. Simulations

To check the performance of the proposed probability model, expression (13) was applied to the NFW distribution to generate artificial data. The simulations are repeated 1000 times for various sample sizes n and different sets of parameter values. Table 6 provides the maximum likelihood estimates and their standard errors. The table shows that as the sample size increases, both the standard errors and the ML estimates decrease. The general formulas for the bias and mean square error are given by
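The simulation procedure can be sketched as follows, assuming the usual definitions: the bias is the average of (estimate − true value) over the replications, and the mean square error is the average of (estimate − true value)². The NFW quantile function (13) is not reproduced here, so an exponential distribution stands in; its quantile function −ln(1 − u)/rate is used for inverse-transform sampling and its rate has the closed-form MLE n/Σx. The function name and the seed are hypothetical choices.

```python
import math
import random

def simulate_bias_mse(true_rate, n, replications=1000, seed=1):
    """Monte Carlo bias and MSE of the MLE of an exponential rate (stand-in model)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        # Inverse-transform sampling: x = Q(u) with u drawn uniformly from [0, 1).
        sample = [-math.log(1.0 - rng.random()) / true_rate for _ in range(n)]
        estimates.append(n / sum(sample))  # closed-form MLE of the rate
    bias = sum(e - true_rate for e in estimates) / replications
    mse = sum((e - true_rate) ** 2 for e in estimates) / replications
    return bias, mse

for n in (20, 50, 200):
    bias, mse = simulate_bias_mse(true_rate=2.0, n=n)
    print(n, bias, mse)
```

For NFW, the line that draws the sample would call the quantile function (13) instead, and the estimate would come from a numerical maximization of the log-likelihood.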

13. Conclusion and Discussion

This paper introduces a new family of probability distributions called the New Flexible Family of distributions (NFF). The proposed family is applied to the Weibull distribution, giving the New Flexible Weibull (NFW) distribution. Various statistical properties, including the entropy measures, order statistics, and moments, are determined. The parameters are estimated by the method of maximum likelihood. To achieve the main objectives of the paper, the proposed probability model is applied to two real data sets with extreme values, one following a monotonic hazard rate and the other a nonmonotonic hazard rate. Moreover, the behavior of the parameter estimates is investigated using a simulation study, which shows that the parameters lead to flexible results. Hence, NFW produced a better fit to these data sets with extreme values than the existing distributions considered. Future research could extend the flexibility of the proposed model for extreme value data sets via transmutation and other techniques.

Data Availability

The data sets are taken from the literature.

Conflicts of Interest

The authors have no conflicts of interest.