Abstract
In this study, we propose a four-parameter probability distribution called the harmonic mixture Fréchet distribution. Some useful expansions and statistical properties, such as moments, incomplete moments, quantile function, entropy, mean deviation, median deviation, mean residual life, moment-generating function, and stress-strength reliability, are presented. Estimators for the parameters of the harmonic mixture Fréchet distribution are derived using maximum-likelihood estimation, ordinary least-squares estimation, weighted least-squares estimation, Cramér–von Mises estimation, and Anderson–Darling estimation. A simulation study was conducted to assess the biases and mean square errors of the estimators. The new distribution was applied to three lifetime datasets and compared with the classical Fréchet distribution and eight (8) other extensions of the Fréchet distribution.
1. Introduction
Mixture distributions have become a very flexible and increasingly common class of distributions over the last two decades. They have been applied to lifetime data in many reliability and survival analyses. Whether a sample is homogeneous or heterogeneous, the statistical analysis of lifetime datasets is an important task in the fields of reliability engineering and survival analysis. One of the four often used extreme value distributions (EVDs) is the Fréchet distribution. The distribution, which is also known as the EVD type II, is the distribution of the reciprocal of a Weibull random variable (the inverse Weibull distribution). It is used to model extreme events such as annual rainfall, earthquakes, and floods. The probability density function (PDF) of the Fréchet distribution is either unimodal or decreasing, depending on the shape parameter, while its failure rate function is always unimodal [1]. Several extensions of the Fréchet distribution have been proposed in the literature, aimed at making it more flexible in modelling both monotonic and non-monotonic datasets. Some extensions of the Fréchet distribution include the Burr X Fréchet (BRXFR) [2], the odd Lomax Fréchet (OLXF) [3], the Poisson–Fréchet (POF) [4], the new exponential-X Fréchet (NEXF) [5], the Weibull Fréchet (WFR) [6], the extended Poisson Fréchet distribution (P-BX-Fr) [7], the Burr XII Fréchet (BrXIIFr) [8], the modified Fréchet–Rayleigh distribution (MFRD) [9], the truncated Weibull Fréchet distribution (TWFr) [10, 11], the Marshall–Olkin Fréchet distribution (MOF) [12], the gamma extended Fréchet distribution (GEF) [13], the Lehmann type II Fréchet Poisson distribution (LFP) [14], the exponential transmuted Fréchet distribution (ETF) [15], the modified Fréchet (MF) [16], the generalised truncated Fréchet generated family of distributions (TGFr-G) [17], and the double truncated transmuted Fréchet distribution (DTTF) [18].
Not long ago, [19] proposed a new family of mixture distribution using the weighted harmonic means of two survival functions and called it the harmonic mixture-G (HMG) family. According to [19], the PDF and cumulative distribution function (CDF), respectively, are given as follows:
Then,where is the survival function of the baseline distribution.
Subsequently, the focus of this study was to develop an extension of the Fréchet distribution using the HMG family of distributions to widen its flexibility in analysing different types of real datasets. The PDF and CDF of the Fréchet distribution are, respectively, given as follows:andwhere is a shape parameter and is a scale parameter.
The corresponding survival function of the Fréchet distribution is given as follows:
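As a point of reference, the baseline Fréchet functions can be sketched in Python. This is only an illustration under an assumed parameterisation, F(x) = exp{-(beta/x)^alpha} for x > 0, with `alpha` as the shape parameter and `beta` as the scale parameter; the symbol names and the function names below are assumptions of the sketch, not notation taken from the article.

```python
import math

# Assumed parameterisation: F(x) = exp(-(beta / x) ** alpha), x > 0,
# with alpha > 0 (shape) and beta > 0 (scale).

def _log_t(x, alpha, beta):
    """Return log((beta / x) ** alpha), computed on the log scale for stability."""
    return alpha * (math.log(beta) - math.log(x))

def frechet_cdf(x, alpha, beta):
    """Cumulative distribution function of the baseline Fréchet distribution."""
    if x <= 0:
        return 0.0
    z = _log_t(x, alpha, beta)
    if z > 700.0:  # exp(z) would overflow; the CDF is numerically zero here
        return 0.0
    return math.exp(-math.exp(z))

def frechet_pdf(x, alpha, beta):
    """Density: (alpha / x) * t * exp(-t), where t = (beta / x) ** alpha."""
    if x <= 0:
        return 0.0
    z = _log_t(x, alpha, beta)
    if z > 700.0:
        return 0.0
    t = math.exp(z)
    return (alpha / x) * (t * math.exp(-t))

def frechet_sf(x, alpha, beta):
    """Survival function 1 - F(x), using expm1 to keep precision for large x."""
    if x <= 0:
        return 1.0
    z = _log_t(x, alpha, beta)
    if z > 700.0:
        return 1.0
    return -math.expm1(-math.exp(z))
```

Under this parameterisation the CDF evaluated at x = beta equals exp(-1), which gives a quick sanity check of any implementation.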
Our motivation for this study includes the following:
(i) Developing a heavy-tailed distribution that models lifetime datasets
(ii) Developing a distribution whose probability densities exhibit a left- or right-skewed shape, a reversed J shape, or a J shape
(iii) Providing a distribution that consistently offers better fits to lifetime datasets than those of other generalised distributions with the same underlying model (Fréchet distribution)
(iv) Proposing a distribution that can model lifetime datasets with monotonic or non-monotonic failure rates
The remaining parts of the article are organised as follows. The PDF, CDF, and failure rate function of the harmonic mixture Fréchet (HMF) distribution, alongside their corresponding graphical representations, are presented in Section 2. In Section 3, we present some statistical properties of the distribution. The maximum-likelihood estimation, the ordinary least-squares estimation, the weighted least-squares estimation, the Cramér–von Mises estimation, and the Anderson–Darling estimation of the HMF parameters are developed in Section 4. Section 5 presents the simulation results used to assess the performance of the estimators of the HMF parameters. In Section 6, three applications to real datasets illustrate the usefulness of the proposed model. Lastly, the conclusions of the study are reported in Section 7.
2. Harmonic Mixture Fréchet Distribution
The PDF of the HMF distribution is obtained by substituting equations (3) and (5) into equation (1). The PDF of the HMF distribution is given as follows:where and are shape parameters, is a scale parameter, and and .
Figure 1 shows density plots of the HMF distribution. As the parameter values vary, the density exhibits various shapes: the PDF of the HMF distribution can be left-skewed, right-skewed with different degrees of kurtosis, J-shaped, or reversed-J-shaped.
The corresponding CDF of the HMF distribution is obtained by substituting equation (5) into (2). The CDF of the HMF distribution is given as follows:
The failure rate function of the HMF distribution is given as follows:
Figure 2 shows plots of the failure rate function of the HMF distribution. For various values of the parameters, the plots exhibit various desirable shapes, such as decreasing, increasing, and unimodal.
Lemma 1. The PDF of the HMF distribution, expressed in mixture form, is given as follows:where ,,,, and.
Proof. For any real non-integer , the series expansion for , for , is as follows:Since , using the series expansion equation in equation (10) twice, we obtainSubstituting equation (11) into equation (6) and applying equation (10) severally yieldHence,The proof is complete.
3. Statistical Properties
The moment-generating function (MGF), stress-strength reliability, entropy, mean deviation, median deviation, mean residual life, and other statistical properties are discussed in this section.
3.1. Quantile Function
The quantile function of a distribution is the inverse of the CDF of the distribution. It also gives us a different way to describe the characteristics and shapes of a distribution.
The quantile function of the HMF distribution can be expressed as follows:where and is the quantile function.
The quantile function of the HMF distribution does not have a closed form. Its values must therefore be obtained numerically.
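One simple way to evaluate a quantile function without a closed form is to invert the CDF by bisection. The sketch below is a generic illustration, not the authors' implementation: `quantile_by_bisection` is a hypothetical helper that accepts any continuous, increasing CDF on the positive half-line, and the baseline Fréchet CDF (assumed parameterisation exp{-(beta/x)^alpha}) is used only as a stand-in because its exact quantile is known.

```python
import math

def quantile_by_bisection(cdf, p, lo=1e-12, hi=1.0, tol=1e-10, max_iter=200):
    """Solve cdf(x) = p for x > 0 by bisection (cdf must be nondecreasing)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Expand the upper end of the bracket until it contains the quantile.
    while cdf(hi) < p:
        hi *= 2.0
        if hi > 1e300:
            raise RuntimeError("could not bracket the quantile")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, hi):
            break
    return 0.5 * (lo + hi)

# Stand-in CDF: baseline Fréchet with assumed shape 3 and scale 2.
ALPHA, BETA = 3.0, 2.0

def stand_in_cdf(x):
    return math.exp(-((BETA / x) ** ALPHA))

def stand_in_exact_quantile(p):
    # Closed form available for the stand-in only: beta * (-log p) ** (-1 / alpha)
    return BETA * (-math.log(p)) ** (-1.0 / ALPHA)

median_numeric = quantile_by_bisection(stand_in_cdf, 0.5)
median_exact = stand_in_exact_quantile(0.5)
```

For the HMF distribution, the same routine would be called with the HMF CDF in place of `stand_in_cdf`. Bisection is slower than Newton-type methods but needs no derivative and cannot leave the bracket.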
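Galton’s measure of skewness (GS) [20] and the Moors measure of kurtosis (MK) [21] are defined in terms of the quantile function $Q(\cdot)$, respectively, as follows:

$$\mathrm{GS} = \frac{Q(6/8) - 2Q(4/8) + Q(2/8)}{Q(6/8) - Q(2/8)},$$

$$\mathrm{MK} = \frac{Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)}{Q(6/8) - Q(2/8)}.$$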
Table 1 shows the results of the quantile function, Galton’s measure of skewness (GS), and the Moors measure of kurtosis (MK) for various parameter values. The HMF distribution could be either moderately or strongly skewed. For some parameter values, the HMF distribution is positively skewed, while for some other parameter values, the distribution is negatively skewed.
For some parameter values, the HMF distribution is platykurtic, whereas, for others, it is leptokurtic.
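The quantile-based measures discussed above are straightforward to compute once a quantile function is available. The following sketch uses the standard octile-based definitions of Galton's skewness and the Moors kurtosis; the function names are illustrative, and the closed-form Fréchet quantile (assumed parameterisation with shape `alpha` and scale `beta`) serves only as a stand-in for the HMF quantile function.

```python
import math

def galton_skewness(q):
    """Galton's skewness: (Q(3/4) - 2 Q(1/2) + Q(1/4)) / (Q(3/4) - Q(1/4))."""
    q1, q2, q3 = q(0.25), q(0.5), q(0.75)
    return (q3 - 2.0 * q2 + q1) / (q3 - q1)

def moors_kurtosis(q):
    """Moors kurtosis based on octiles."""
    numerator = q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)
    return numerator / (q(6 / 8) - q(2 / 8))

def frechet_quantile(p, alpha=3.0, beta=2.0):
    """Stand-in quantile function: beta * (-log p) ** (-1 / alpha)."""
    return beta * (-math.log(p)) ** (-1.0 / alpha)

gs = galton_skewness(frechet_quantile)
mk = moors_kurtosis(frechet_quantile)
```

Both measures depend only on quantiles, so they remain defined for heavy-tailed cases in which the moment-based skewness and kurtosis do not exist. As a reference point, a uniform distribution gives GS = 0 and MK = 1.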
3.2. Moments
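Moments are essential in statistical analysis, especially in deriving some important measures of statistical distributions [22]. Measures such as the mean $\mu$, variance $\sigma^2$, coefficient of variation (CV), skewness (CS), and kurtosis (CK) can be obtained using moments. In terms of the noncentral moments $\mu'_r = E(X^r)$, they are, respectively, defined as follows:

$$\mu = \mu'_1, \qquad \sigma^2 = \mu'_2 - \mu^2, \qquad \mathrm{CV} = \frac{\sigma}{\mu},$$

$$\mathrm{CS} = \frac{\mu'_3 - 3\mu\mu'_2 + 2\mu^3}{\sigma^3}, \qquad \mathrm{CK} = \frac{\mu'_4 - 4\mu\mu'_3 + 6\mu^2\mu'_2 - 3\mu^4}{\sigma^4}.$$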
Proposition 1. The noncentral moment of the HMF distribution is given as follows:
Proof. By definition,Substituting equation (9) into equation (20), we obtainLet , which implies and . When and when , we obtainUsing the identitywe obtainThe proof is complete.
The mean, variance, CV, CS, and CK of the HMF distribution, computed from the noncentral moments for some selected parameter values, are shown in Table 2. The HMF distribution can be moderately or highly skewed. For some parameter values, the HMF distribution is positively skewed, while for other parameter values, it is negatively skewed.
The HMF distribution is platykurtic for some parameter values and leptokurtic for others.
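The conversion from the first four noncentral moments to the summary measures reported in Table 2 can be sketched as follows. The formulas are the standard moment identities; the helper names are illustrative, and the Fréchet raw moment beta^r Γ(1 - r/alpha) (assumed parameterisation, valid only for r < alpha) is included purely as an example input, not as the HMF moment formula.

```python
import math

def moment_summaries(m1, m2, m3, m4):
    """Mean, variance, CV, skewness and kurtosis from noncentral moments m1..m4."""
    variance = m2 - m1 ** 2
    sd = math.sqrt(variance)
    cv = sd / m1
    # Third and fourth central moments expressed through the raw moments.
    cs = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / sd ** 3
    ck = (m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4) / sd ** 4
    return {"mean": m1, "variance": variance, "CV": cv, "CS": cs, "CK": ck}

def frechet_raw_moment(r, alpha, beta):
    """r-th raw moment of the baseline Fréchet distribution (requires r < alpha)."""
    if r >= alpha:
        raise ValueError("the r-th moment exists only when r < alpha")
    return beta ** r * math.gamma(1.0 - r / alpha)

# Example: baseline Fréchet with assumed shape 6 and scale 2 (four moments exist).
raw = [frechet_raw_moment(r, 6.0, 2.0) for r in (1, 2, 3, 4)]
summary = moment_summaries(*raw)
```

Note that CK here is the ordinary (non-excess) kurtosis, so the normal distribution corresponds to CK = 3.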
3.3. Incomplete Moments
The incomplete moments are essential in obtaining the mean deviation and the median deviation.
Proposition 2. The incomplete moment of the HMF distribution is given as follows:where is the upper incomplete gamma function and .
Proof. By definition, the incomplete moment is obtained usingSubstituting equation (9) into equation (26), we haveLet , which implies and . When and when , we obtainUsing the identitywe haveThe proof is complete.
3.4. Mean Deviation and Median Deviation
The total variation that exists in a distribution can be measured using the mean and median deviation.
Proposition 3. The mean deviation of the HMF distribution is given as follows:
Proof. The definition of mean deviation is as follows: can be obtained using the first incomplete moment.
The proof is complete.
Proposition 4. The median deviation of the HMF distribution is given as follows:
Proof. By the definition of median deviation,The proof is complete.
3.5. Mean Residual Life
The mean residual life function at time $t$ measures the expected additional lifetime of a unit, given that it has survived until time $t$. This function plays a major role in survival and reliability analysis [23].
Proposition 5. The mean residual life function of the HMF distribution is given as follows:
Proof. For a nonnegative random variable X, the mean residual life is given as follows:Hence,Substituting equation (5) and , which can be obtained from the first incomplete moment into equation (37), we obtain the mean life residual function of the HMF distribution.
The proof is complete.
3.6. Moment-Generating Function
The moment-generating function, if it exists, is used to derive the moments of a distribution.
Proposition 6. The moment-generating function of the HMF distribution is given as follows:
Proof. Using the identitywe can define the moment-generating function as follows:Substituting equation (25) into equation (40), we obtain the moment-generating function of the HMF distribution.
The proof is complete.
3.7. Entropy
The entropy of a random variable is used to measure the variation or uncertainty. The lower the entropy, the less the uncertainty and vice versa.
Proposition 7. The Rényi entropy of the HMF distribution is given as follows:where and
Proof. By definition,We obtain the Rényi entropy of the HMF distribution by rearranging and increasing the power of the PDF of the HMF to and following the same procedure used to obtain the moments.
3.8. Stress-Strength Reliability
The extent to which the strength of a system can withstand the stress it is subjected to is measured using the stress-strength reliability [24]. Therefore, the stress-strength reliability can be defined as the probability that a system’s strength is greater than the stress it is subjected to ; thus, . If , the system or component fails.
Proposition 8. If and follow the HMF distribution, then the stress-strength reliability is given as follows:where , , and
Proof. By definition,We then obtain the product of the PDF and survival function given as follows:where .
Substituting equation (47) into equation (46) and letting imply and . When and when , we haveUsing the identity , we obtainThe proof is complete.
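The probabilistic meaning of stress-strength reliability, the probability that strength exceeds stress, can be checked by simulation. The sketch below is not the HMF result of Proposition 8: it uses two independent baseline Fréchet variables with a common shape `alpha` and scales `beta_strength` and `beta_stress` (assumed parameterisation exp{-(beta/x)^alpha}), for which the reliability has the simple closed form beta_strength^alpha / (beta_strength^alpha + beta_stress^alpha). All names are illustrative.

```python
import math
import random

def frechet_sample(rng, alpha, beta):
    """Draw one Fréchet variate by inverting the CDF exp(-(beta / x) ** alpha)."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return beta * (-math.log(u)) ** (-1.0 / alpha)

def stress_strength_exact(alpha, beta_strength, beta_stress):
    """P(strength > stress) for two Fréchet variables sharing the shape alpha."""
    a = beta_strength ** alpha
    b = beta_stress ** alpha
    return a / (a + b)

def stress_strength_mc(alpha, beta_strength, beta_stress, n=100_000, seed=1):
    """Monte Carlo estimate of P(strength > stress) from n independent pairs."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n):
        strength = frechet_sample(rng, alpha, beta_strength)
        stress = frechet_sample(rng, alpha, beta_stress)
        if strength > stress:
            wins += 1
    return wins / n
```

A simulation of this kind is a useful cross-check on a series expression for the reliability, since the two should agree up to Monte Carlo error of order 1/sqrt(n).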
4. Estimation of Parameters of HMF Distribution
In this section, we obtain estimators of the parameters of the HMF distribution using five estimation methods: the maximum-likelihood estimation (MLE), the ordinary least-squares method (OLS), the weighted least-squares method (WLS), the Cramér–von Mises estimation (CVM), and the Anderson–Darling estimation (ADE).
4.1. Maximum-Likelihood Estimation
The MLE is used to obtain estimates of the unknown parameters by maximising the likelihood function. The likelihood function of the HMF distribution is given as follows:
We obtain the log-likelihood function by substituting equation (6) into (50) and taking the logarithm of the resulting equation. We have
We obtain the MLE of the parameters by differentiating equation (51) with respect to and equating the resulting functions to zero. The resulting functions are as follows:
By equating these functions to zero and solving them simultaneously using numerical methods, we obtain the maximum-likelihood estimates of the unknown parameters.
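To illustrate what solving score equations numerically involves, the sketch below fits the two-parameter baseline Fréchet distribution (assumed parameterisation exp{-(beta/x)^alpha}), not the four-parameter HMF model. For this baseline, setting the derivative with respect to beta to zero gives beta^alpha = n / Σ x_i^(-alpha), and substituting this back leaves a single decreasing score equation in alpha that bisection can solve. The function names are hypothetical.

```python
import math
import random

def frechet_sample(rng, alpha, beta):
    """Inverse-CDF draw from the baseline Fréchet distribution."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return beta * (-math.log(u)) ** (-1.0 / alpha)

def frechet_mle(data, tol=1e-10):
    """Maximum-likelihood estimates (alpha, beta) via the profile score in alpha."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    mean_log = sum(logs) / n
    log_min = min(logs)
    if max(logs) == log_min:
        raise ValueError("data must contain at least two distinct values")

    def weights(alpha):
        # Proportional to x_i ** (-alpha), rescaled so the largest weight is 1.
        return [math.exp(-alpha * (lx - log_min)) for lx in logs]

    def profile_score(alpha):
        w = weights(alpha)
        weighted_mean_log = sum(wi * lx for wi, lx in zip(w, logs)) / sum(w)
        return 1.0 / alpha - mean_log + weighted_mean_log

    lo, hi = 1e-8, 1.0
    while profile_score(hi) > 0.0:  # the score decreases in alpha; find a sign change
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if profile_score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    alpha_hat = 0.5 * (lo + hi)
    log_beta = log_min + (math.log(n) - math.log(sum(weights(alpha_hat)))) / alpha_hat
    return alpha_hat, math.exp(log_beta)
```

For the HMF distribution no such reduction to one dimension is available in general, so the four score equations would be handled by a multivariate numerical optimiser applied to the log-likelihood.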
4.2. Ordinary Least Squares
The OLS estimates of the unknown parameters of the HMF distribution, where are the order statistics of the observed sample, are obtained by minimising the function
We differentiate equation (53) with respect to the various parameters and equate each result obtained to zero to obtainwhere
The OLS estimates are obtained by solving these functions simultaneously using numerical methods.
4.3. Weighted Least Squares
The WLS estimates of the unknown parameters of the HMF distribution, where are the order statistics of the observed sample, are obtained by minimising the function
We differentiate equation (62) with respect to the various parameters and equate each result obtained to zero to obtain
, , can be obtained through equations (58)–(61).
The WLS estimates are obtained by solving these functions simultaneously by employing numerical methods.
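The two least-squares criteria of Sections 4.2 and 4.3 can be sketched as objective functions of a candidate CDF. The forms below are the ones commonly used for these estimators, with plotting positions i/(n+1) and weights (n+1)^2 (n+2) / (i (n-i+1)); they are stated here as assumptions, and the function names are illustrative. Any CDF can be passed in, so the baseline Fréchet CDF appears only as a stand-in for the HMF CDF.

```python
import math
import random

def ols_objective(cdf, data):
    """Sum over order statistics of (F(x_(i)) - i / (n + 1)) ** 2."""
    xs = sorted(data)
    n = len(xs)
    return sum((cdf(x) - i / (n + 1)) ** 2 for i, x in enumerate(xs, start=1))

def wls_objective(cdf, data):
    """As OLS, but each term is weighted by (n + 1)**2 (n + 2) / (i (n - i + 1))."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        weight = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
        total += weight * (cdf(x) - i / (n + 1)) ** 2
    return total

def make_frechet_cdf(alpha, beta):
    """Stand-in CDF factory (assumed parameterisation exp(-(beta / x) ** alpha))."""
    return lambda x: math.exp(-((beta / x) ** alpha))

# Simulated sample from the stand-in model with shape 3 and scale 2.
_rng = random.Random(0)
sample = [2.0 * (-math.log(1.0 - _rng.random())) ** (-1.0 / 3.0) for _ in range(500)]

ols_at_truth = ols_objective(make_frechet_cdf(3.0, 2.0), sample)
ols_misspecified = ols_objective(make_frechet_cdf(1.5, 2.0), sample)
```

Minimising either objective over the parameter vector, for example with a derivative-free optimiser, yields the corresponding estimates; the weights in WLS are the reciprocals of the variances of the uniform order statistics, which emphasises the tails.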
4.4. Cramér–Von Mises Estimation
The CVM estimates of the unknown parameters of the HMF distribution, where are order statistics of the observed sample, are obtained by minimising the function
We differentiate equation (64) with respect to the various parameters and equate each result obtained to zero to obtain
, , can be obtained through equations (58)–(61).
The CVM estimates are obtained by solving these functions simultaneously by employing numerical methods.
4.5. Anderson–Darling Estimation
The ADE estimates of the unknown parameters of the HMF distribution, where are the order statistics of the observed sample, are obtained by minimising the function
We differentiate equation (66) with respect to the various parameters and equate each result obtained to zero to obtainwhere , , can be derived from the equations (58)–(61).
The ADE estimates are derived by solving these functions simultaneously by employing numerical methods.
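The Cramér–von Mises and Anderson–Darling criteria of Sections 4.4 and 4.5 can be sketched in the same way. The forms below are the standard minimum-distance objectives, 1/(12n) + Σ (F(x_(i)) - (2i-1)/(2n))^2 and -n - (1/n) Σ (2i-1)[log F(x_(i)) + log(1 - F(x_(n+1-i)))], and are given as assumptions; the clipping constant `eps` and all function names are choices made for this illustration.

```python
import math
import random

def cvm_objective(cdf, data):
    """Cramér–von Mises distance between the fitted CDF and the sample."""
    xs = sorted(data)
    n = len(xs)
    total = sum((cdf(x) - (2 * i - 1) / (2 * n)) ** 2
                for i, x in enumerate(xs, start=1))
    return 1.0 / (12 * n) + total

def ad_objective(cdf, data, eps=1e-12):
    """Anderson–Darling distance; CDF values are clipped away from 0 and 1."""
    xs = sorted(data)
    n = len(xs)
    f = [min(max(cdf(x), eps), 1.0 - eps) for x in xs]
    total = 0.0
    for i in range(1, n + 1):
        # f[i - 1] is F(x_(i)); f[n - i] is F(x_(n + 1 - i)).
        total += (2 * i - 1) * (math.log(f[i - 1]) + math.log(1.0 - f[n - i]))
    return -n - total / n

def make_frechet_cdf(alpha, beta):
    """Stand-in CDF factory (assumed parameterisation exp(-(beta / x) ** alpha))."""
    return lambda x: math.exp(-((beta / x) ** alpha))

# Simulated sample from the stand-in model with shape 3 and scale 2.
_rng = random.Random(0)
sample = [2.0 * (-math.log(1.0 - _rng.random())) ** (-1.0 / 3.0) for _ in range(500)]

ad_at_truth = ad_objective(make_frechet_cdf(3.0, 2.0), sample)
ad_misspecified = ad_objective(make_frechet_cdf(1.5, 2.0), sample)
```

Because the Anderson–Darling criterion takes logarithms of F and 1 - F, it penalises misfit in both tails more heavily than the Cramér–von Mises criterion does.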
5. Monte Carlo Simulation
In this section, we perform a simulation study to assess the performance of the estimators for the parameters of the HMF distribution. Three different sets of parameter values are used together with the quantile function. The experiment is replicated one thousand times for each sample size . The average biases (ABs) and the mean square errors (MSEs) of the MLE, OLS, WLS, CVM, and ADE are shown in Tables 3–5.
The ABs and MSEs were computed using the relations as follows:
Then,
Table 3 shows the ABs and MSEs of the MLE, OLS, WLS, CVM, and ADE estimators of the parameters. The ABs and MSEs of the estimators decrease as the sample size increases, despite a few fluctuations. The MLE recorded the least ABs and MSEs and thus could be considered the best estimator.
Table 4 shows the ABs and MSEs of the MLE, OLS, WLS, CVM, and ADE estimators of the parameters. The ABs and MSEs of the estimators of the unknown parameters decrease as the sample size increases. The MLE, however, recorded the least ABs and MSEs and was consistent; thus, it could be adjudged the best estimator.
Table 5 shows the ABs and MSEs of the MLE, OLS, WLS, CVM, and ADE estimators of the parameters. The ABs and MSEs for the estimators of the unknown parameters in the first and second cases showed decreasing patterns. The MLE again recorded the least ABs and MSEs and was consistent; thus, it could be adjudged the best estimator.
Based on rankings (from the least to the greatest ABs and MSEs) in Tables 3–5, the MLE is the best estimator of the parameters of the HMF distribution.
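The structure of such a simulation study can be sketched as follows. The definitions used, AB as the average of (estimate - true value) and MSE as the average of (estimate - true value)^2 over the replications, are the usual ones and are assumed here. To keep the example short, it does not simulate the HMF model: it draws from the baseline Fréchet distribution (assumed parameterisation exp{-(beta/x)^alpha}) with known shape and estimates only the scale, using the closed-form MLE beta^alpha = n / Σ x_i^(-alpha). All names are illustrative.

```python
import math
import random

def average_bias(estimates, true_value):
    """Mean of (estimate - true value) over the replications."""
    return sum(e - true_value for e in estimates) / len(estimates)

def mean_square_error(estimates, true_value):
    """Mean of (estimate - true value) ** 2 over the replications."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

def simulate_scale_estimates(alpha, beta, n, replications=1000, seed=2023):
    """Repeat: draw a Fréchet sample of size n, then estimate beta (alpha known)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        sum_inv_pow = 0.0
        for _ in range(n):
            u = rng.random()
            while u == 0.0:  # avoid log(0)
                u = rng.random()
            x = beta * (-math.log(u)) ** (-1.0 / alpha)  # inverse-CDF draw
            sum_inv_pow += x ** (-alpha)
        estimates.append((n / sum_inv_pow) ** (1.0 / alpha))
    return estimates

# AB and MSE for two sample sizes, 1000 replications each.
results = {}
for n in (25, 100):
    est = simulate_scale_estimates(alpha=3.0, beta=2.0, n=n)
    results[n] = (average_bias(est, 2.0), mean_square_error(est, 2.0))
```

The same loop applies to any estimator: only the sampling step and the estimation step change, while the AB and MSE summaries stay as written.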
6. Applications
In this section, the HMF distribution is applied to three datasets to ascertain its versatility. These datasets are the annual maximum temperature of a location in the Upper East Region of Ghana for 1970–2020 (this region records relatively the highest annual temperature values), the annual unemployment rate in Ghana (1991–2021), and the survival times of 128 bladder cancer patients. The annual maximum temperature data in degrees Celsius (°C) used in the analysis were generated from https://www.globalclimatemonitor.org/ using latitude 10.9922 and longitude −1.1133. The data are 27.87, 27.42, 27.7, 28.15, 27.27, 27.2, 27.14, 27.74, 27.48, 27.67, 27.82, 28.3, 27.91, 28.35, 28.26, 28.45, 28.38, 29.15, 28.57, 28.21, 28.9, 28.1, 27.65, 28.38, 27.94, 28.2, 28.28, 28.32, 28.71, 28.27, 28.2, 28.6, 28.49, 28.6, 28.55, 28.88, 28.5, 28.35, 28.17, 28.59, 28.8, 28.65, 28.51, 28.26, 28.5, 28.6, 28.59, 28.37, 28.35, 28.8, and 28.7.
The annual unemployment rate data were retrieved from (https://data.worldbank.org/country/GH). The data are 3.49, 4.70, 5.27, 5.86, 6.44, 7.02, 7.61, 8.20, 10.10, 10.46, 9.50, 8.53, 7.56, 6.59, 5.62, 4.64, 4.84, 4.99, 5.22, 5.38, 5.60, 5.91, 6.20, 6.52, 6.81, 5.53, 4.22, 4.28, 4.32, 4.65, and 4.70.
The bladder cancer data were used by [25, 26]. The data are 0.08, 6.97, 2.46, 9.74, 3.88, 15.96, 4.26, 79.05, 11.79, 8.37, 12.07, 2.09, 9.02, 3.64, 14.76, 5.32, 36.66, 5.41, 1.35, 18.1, 12.02, 21.73, 3.48, 13.29, 5.09, 26.31, 7.39, 1.05, 7.63, 2.87, 1.46, 2.02, 2.07, 4.87, 0.4, 7.26, 0.81, 10.34, 2.69, 17.12, 5.62, 4.4, 3.31, 3.36, 6.94, 2.26, 9.47, 2.62, 14.83, 4.23, 46.12, 7.87, 5.85, 4.51, 6.93, 8.66, 3.57, 14.24, 3.82, 34.26, 5.41, 1.26, 11.64, 8.26, 6.54, 8.65, 13.11, 5.06, 25.82, 5.32, 0.9, 7.62, 2.83, 17.36, 11.98, 8.53, 12.63, 23.63, 7.09, 0.51, 7.32, 2.69, 10.75, 4.33, 1.4, 19.13, 12.03, 22.69, 0.2, 9.22, 2.54, 10.06, 4.18, 16.62, 5.49, 3.02, 1.76, 20.28, 2.23, 13.8, 3.7, 14.77, 5.34, 43.01, 7.66, 4.34, 3.25, 2.02, 3.52, 25.74, 5.17, 32.15, 7.59, 1.19, 11.25, 5.71, 4.5, 3.36, 4.98, 0.5, 7.28, 2.64, 10.66, 2.75, 17.14, 7.93, 6.25, and 6.76.
The performance of the HMF distribution is compared with the classical Fréchet distribution and eight (8) modifications of the classical Fréchet distribution. These eight (8) distributions include the Burr X Fréchet (BRXFR) [2], the odd Lomax Fréchet (OLXF) [3], the Poisson–Fréchet (POF) [4], the new exponential-X Fréchet (NEXF) [5], the Weibull Fréchet (WFR) [6], the modified Fréchet–Rayleigh distribution (MFRD) [9], the Marshall–Olkin Fréchet distribution (MOF) [12], and the modified Fréchet (MF) [16].
The Anderson–Darling (AD), Kolmogorov–Smirnov (K-S), and Cramér–von Mises (CVM) tests were employed to assess the goodness of fit of the selected distributions.
The distribution with the lowest Akaike information criterion (AIC), consistent Akaike information criterion (CAIC), and Bayesian information criterion (BIC) is considered the most appropriate model for the datasets. The AIC, CAIC, and BIC, respectively, are obtained using
Then,
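The model-selection criteria can be computed directly from a fitted model's maximised log-likelihood, the number of estimated parameters k, and the sample size n. In the sketch below, AIC = -2ℓ + 2k and BIC = -2ℓ + k log(n) are the standard definitions. The CAIC form -2ℓ + 2kn/(n - k - 1) is an assumption of this illustration: it is the form often used under this name in related work, whereas other sources define the consistent AIC as -2ℓ + k(log(n) + 1). The function name is hypothetical.

```python
import math

def information_criteria(loglik, k, n):
    """Return AIC, CAIC (assumed form) and BIC for a fitted model.

    loglik: maximised log-likelihood
    k:      number of estimated parameters
    n:      sample size (must exceed k + 1 for the assumed CAIC form)
    """
    if n <= k + 1:
        raise ValueError("n must be greater than k + 1")
    aic = -2.0 * loglik + 2.0 * k
    caic = -2.0 * loglik + 2.0 * k * n / (n - k - 1)  # assumed form, see lead-in
    bic = -2.0 * loglik + k * math.log(n)
    return {"AIC": aic, "CAIC": caic, "BIC": bic}

# Hypothetical input: a four-parameter model fitted to 51 observations.
example = information_criteria(loglik=-100.0, k=4, n=51)
```

All three criteria penalise the log-likelihood for model size, so a four-parameter model such as the HMF must improve the fit enough to offset its extra parameters relative to the two-parameter Fréchet distribution.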
6.1. Annual Maximum Temperature
As shown in Table 6, the least annual maximum temperature value for the location selected was 27.14, while the greatest value was 29.15. The value of the coefficient of skewness is −0.72 and that of the coefficient of kurtosis is −0.13. The annual maximum temperature dataset is negatively skewed and less peaked than the normal curve, thus platykurtic.
The MLEs for the models fitted and their standard errors are shown in Table 7. and for OLXF, for BRXFR, for NEXF, for POF, for WFR, and for MOF were not significant at level of significance, while all others in their respective models were significant at significance level.
The HMF model gives a better fit to the annual maximum temperature dataset than the other nine (9) competing models. As shown in Table 8, the HMF model had the highest log-likelihood value and the lowest AIC, CAIC, and BIC values compared with the other competing models. Additionally, the HMF model had the smallest AD, K-S, and CVM values.
The fitted PDFs and CDFs of the models are, respectively, presented in Figures 3 and 4. It can be seen that the HMF model fits the annual maximum temperature dataset better.
6.2. Ghana Annual Unemployment Rate Data
In Table 9, the least unemployment rate value in Ghana (1991–2021) was 3.49, while the greatest value was 10.46. The value of the coefficient of skewness is 0.96 and that of the coefficient of kurtosis is 0.36. This shows that the annual unemployment rate dataset is positively skewed and also less peaked than the normal curve, thus platykurtic.
The MLEs for the models fitted and their standard errors are shown in Table 10. and for HMF, , , and for OLXF, and for BRXFR, for NEXF, for POF, , and for WFR, for MFRD, and for MOF were not significant at level of significance, while all others in their respective models were significant at significance level.
The HMF model gives a better fit to the annual unemployment dataset than the other nine (9) competing models. As shown in Table 11, the HMF model had the highest log-likelihood value and the lowest AIC, CAIC, and BIC values compared with the other competing models. Also, the HMF model had the smallest AD, K-S, and CVM values.
The fitted PDFs and CDFs of the models are, respectively, presented in Figures 5 and 6. As shown in Figures 5 and 6, the HMF model fits the annual unemployment rate dataset better. The OLXF, BRXFR, NEXF, POF, WFR, MF, and MOF models are also good alternatives for fitting the dataset, as their goodness-of-fit values are close to those of the HMF distribution.
6.3. Bladder Cancer Survival Time
In Table 12, the least survival time value was 0.08, while the greatest value was 79.05. The value of the coefficient of skewness is 3.33 and that of the coefficient of kurtosis is 16.15. The survival time dataset is then highly positively skewed and more peaked than the normal curve, thus leptokurtic.
The MLEs for the models fitted and their standard errors are shown in Table 13. , , and for HMF, , , and for OLXF, for BRXFR, and and for WFR were not significant at level of significance, while all others in their respective models were significant at significance level.
The HMF model gives a better fit to the bladder cancer dataset than the other nine (9) competing models. As shown in Table 14, the HMF model had the highest log-likelihood value and the lowest AIC, CAIC, and BIC values compared with the other competing models. Also, the HMF model had the smallest AD, K-S, and CVM values.
The fitted PDFs and CDFs of the models are, respectively, presented in Figures 7 and 8. As shown in Figures 7 and 8, the HMF model fits the bladder cancer survival time dataset better. The BRXFR, NEXF, WFR, and MF models are also good alternatives for fitting the bladder cancer survival time dataset, as their goodness-of-fit values are close to those of the HMF distribution.
7. Conclusion
The four-parameter harmonic mixture Fréchet (HMF) distribution is presented and studied in detail. The failure rate function of the HMF distribution can be monotonically increasing, monotonically decreasing, or upside-down bathtub shaped for different combinations of the parameter values. Some statistical properties such as moments, incomplete moments, quantile function, entropy, mean deviation, median deviation, mean residual life, moment-generating function (MGF), and stress-strength reliability are presented.
The maximum-likelihood estimation, the ordinary least-squares estimation, the weighted least-squares estimation, the Cramér–von Mises estimation, and the Anderson–Darling estimation were used to estimate the parameters of the model. The simulation results indicate that the maximum-likelihood estimator is the best of the five estimators.
The new distribution was applied to three lifetime datasets and compared with the classical Fréchet distribution and eight (8) other extensions of the Fréchet distribution, and it was found to provide a better fit.
We are committed to providing a detailed Bayesian study for the four-parameter HMF distribution in the future.
Data Availability
The study is aimed at improving methodologies, and the data used have been duly cited within the manuscript.
Conflicts of Interest
The authors declare that they have no conflicts of interest.