A Flexible Extension of Pareto Distribution: Properties and Applications

Alshanbari, Huda M.; Al-Aziz Hosni El-Bagoury, Abd; Gemeay, Ahmed M.; Hafez, E. H.; Eldeeb, Ahmed Sedky

doi:https://doi.org/10.1155/2021/9819200

Computational Intelligence and Neuroscience

Special Issue: Artificial Intelligence and Machine Learning-Driven Decision-Making

Research Article | Open Access

Volume 2021 | Article ID 9819200 | https://doi.org/10.1155/2021/9819200

Huda M. Alshanbari,¹ Abd Al-Aziz Hosni El-Bagoury,² Ahmed M. Gemeay,² E. H. Hafez,³ and Ahmed Sedky Eldeeb⁴,⁵

Academic Editor: Ahmed Mostafa Khalil

Received 21 Jun 2021

Accepted 26 Jul 2021

Published 17 Aug 2021

Abstract

This paper introduces a new mixture distribution obtained by mixing the Fréchet–Weibull and Pareto distributions. Several properties of the new model are derived, including its moments and related measures, moment generating function, mean residual life function, and mean deviation. Different estimation methods are then presented for determining the unknown parameters of the proposed model. Finally, the distribution is fitted to three real data sets and compared with other well-known competing models; on all three data sets it outperforms its competitors. To verify these results, we also examined the existence and uniqueness of the roots of the log-likelihood equations to determine whether they are global maxima.

1. Introduction

Modeling new phenomena is very important in big data and data science. There are many ways to model and represent data, one of which is the statistical modeling of real data sets. Statistical modeling is important in the applied sciences because new applications and phenomena appear all the time, so the need for new distributions keeps growing. Many of the phenomena that arise nowadays need modeling, but the traditional distributions often cannot model them adequately. Researchers therefore sometimes add new parameters, perhaps one or two, to a traditional distribution to overcome these deficiencies in modeling new phenomena. There is, however, another way to overcome the deficiencies of the traditional distributions.

This method forms a mixture of two or three distributions to obtain a new, superior model that can fit data the traditional distributions fail to model. Many authors have worked on the Pareto distribution; see [1], where the authors studied the Pareto-IV distribution and estimated its parameters under accelerated life tests with type-II censored samples. As an example of work on the Fréchet distribution, see [2], where the authors estimated the parameters of the Fréchet distribution under a type-II censoring scheme using classical and Bayesian estimation methods.

A mixture distribution is appropriate for a data set in which different subsets have different properties that are best modeled separately. Mixtures can also be more mathematically tractable, because the individual mixture components are often easier to handle than the overall mixture density. Mixtures of distributions play an important role in reliability theory, insurance risk theory, and the oil industry. Willmot [3] presented the asymptotic tail behavior of Poisson mixtures with applications. Giudici et al. [4] proposed a novel methodology, based on mixtures of products of Dirichlet process priors, which provides a formal inferential tool for assessing the influence of each covariate.

Without characterizing the system, Bučar et al. [5] demonstrated that the reliability of a system can be approximated by a finite Weibull mixture distribution. Nakhi and Kalla [6] discussed a mixture of the hyper-Poisson distribution with a generalized gamma mixing distribution.

Panjer and Willmot [7] studied the estimator of the scale parameter in mixture models and established the inadmissibility of the unusual estimator by exhibiting better estimators. They applied these results to mixtures of normal distributions and mixtures of exponential distributions. Karim et al. [8] introduced the Rayleigh mixture distribution with various weight functions and also treated the case of two correlated Rayleigh random variables.

A random variable X is said to have a mixture distribution if at least one parameter of the distribution of X is itself a random variable. Let $f(x \mid \theta)$ be the probability density function (PDF) of X, where $\theta$ is a parameter of the distribution of X. If $\theta$ is a random variable with PDF $g(\theta)$, then X has a mixture distribution, and the PDF of X is defined as

$$f(x) = \int_{\Theta} f(x \mid \theta)\, g(\theta)\, d\theta .$$
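The mixture definition above can be checked numerically on a simpler case than the one studied in this paper. The sketch below is only an illustration under stated assumptions: it mixes an exponential distribution over a gamma-distributed rate (not the Fréchet–Weibull and Pareto pair), because that integral has a well-known closed form, the Lomax density. All function and parameter names are hypothetical, and the gamma shape is assumed to be at least 1 so that the integrand is finite at zero.

```python
import math

def exponential_pdf(x, rate):
    """Conditional PDF f(x | rate) of an exponential random variable."""
    return rate * math.exp(-rate * x)

def gamma_pdf(rate, shape, b):
    """Mixing PDF g(rate): gamma density with shape `shape` and rate parameter `b`."""
    return b ** shape * rate ** (shape - 1) * math.exp(-b * rate) / math.gamma(shape)

def mixture_pdf(x, shape, b, upper=60.0, steps=6000):
    """Approximate the integral of f(x | rate) * g(rate) over rate with Simpson's rule.

    `upper` truncates the infinite range and `steps` must be even.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        rate = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * exponential_pdf(x, rate) * gamma_pdf(rate, shape, b)
    return total * h / 3.0

def lomax_pdf(x, shape, b):
    """Closed-form result of the exponential-gamma mixture (Lomax density)."""
    return shape * b ** shape / (x + b) ** (shape + 1)

if __name__ == "__main__":
    print(mixture_pdf(1.0, 3.0, 2.0), lomax_pdf(1.0, 3.0, 2.0))
```

The same pattern (conditional density times mixing density, integrated over the random parameter) is what produces the FWMPD density in Section 2, where the integral is evaluated in closed form.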

Extreme value distributions have become one of the most important statistical tools in the applied sciences, and extreme value techniques are increasingly used in many other fields. Extreme value analyses often involve estimating the probability of events that are more extreme than any previously recorded event. The Fréchet and Weibull distributions are among the most important models for extreme values, and many statisticians have studied them because of their importance in fields such as earthquakes, floods, engineering, physics, quality control, and medicine. For more information about the Fréchet and Weibull distributions, see [1, 2, 9, 10].

The main aim of this research is to derive a mixture distribution, called the Fréchet–Weibull mixture Pareto distribution (FWMPD), by mixing the Fréchet–Weibull distribution with the Pareto distribution. The new mixture has significant advantages: it is flexible and versatile, and it can model both symmetric and skewed (asymmetric) data. We now introduce the concept on which the proposed distribution is based.

To make the paper easier to follow, it is organized as follows. In Section 2, we introduce the proposed distribution and the steps to formulate it. In Section 3, we derive some of the statistical properties of the proposed distribution. In Section 4, we introduce eight classical methods for estimating the unknown parameters of the proposed model. For more about classical methods of estimation, see [11–15]. In Section 5, we use three real data sets to assess the performance of the distribution and to show how well it fits different kinds of real data. In Section 6, we present the conclusions of the paper along with the major findings.

2. The Mixture of Fréchet–Weibull and Pareto Distributions

The formulation of the new mixture model is presented in this part of the paper. The Fréchet–Weibull distribution [16] is defined by its PDF and cumulative distribution function (CDF), which involve four parameters: two shape parameters and two scale parameters.

The PDF and the CDF of the Pareto random variable are defined in terms of one scale parameter and one shape parameter.
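As a reference point for the Pareto component, the sketch below codes the classical (type I) Pareto PDF and CDF. It assumes the standard textbook parameterization with a shape parameter and a scale parameter that is also the lower bound of the support; the parameter names are hypothetical, and the paper's own symbols and equation numbering are not reproduced here.

```python
def pareto_pdf(x, shape, scale):
    """Classical Pareto (type I) density: shape * scale**shape / x**(shape + 1) for x >= scale."""
    if x < scale:
        return 0.0
    return shape * scale ** shape / x ** (shape + 1)

def pareto_cdf(x, shape, scale):
    """Classical Pareto (type I) distribution function: 1 - (scale / x)**shape for x >= scale."""
    if x < scale:
        return 0.0
    return 1.0 - (scale / x) ** shape

if __name__ == "__main__":
    # A central finite difference of the CDF should be close to the PDF.
    x, step = 2.5, 1e-6
    slope = (pareto_cdf(x + step, 3.0, 1.0) - pareto_cdf(x - step, 3.0, 1.0)) / (2 * step)
    print(slope, pareto_pdf(x, 3.0, 1.0))
```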

If a random variable X follows the Fréchet–Weibull distribution and one of its four parameters is itself a random variable following the Pareto distribution, then X is said to have the FWMPD. Its PDF and CDF are expressed in terms of shape parameters, scale parameters, and the upper incomplete gamma function.

2.1. Survival and Hazard Functions

The reliability function and its related functions are very useful for studying any lifetime phenomenon. The survival function, hazard function, and reverse hazard function of FWMPD are defined in terms of its PDF and CDF and involve the upper incomplete gamma function.
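The three reliability functions named above follow from any PDF and CDF pair through the standard definitions: survival is one minus the CDF, hazard is the PDF divided by the survival function, and reverse hazard is the PDF divided by the CDF. The sketch below applies these generic definitions to a Weibull distribution purely as a stand-in; it is not the FWMPD formula, and all names are hypothetical.

```python
import math

def survival(cdf, x):
    """Survival function S(x) = 1 - F(x)."""
    return 1.0 - cdf(x)

def hazard(pdf, cdf, x):
    """Hazard function h(x) = f(x) / S(x)."""
    return pdf(x) / survival(cdf, x)

def reverse_hazard(pdf, cdf, x):
    """Reverse hazard function r(x) = f(x) / F(x)."""
    return pdf(x) / cdf(x)

# Stand-in distribution (assumption): Weibull with shape 2 and scale 1.
def weibull_pdf(x, shape=2.0, scale=1.0):
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def weibull_cdf(x, shape=2.0, scale=1.0):
    return 1.0 - math.exp(-((x / scale) ** shape))

if __name__ == "__main__":
    # For this Weibull the hazard is 2 * x, so the value at x = 2 is close to 4.
    print(hazard(weibull_pdf, weibull_cdf, 2.0))
```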

2.2. Asymptotic Behavior

This section contains studies on the behaviors of PDF, CDF, and of FWMPD at and , respectively, as follows:and since , we have

2.3. Impact of Changing Parameters Values

In this section, we show how changing the parameter values affects the PDF, CDF, and related functions of FWMPD, which are plotted in Figures 1–4.

Figure 1(a) shows how the PDF of FWMPD is affected by increasing the value of one parameter while the other four are held fixed, and Figure 1(b) shows the effect of increasing another parameter, again with the remaining four fixed.

Figure 1: PDF of FWMPD for different parameter values; panels (a) and (b).

Figure 2(a) shows how the CDF changes as one parameter increases while the other four are held fixed; the effect is clearly visible in the graph. Figure 2(b) shows how the CDF is affected by increasing another parameter, again with the remaining four fixed.

Figure 2: CDF of FWMPD for different parameter values; panels (a) and (b).

Figure 3(a) shows how the behavior of is changed when the significance increasing happened of the parameter , as we can see this effect very clearly from the graph, where , , , and are still fixed, and Figure 3(b) shows how the behavior of is affected with changing the value of parameter , where , , , and .

Figure 3: panels (a) and (b).

Figure 4(a) shows how the behavior of is changed when the significanceincreasing happened of the parameter , as we can see this effect very clearly from the graph, where , , , and , and Figure 4(b) shows how the behavior of is affected by the change of parameter , where , , , and .

Figure 4: panels (a) and (b).

3. Statistical Properties

In this part of the paper, we introduce the mathematical properties for the proposed distribution. These properties are the moments, moment generating function, mean residual life function, and the mean deviation of the proposed distribution.

3.1. Moments

In this subsection, we present the moments of the proposed distribution. The rth moment about the origin of FWMPD is defined as follows:

By setting r = 1, 2, 3, and 4, we easily obtain the first four moments of FWMPD. The mean and variance of FWMPD follow directly from these. The moments about the origin also determine the central moments about the mean of FWMPD, which are in turn used to determine the coefficients of skewness, kurtosis, and variation.
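The chain from raw moments to the summary coefficients can be sketched in a few lines. The code below uses the standard relations between moments about the origin and central moments; it is distribution-agnostic, the function name and dictionary keys are hypothetical, and the example input is the exponential distribution with rate 1 (raw moments 1, 2, 6, 24) rather than FWMPD.

```python
import math

def summary_from_raw_moments(m1, m2, m3, m4):
    """Convert the first four moments about the origin into summary measures."""
    variance = m2 - m1 ** 2                                       # second central moment
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3                          # third central moment
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4       # fourth central moment
    sd = math.sqrt(variance)
    return {
        "mean": m1,
        "variance": variance,
        "skewness": mu3 / sd ** 3,
        "kurtosis": mu4 / variance ** 2,
        "cv": sd / m1,
    }

if __name__ == "__main__":
    # Exponential(1): expect variance 1, skewness 2, kurtosis 9, CV 1.
    print(summary_from_raw_moments(1.0, 2.0, 6.0, 24.0))
```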

3.2. Moment Generating Function

The moment generating function of FWMPD is given byand its characteristic function is given by

3.3. Mean Residual Life Function

The mean residual life function of a continuous random variable X following FWMPD is defined through its survival function and is expressed in terms of the upper and lower incomplete gamma functions. It also satisfies a property that is important for the mean residual life function.
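For reference, the general definition that this subsection specializes to FWMPD can be written as below. This is the standard textbook form for a nonnegative continuous random variable; the symbols $m(t)$, $S$, and $\mu$ are notation assumed here rather than taken from the paper, and the FWMPD-specific closed form is not reproduced.

```latex
m(t) \;=\; \mathbb{E}\left[X - t \mid X > t\right]
      \;=\; \frac{1}{S(t)} \int_{t}^{\infty} S(x)\, dx ,
\qquad S(t) > 0 ,
\qquad\text{and}\qquad
m(0) \;=\; \int_{0}^{\infty} S(x)\, dx \;=\; \mu .
```

The last identity holds because $S(0) = 1$ for a nonnegative random variable, so the mean residual life at time zero is the ordinary mean.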

3.4. Mean Deviation

The mean deviation about the mean for FWMPD is expressed in terms of the upper and lower incomplete gamma functions. Replacing the mean with any other measure of central tendency gives the mean deviation about that measure.
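The general identity behind this result can be stated as follows, assuming a continuous random variable with support on $(0, \infty)$, PDF $f$, CDF $F$, and finite mean $\mu$. The symbol $\delta(\mu)$ is notation introduced here for illustration.

```latex
\delta(\mu) \;=\; \int_{0}^{\infty} \lvert x - \mu \rvert\, f(x)\, dx
\;=\; \int_{0}^{\mu} (\mu - x) f(x)\, dx \;+\; \int_{\mu}^{\infty} (x - \mu) f(x)\, dx
\;=\; 2\mu F(\mu) \;-\; 2 \int_{0}^{\mu} x f(x)\, dx .
```

The last step uses $\int_{\mu}^{\infty} x f(x)\, dx = \mu - \int_{0}^{\mu} x f(x)\, dx$ and $\int_{\mu}^{\infty} f(x)\, dx = 1 - F(\mu)$. Replacing $\mu$ by the median or another central value $c$ requires returning to the split integral, since the simplification above relies on $\mu$ being the mean.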

4. Classical Methods of Estimation

This section discusses eight classical methods for estimating the parameters of the proposed model. Many papers discuss these methods; for more information, see [17–21]. Obtaining the estimates in explicit form is mathematically complicated, so they are computed numerically using Wolfram Mathematica version 12.0.

4.1. Classical Methods for the Complete Sample

In this subsection, we introduce eight methods of estimation which were used for estimating the parameters of the proposed distribution.

4.1.1. Maximum Likelihood Estimates (MLEs)

Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$ from the PDF (3). The log-likelihood function of the parameters is as follows:

The MLEs of the parameters are obtained by maximizing the log-likelihood function.
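The idea of maximizing a log-likelihood numerically can be sketched on a case where the answer is known. The example below uses the classical Pareto distribution as a stand-in, since the FWMPD log-likelihood is not reproduced here: the scale estimate is the sample minimum, the shape estimate has a closed form, and a simple golden-section search over the shape recovers the same value. The data values, function names, and search interval are all hypothetical, and the paper itself reports using Mathematica rather than this routine.

```python
import math

def pareto_log_likelihood(data, shape, scale):
    """Log-likelihood of a classical Pareto sample; -inf outside the parameter space."""
    if shape <= 0 or scale <= 0 or min(data) < scale:
        return float("-inf")
    n = len(data)
    return n * math.log(shape) + n * shape * math.log(scale) - (shape + 1) * sum(math.log(x) for x in data)

def pareto_mle(data):
    """Closed-form Pareto MLEs: scale = sample minimum, shape = n / sum(log(x / scale))."""
    scale_hat = min(data)
    shape_hat = len(data) / sum(math.log(x / scale_hat) for x in data)
    return shape_hat, scale_hat

def golden_section_max(func, lo, hi, tol=1e-8):
    """Maximize a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) > func(d):
            b = d   # the maximum lies in [a, d]
        else:
            a = c   # the maximum lies in [c, b]
    return (a + b) / 2.0

if __name__ == "__main__":
    sample = [1.2, 1.5, 2.0, 3.1, 4.7]  # illustrative values, not from the paper
    shape_hat, scale_hat = pareto_mle(sample)
    numeric = golden_section_max(lambda a: pareto_log_likelihood(sample, a, scale_hat), 1e-6, 50.0)
    print(shape_hat, numeric)
```

For a five-parameter model such as FWMPD a multivariate optimizer is needed instead of a one-dimensional search, but the objective is built in the same way.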

4.1.2. Ordinary Least-Squares Estimates (OLSEs)

Let $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ be the corresponding order statistics. The OLSEs of the distribution parameters are obtained by minimizing the following function numerically, for example with Mathematica 12 or similar software:

4.1.3. Weighted Least-Squares Estimates (WLSEs)

By minimizing the following equation, the WLSEs of proposed model parameters can be computed:

4.1.4. Anderson–Darling Estimates (ADEs), Right-Tail Anderson–Darling Estimates (RTADEs), and Left-Tail Anderson–Darling Estimates (LTADEs)

The ADEs of the distribution parameters are obtained by minimizing the following function numerically, for example with Mathematica 12 or similar software:

The RTADEs of the distribution parameters are obtained in the same way by minimizing the following function:

The LTADEs of the distribution parameters are obtained in the same way by minimizing the following function:

4.1.5. Cramér–von Mises Estimates (CVMEs) and Maximum Product of Spacing Estimates (MPSEs)

The CVMEs are determined by minimizing

The MPSEs are determined by maximizing the following equation:
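The seven CDF-based criteria in this section can be sketched as plain objective functions of the fitted CDF values at the order statistics. The forms below are the standard ones from the estimation literature and may differ in detail from the paper's own equations, which are not reproduced here. Each function takes a one-argument `cdf` (with parameters already bound) and the data; all names are hypothetical.

```python
import math

def _u(cdf, data):
    """Fitted CDF values at the order statistics."""
    return [cdf(x) for x in sorted(data)]

def ols_objective(cdf, data):
    u = _u(cdf, data); n = len(u)
    return sum((u[i - 1] - i / (n + 1)) ** 2 for i in range(1, n + 1))

def wls_objective(cdf, data):
    u = _u(cdf, data); n = len(u)
    return sum((n + 1) ** 2 * (n + 2) / (i * (n - i + 1)) * (u[i - 1] - i / (n + 1)) ** 2
               for i in range(1, n + 1))

def cvm_objective(cdf, data):
    u = _u(cdf, data); n = len(u)
    return 1.0 / (12 * n) + sum((u[i - 1] - (2 * i - 1) / (2 * n)) ** 2 for i in range(1, n + 1))

def ad_objective(cdf, data):
    u = _u(cdf, data); n = len(u)
    return -n - sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1 - u[n - i]))
                    for i in range(1, n + 1)) / n

def rtad_objective(cdf, data):
    u = _u(cdf, data); n = len(u)
    return n / 2 - 2 * sum(u) - sum((2 * i - 1) * math.log(1 - u[n - i]) for i in range(1, n + 1)) / n

def ltad_objective(cdf, data):
    u = _u(cdf, data); n = len(u)
    return -1.5 * n + 2 * sum(u) - sum((2 * i - 1) * math.log(u[i - 1]) for i in range(1, n + 1)) / n

def mps_objective(cdf, data):
    """Mean log-spacing, to be maximized; spacings use F(x_(0)) = 0 and F(x_(n+1)) = 1."""
    u = [0.0] + _u(cdf, data) + [1.0]
    return sum(math.log(u[i] - u[i - 1]) for i in range(1, len(u))) / (len(u) - 1)

if __name__ == "__main__":
    exp_cdf = lambda x: 1.0 - math.exp(-x)   # stand-in model, not FWMPD
    sample = [0.2, 0.5, 0.9, 1.4, 2.3]       # illustrative values
    print(ols_objective(exp_cdf, sample), mps_objective(exp_cdf, sample))
```

All objectives except `mps_objective` are minimized over the model parameters. With these forms the Anderson–Darling objective equals the sum of its right-tail and left-tail versions, which gives a quick consistency check. Tied observations make a spacing zero and break the logarithm in `mps_objective`, so ties need separate handling.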

5. Modeling Real Data Sets

This section examines the flexibility of the proposed model by fitting it to three real-world data sets and comparing it with other well-known competing models. To show the flexibility of FWMPD, the comparison uses common distributions that are known for their flexibility: the Fréchet–Weibull mixture exponential distribution (FWMED) [22], Weibull distribution (WD), exponential distribution (ED), gamma distribution (GD), and inverse Pareto distribution (IPD) [23].

The competing distributions are compared using goodness-of-fit measures, including the Anderson–Darling (AD), Cramér–von Mises (CM), and Kolmogorov–Smirnov (KS) statistics, the last with its p-value (KS p-value).
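Of the three measures, the KS statistic is the simplest to compute by hand: it is the largest vertical gap between the empirical distribution function and the fitted CDF. The sketch below is a minimal version using the standard two-sided formula; the function name is hypothetical, the p-value computation is omitted, and the uniform CDF is only a stand-in for a fitted model.

```python
def ks_statistic(cdf, data):
    """Two-sided Kolmogorov-Smirnov distance between the empirical CDF and `cdf`."""
    xs = sorted(data)
    n = len(xs)
    # Gap just after each jump of the empirical CDF (value (i + 1) / n at xs[i]).
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    # Gap just before each jump (value i / n immediately to the left of xs[i]).
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)

if __name__ == "__main__":
    uniform_cdf = lambda x: min(max(x, 0.0), 1.0)
    print(ks_statistic(uniform_cdf, [0.1, 0.5, 0.9]))
```

Smaller values indicate a closer fit, which is the sense in which the tables in this section rank the competing models.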

To evaluate the competing models, their parameters are estimated by maximum likelihood, and the goodness-of-fit measures are computed with Wolfram Mathematica version 12.

5.1. Data Set I

This real data set contains the relief times of twenty patients receiving an analgesic. These data were introduced by Clark and Gross [24], page 105, and they are given in Table 1.

Table 2 provides the analytical measures along with the MLEs. The fitted PDF, CDF, SF, and P-P plots of the FWMPD model for the first data set are depicted in Figure 5. The results in Table 2 show that FWMPD provides the best fit to the first data set among the competing models. Figure 6 provides profile-likelihood plots of the FWMPD parameters for the first real data set. These plots illustrate the unimodality of the profile-likelihood functions for all estimated parameters. Table 3 presents the estimates, negative log-likelihood, CM, AD, KS, and KS p-value (KSP) of the proposed model for the eight estimation methods. Figure 7 displays P-P plots for the proposed model under the different estimation methods, along with the PDFs fitted from their results.

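Figure 5: Fitted PDF, CDF, SF, and P-P plots of the FWMPD model for the first data set; panels (a)–(d).

Figure 6: Profile-likelihood plots of the FWMPD parameters for the first data set; panels (a)–(e).

Figure 7: P-P plots and fitted PDFs of the proposed model under the different estimation methods for the first data set; panels (a) and (b).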

5.2. Data Set II

This data set was taken from McCool, and it represents the fatigue lifetime in hours for 10 bearings of certain types. It was studied by Wu and Wong [25], and it is given in Table 4.

Table 5 provides the analytical measures along with the ML estimates. The fitted PDF, CDF, SF, and P-P plots of the FWMPD model for the second data set are depicted in Figure 8. The results in Table 5 show that FWMPD provides the best fit to the second data set among the competing models. Figure 9 provides the profile-likelihood plots of the FWMPD parameters for the second real data set. These plots illustrate the unimodality of the profile-likelihood functions for all estimated parameters. Table 6 presents the estimates, negative log-likelihood, CM, AD, KS, and KS p-value (KSP) of the proposed model for the eight estimation methods. Figure 10 displays the P-P plots for the proposed model under the different estimation methods, along with the PDFs fitted from their results.

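Figure 8: Fitted PDF, CDF, SF, and P-P plots of the FWMPD model for the second data set; panels (a)–(d).

Figure 9: Profile-likelihood plots of the FWMPD parameters for the second data set; panels (a)–(e).

Figure 10: P-P plots and fitted PDFs of the proposed model under the different estimation methods for the second data set; panels (a) and (b).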

5.3. Data Set III

This data set contains the survival times of 45 patients with head and neck cancer; we treat it as a complete data set. For more details about the data, see [26]. This data set is given in Table 7. Table 8 provides the analytical measures along with the ML estimates. The fitted PDF, CDF, SF, and P-P plots of the FWMPD model for the third data set are depicted in Figure 11. The results in Table 8 show that FWMPD provides the best fit to the third data set among the competing models. Figure 12 provides profile-likelihood plots of the FWMPD parameters for the third real data set. These plots illustrate the unimodality of the profile-likelihood functions for all estimated parameters. Table 9 presents the estimates, negative log-likelihood, CM, AD, KS, and KS p-value (KSP) of the proposed model for the eight estimation methods. Figure 13 displays P-P plots for the proposed model under the different estimation methods, along with the PDFs fitted from their results.

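Figure 11: Fitted PDF, CDF, SF, and P-P plots of the FWMPD model for the third data set; panels (a)–(d).

Figure 12: Profile-likelihood plots of the FWMPD parameters for the third data set; panels (a)–(e).

Figure 13: P-P plots and fitted PDFs of the proposed model under the different estimation methods for the third data set; panels (a) and (b).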

5.4. Concluding Remarks on the Results of the Real Data Sets

(1) We fitted the proposed distribution to the three data sets in Tables 1, 4, and 7 and found that it outperforms all its competitors.

(2) The proposed distribution has the smallest KS, AD, and CM values, which supports its superiority.

(3) The proposed distribution has the highest KS p-value, which also supports its superiority.

(4) Figures 6, 9, and 12 provide profile-likelihood plots of the FWMPD parameters for the three real data sets, respectively. These plots indicate that the estimated parameters maximize the log-likelihood function and that these estimates are global maxima.

6. Conclusion and Major Findings

In this article, we introduced a new mixture distribution, FWMPD, and estimated its parameters by eight classical methods: maximum likelihood estimation and seven others. We derived its mathematical properties and plotted its PDF and CDF to study its behavior under different parameter values. We then applied the proposed distribution to three real data sets to compare it with its competitors. Using the KS, AD, and CM statistics and the KS p-value, we found that it has the lowest KS, AD, and CM values and the highest p-values, which makes it the best candidate among the models considered. To check that the roots of the likelihood equations give a maximum, Figures 6, 9, and 12 show the profile-likelihood functions of the model parameters for the three real data sets, respectively. These plots illustrate the unimodality of the profile-likelihood functions for all estimated parameters. We expect the presented model to find a broader range of applications in fields such as engineering, survival and lifetime data analysis, meteorology, hydrology, and economics.

Data Availability

All data are included within the paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This research was funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Fast-track Research Funding Program.

References

[1] A. M. Abd El-Raheem, M. H. Abu-Moussa, M. M. M. El-Din, and E. H. Hafez, “Accelerated life tests under Pareto-IV lifetime distribution: real data application and simulation study,” Mathematics, vol. 8, no. 10, p. 1786, 2020.
[2] F. H. Riad and E. H. Hafez, “Point and interval estimation for frechet distribution based on progressive first failure censored data,” Journal of Statistics Applications & Probability, vol. 9, no. 1, pp. 181–191, 2020.
[3] G. E. Willmot, “Asymptotic tail behaviour of Poisson mixtures with applications,” Advances in Applied Probability, vol. 22, no. 1, pp. 147–159, 1990.
[4] P. Giudici, M. Mezzetti, and P. Muliere, “Mixtures of products of dirichlet processes for variable selection in survival analysis,” Journal of Statistical Planning and Inference, vol. 111, no. 1-2, pp. 101–115, 2003.
[5] T. Bučar, M. Nagode, and M. Fajdiga, “Reliability approximation using finite weibull mixture distributions,” Reliability Engineering & System Safety, vol. 84, no. 3, pp. 241–251, 2004.
[6] Y. B. Nakhi and S. L. Kalla, “On a generalized mixture distribution,” Applied Mathematics and Computation, vol. 169, no. 2, pp. 943–952, 2005.
[7] H. H. Panjer and G. E. Willmot, “Finite sum evaluation of the negative binomial-exponential model,” ASTIN Bulletin, vol. 12, no. 2, pp. 133–137, 1981.
[8] R. Karim, P. Hossain, S. Begum, and F. Hossain, “Rayleigh mixture distribution,” Journal of Applied Mathematics, vol. 2011, Article ID 238290, 17 pages, 2011.
[9] M. M. M. El-Din, M. M. Amein, A. M. Abd El-Raheem, E. H. Hafez, and F. H. Riad, “Bayesian inference on progressive-stress accelerated life testing for the exponentiated weibull distribution under progressive type-ii censoring,” Journal of Statistics Applications & Probability Letters, vol. 7, no. 3, pp. 109–126, 2020.
[10] A. E.-M. A. M. Teamah, A. A. Elbanna, and A. M. Gemeay, “Right truncated fréchet-weibull distribution: statistical properties and application,” Delta Journal of Science, vol. 41, no. 1, pp. 20–29, 2020.
[11] H. M. Aljohani, E. M. Almetwally, A. S. Alghamdi, and E. H. Hafez, “Ranked set sampling with application of modified kies exponential distribution,” Alexandria Engineering Journal, vol. 60, no. 4, pp. 4041–4046, 2021.
[12] M. M. M. El-Din, M. M. Amein, and E. H. Hafez, “Statistical inference and characterizations from independent and identical exponential-Bernoulli mixture distribution,” Journal of Advanced Research in Statistics and Probability, vol. 3, no. 3, pp. 15–31, 2011.
[13] Y. L. Tung, Z. Ahmad, O. Kharazmi, C. B. Ampadu, E. H. Hafez, and S. A. M. Mubarak, “On a new modification of the weibull model with classical and bayesian analysis,” Complexity, vol. 2021, Article ID 5574112, 19 pages, 2021.
[14] W. Wang, Z. Ahmad, O. Kharazmi, C. B. Ampadu, E. H. Hafez, and M. M. Mohie El-Din, “New generalized-x family: modeling the reliability engineering applications,” Plos One, vol. 16, no. 3, Article ID e0248312, 2021.
[15] H. S. Mohammed, Z. Ahmad, A. T. Abdulrahman et al., “Statistical modelling for bladder cancer disease using the NLT-W distribution,” AIMS Mathematics, vol. 6, no. 9, pp. 9262–9276, 2021.
[16] A. A. M. Teamah, A. A. Elbanna, and A. M. Gemeay, “Fréchet-weibull distribution with applications to earthquakes data sets,” Pakistan Journal of Statistics, vol. 36, no. 2, 2020.
[17] A. Z. Afify, A. M. Gemeay, and N. A. Ibrahim, “The heavy-tailed exponential distribution: risk measures, estimation, and application to actuarial data,” Mathematics, vol. 8, no. 8, p. 1276, 2020.
[18] A. A. Al-Babtain, I. Elbatal, H. Al-Mofleh, A. M. Gemeay, A. Z. Afify, and A. M. Sarg, “The flexible burr X-G family: properties, inference, and applications in engineering science,” Symmetry, vol. 13, no. 3, p. 474, 2021.
[19] A. A. Al-Babtain, A. M. Gemeay, and A. Z. Afify, “Estimation methods for the discrete Poisson-lindley and discrete lindley distributions with actuarial measures and applications in medicine,” Journal of King Saud University-Science, vol. 33, no. 2, 2020.
[20] N. M. Alfaer, A. M. Gemeay, H. M. Aljohani, and A. Z. Afify, “The extended log-logistic distribution: inference and actuarial applications,” Mathematics, vol. 9, no. 12, p. 1386, 2021.
[21] M. S. Mukhtar, M. El-Morshedy, M. S. Eliwa, and H. M. Yousof, “Expanded Fréchet model: mathematical properties, copula, different estimation methods, applications and validation testing,” Mathematics, vol. 8, no. 11, 2020.
[22] A.-E. A. M. Teamah, A. A. Elbanna, and A. M. Gemeay, “Frechet-Weibull mixture distribution: properties and applications,” Applied Mathematical Sciences, vol. 14, no. 2, pp. 75–86, 2020.
[23] S. A. Klugman, H. H. Panjer, and G. E. Willmot, Loss Models: From Data to Decisions, vol. 715, John Wiley & Sons, New York, NY, USA, 2012.
[24] V. A. Clark and A. J. Gross, Survival Distributions: Reliability Applications in the Biomedical Sciences, John Wiley & Sons, New York, NY, USA, 1975.
[25] J. Wu and A. C. M. Wong, “Improved interval estimation for the two-parameter Birnbaum-Saunders distribution,” Computational Statistics & Data Analysis, vol. 47, no. 4, pp. 809–821, 2004.
[26] B. Efron, “Logistic regression, survival analysis, and the Kaplan-Meier curve,” Journal of the American Statistical Association, vol. 83, no. 402, pp. 414–425, 1988.

Copyright

Copyright © 2021 Huda M. Alshanbari et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
