Abstract

In this paper, the exponentially generated system is used to extend the two-parameter Chen distribution to a four-parameter distribution with better performance. The resulting density is shown to integrate to one, which confirms that it is a proper probability density function. A simulation study with small and large sample sizes is used to examine the asymptotic behavior of the parameter estimates; the estimates move closer to the true values as the sample size increases. Lifetime datasets are used for model comparison, and the exponentially generated modified Chen distribution fits them better than several existing distributions. It is therefore recommended that the four-parameter Chen distribution be used in place of the well-known two-parameter Chen distribution.

1. Introduction

The Chen distribution is commonly used in survival analysis and in statistical modeling more generally. Its limitations have been reported by researchers working in distribution theory with either the Weibull or the Chen distribution. In particular, it cannot adequately model some survival data sets, especially skewed data and data with heavy or light tails, which prevents its use in some complex models. These shortcomings motivate a modification that makes the distribution more flexible.

Modification or generalization of density functions is necessary in modeling to address challenges that existing density functions cannot capture. An example is the extension of the Weibull distribution from two parameters to three or more parameters. The density function of the two-parameter Weibull distribution is

$$f(x) = \frac{\gamma}{\alpha}\left(\frac{x}{\alpha}\right)^{\gamma-1}\exp\left[-\left(\frac{x}{\alpha}\right)^{\gamma}\right], \quad x > 0.$$

The corresponding cumulative distribution function is

$$F(x) = 1 - \exp\left[-\left(\frac{x}{\alpha}\right)^{\gamma}\right],$$

where α > 0 and γ > 0 are the scale and shape parameters, respectively. This density has some shortcomings, including the inability to exhibit a nonmonotonic hazard shape. Over many years, and using different techniques, researchers have developed various modified forms of the Weibull distribution to achieve nonmonotonic shapes. Bebbington et al. [1] proposed a more flexible two-parameter Weibull extension with a hazard function that can be increasing, decreasing, or bathtub-shaped. In a similar manner, Zhang and Xie [2] proposed a three-parameter truncated Weibull distribution with a bathtub-shaped hazard function. Mudholkar and Srivastava [3] proposed a three-parameter model called the exponentiated Weibull distribution. Another three-parameter model, called the extended Weibull distribution, was proposed by Marshall and Olkin [4]. Xie et al. [5] proposed a three-parameter modified Weibull extension with a bathtub-shaped hazard function to address the shortcoming of the two-parameter Weibull distribution. Lai et al. [6] also generalized the two-parameter Weibull distribution to a more robust three-parameter Weibull distribution, which was later generalized to an exponentiated form by Carrasco et al. [7].
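To illustrate the monotonicity limitation, recall that the hazard of the two-parameter Weibull distribution in the standard scale and shape parameterization is (shape/scale)(x/scale)^(shape − 1). The following minimal sketch (function and argument names are illustrative, not taken from the paper) evaluates this hazard on a grid and checks that, whatever the shape value, it only decreases, stays constant, or increases, so it cannot form a bathtub.

```python
def weibull_hazard(x, scale, shape):
    """Hazard f(x)/S(x) of the two-parameter Weibull distribution for x > 0."""
    return (shape / scale) * (x / scale) ** (shape - 1.0)


def is_monotone(values):
    """True if the sequence is entirely nondecreasing or entirely nonincreasing."""
    pairs = list(zip(values, values[1:]))
    return all(u <= v for u, v in pairs) or all(u >= v for u, v in pairs)


grid = [0.1 * i for i in range(1, 51)]            # x from 0.1 to 5.0
for shape in (0.5, 1.0, 2.0):                     # decreasing, constant, increasing hazard
    hazards = [weibull_hazard(x, 1.0, shape) for x in grid]
    print(shape, is_monotone(hazards))
```

The check is numerical and covers only the grid shown; it is meant as an illustration of the statement above rather than a proof.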

Chen [8] proposed a similar distribution with a nonmonotone property and some researchers worked on the distribution to address its challenges. Abdulzeid et al. [9] worked on the extension of Chen distribution in a paper titled “the modified extended Chen distribution with application to rainfall data.” The Burr–Hatke differential equation was used to generate a three-parameter modified extended Chen distribution. The resulting model was used to model occurrence of rainfall in three locations in Ghana, a country in Africa.

In a similar manner, Méndez-González et al. [10] modified the Chen distribution using the additive methodology, with the Chen distribution serving as the baseline function. According to the researchers, the resulting distribution is highly flexible in describing failure rates with nonmonotonic behavior, including bathtub-shaped curves, compared with other current models. Tarvirdizade and Ahmadpour [11] proposed a new lifetime distribution with increasing, decreasing, and bathtub-shaped hazard rate functions, constructed by compounding the Weibull and Chen distributions and called the Weibull–Chen (W-C) distribution. According to the researchers, the new distribution is more flexible for modeling bathtub-shaped hazard rate data, and its hazard rate function is not complex. Abbas et al. [12] studied the additive Chen-Weibull (ACW) distribution, which has increasing and bathtub-shaped failure rate functions, using Bayesian and non-Bayesian approaches. The researchers obtained the Bayes estimator by assuming half-Cauchy priors under the squared error loss function, using the Laplace approximation.

These studies show that a density function can be modified in several ways, each resulting in different properties. In this paper, the exponentially generalized system is used to modify the Chen distribution, which serves as the baseline function.

2. Materials and Methods

In this section, the exponentially generalized system of Cordeiro et al. [13] is used to derive the cdf of the exponentially generated modified Chen (EGMC) distribution, from which the pdf of the distribution is obtained. Recall that differentiating the cdf with respect to the random variable of interest gives the pdf of the distribution.

2.1. Exponentially Generated Modified Chen Distribution

The exponentially generated class has a cumulative distribution function (cdf) of the following form:

Using F(x) of the Chen distribution, the cumulative distribution function of the exponentially generated modified Chen (EGMC) distribution is defined as follows:

This is the cdf of the EGMC distribution. Taking the derivative of this expression with respect to x, the pdf of the distribution is as follows:
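As a minimal sketch of how such a cdf and pdf can be evaluated numerically, assume that the generating class has the exponentiated generalized form F(x) = [1 − {1 − G(x)}^a]^b and that the Chen baseline is G(x) = 1 − exp{λ(1 − e^{x^β})}. This form, and the names a and b for the two added parameters, are assumptions made for illustration and may differ from the expression and symbols used in the paper. Under this assumption a and λ enter only through the product aλ.

```python
import math


def egmc_cdf(x, a, b, lam, beta):
    """ASSUMED cdf: [1 - exp(a*lam*(1 - exp(x**beta)))]**b for x > 0."""
    if x <= 0.0:
        return 0.0
    if beta * math.log(x) > 6.5:          # x**beta > ~665: cdf is 1 to machine precision
        return 1.0
    z = a * lam * math.expm1(x ** beta)   # a*lam*(exp(x**beta) - 1) >= 0
    return (-math.expm1(-z)) ** b         # (1 - exp(-z))**b


def egmc_pdf(x, a, b, lam, beta):
    """Derivative of egmc_cdf with respect to x, evaluated on the log scale."""
    if x <= 0.0 or beta * math.log(x) > 6.5:
        return 0.0
    t = x ** beta
    z = a * lam * math.expm1(t)
    if z <= 0.0:                          # x so small that z underflows; treated as 0 here
        return 0.0
    log_f = (math.log(a * b * lam * beta) + (beta - 1.0) * math.log(x)
             + t - z + (b - 1.0) * math.log(-math.expm1(-z)))
    return math.exp(log_f)


params = (1.5, 2.0, 0.5, 2.0)             # illustrative values of a, b, lam, beta
print(egmc_cdf(1.0, *params), egmc_pdf(1.0, *params))
```

Working with `expm1` and on the log scale is a design choice to avoid overflow in exp(x^β) and loss of precision near zero.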

2.2. Area under Curve

A proper pdf must integrate to one over its support:

$$\int_{0}^{\infty} f(x)\,dx = 1.$$

It is therefore necessary to show that the new pdf satisfies this property. The cumulative distribution function (cdf) of the exponentially generated modified Chen (EGMC) distribution is given by the following equation:

Its corresponding probability density function (pdf) is as follows:

If f(x) is the pdf of EGMC, then

Let and

By substitution,

Let .

Substituting the abovementioned expressions, we derive the following expression:

Hence, the function is a proper probability density function.
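The area-under-curve property can also be checked numerically. The sketch below assumes, for illustration only, the density obtained from F(x) = [1 − exp{aλ(1 − e^{x^β})}]^b (the names a and b and this exact form are assumptions, not the paper's stated expression) and integrates it with the composite Simpson rule for one illustrative parameter set.

```python
import math


def egmc_pdf(x, a, b, lam, beta):
    """ASSUMED EGMC density (see the text above); 0 outside the numerically safe range."""
    if x <= 0.0 or beta * math.log(x) > 6.5:
        return 0.0
    t = x ** beta
    z = a * lam * math.expm1(t)           # a*lam*(exp(x**beta) - 1)
    if z <= 0.0:
        return 0.0
    return math.exp(math.log(a * b * lam * beta) + (beta - 1.0) * math.log(x)
                    + t - z + (b - 1.0) * math.log(-math.expm1(-z)))


def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * h)
    return total * h / 3.0


params = (1.5, 2.0, 0.5, 2.0)             # illustrative values of a, b, lam, beta
# The density is negligible beyond x = 3 for these values, so [0, 3] stands in for (0, inf).
area = simpson(lambda x: egmc_pdf(x, *params), 0.0, 3.0)
print(area)
```

The truncation point and parameter values are chosen so that the integrand is smooth; other values, especially those with βb < 1, would need a rule that handles the singularity at zero.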

2.3. Survival and Hazard Functions of Generalized Chen Distribution

In this section, the probability density function and the cumulative distribution function of the newly derived distribution are used to obtain its survival and hazard functions, which can be used for modeling in survival analysis.

Let f(x) be the probability density function of a distribution and F(x) the corresponding cumulative distribution function. Then, the survival function S(x) is as follows:

$$S(x) = 1 - F(x).$$

The hazard function h(x) of the distribution is the ratio of the density to the survival function,

$$h(x) = \frac{f(x)}{S(x)},$$

and since S(x) = 1 − F(x), the hazard function can be expressed as follows:

$$h(x) = \frac{f(x)}{1 - F(x)}.$$

Substituting the cdf and pdf of the EGMC distribution into these expressions gives S(x), the survival function, and h(x), the hazard function, of the exponentially generated modified Chen (EGMC) distribution.
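The relations S(x) = 1 − F(x) and h(x) = f(x)/S(x) translate directly into code. The sketch below assumes, for illustration only, the cdf F(x) = [1 − exp{aλ(1 − e^{x^β})}]^b; the names a and b and this form are assumptions rather than the paper's stated expression. With a = b = 1 the assumed form reduces to the Chen baseline, whose hazard is bathtub-shaped when β < 1, and the example evaluates that case.

```python
import math


def egmc_cdf(x, a, b, lam, beta):
    """ASSUMED cdf: [1 - exp(a*lam*(1 - exp(x**beta)))]**b for x > 0."""
    if x <= 0.0:
        return 0.0
    if beta * math.log(x) > 6.5:
        return 1.0
    return (-math.expm1(-a * lam * math.expm1(x ** beta))) ** b


def egmc_pdf(x, a, b, lam, beta):
    """Density corresponding to egmc_cdf; 0 outside the numerically safe range."""
    if x <= 0.0 or beta * math.log(x) > 6.5:
        return 0.0
    t = x ** beta
    z = a * lam * math.expm1(t)
    if z <= 0.0:
        return 0.0
    return math.exp(math.log(a * b * lam * beta) + (beta - 1.0) * math.log(x)
                    + t - z + (b - 1.0) * math.log(-math.expm1(-z)))


def survival(x, *params):
    """S(x) = 1 - F(x)."""
    return 1.0 - egmc_cdf(x, *params)


def hazard(x, *params):
    """h(x) = f(x) / S(x); infinite where S(x) has rounded to zero."""
    s = survival(x, *params)
    return egmc_pdf(x, *params) / s if s > 0.0 else math.inf


chen_like = (1.0, 1.0, 1.0, 0.5)          # a = b = 1 gives the Chen baseline, beta < 1
for x in (0.2, 1.0, 3.0):
    print(x, survival(x, *chen_like), hazard(x, *chen_like))
```

Computing S(x) as 1 − F(x) loses precision deep in the right tail; a production implementation would evaluate the survival function directly.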

2.4. Properties of Exponentially Generated Modify Chen (EGMC) Distribution

In this section, some statistical properties of the new distribution are discussed. The properties include moments, the moment generating function, the characteristic function, the median, mean, variance, mean deviation, incomplete moments, the Lorenz and Bonferroni curves, conditional moments, and order statistics.

2.4.1. Moments

The expression for the pth noncentral moment of EGMC distribution is given in the following theorem.

Theorem 1. Suppose that the random variable X follows the EGMC distribution. Then its pth noncentral moment is given as follows:

and its mean is

Proof. The pth noncentral moment is defined by the following equation:

$$\mu'_p = E\left(X^{p}\right) = \int_{0}^{\infty} x^{p} f(x)\,dx.$$

If f(x) is the pdf of the EGMC distribution, then

Using the binomial expansion,

Substituting equation (20) into (19), we obtain

Applying the power series expansion of the exponential function,

Substituting (22) into (21), we have

Consider the integrand in (23). Taking –y = x^β, x = , and dx = 

By substitution,

By the gamma function, $\Gamma(n) = \int_{0}^{\infty} y^{n-1} e^{-y}\,dy$; therefore, the integrand in (24) becomes

Substituting (25) into (23), we have

The mean is obtained by setting p = 1 in (26), as shown in the following expression:

The qth central moment is given in the following theorem.

Theorem 2. Suppose that the random variable X follows the EGMC distribution. Then its qth central moment is given as follows:

and its variance is given as follows:

Proof. The qth central moment is defined by the following equation:

$$\mu_q = E\left[(X - \mu)^{q}\right] = \sum_{p=0}^{q} \binom{q}{p} (-\mu)^{q-p}\, E\left(X^{p}\right),$$

where µ = E(X) and E(X^p) are given in equations (27) and (26), respectively.

By substitution,

which is the central moment of the EGMC distribution. Its variance is obtained by setting q = 2 in (31) to obtain
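The moment definitions can be checked numerically without the series expansions. The sketch below assumes, for illustration only, the density obtained from F(x) = [1 − exp{aλ(1 − e^{x^β})}]^b (the names a and b and this form are assumptions, not the paper's stated expression). It computes the raw moments by Simpson integration and confirms that the second central moment equals E(X²) − µ².

```python
import math


def egmc_pdf(x, a, b, lam, beta):
    """ASSUMED EGMC density; 0 outside the numerically safe range."""
    if x <= 0.0 or beta * math.log(x) > 6.5:
        return 0.0
    t = x ** beta
    z = a * lam * math.expm1(t)
    if z <= 0.0:
        return 0.0
    return math.exp(math.log(a * b * lam * beta) + (beta - 1.0) * math.log(x)
                    + t - z + (b - 1.0) * math.log(-math.expm1(-z)))


def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * h)
    return total * h / 3.0


PARAMS = (1.5, 2.0, 0.5, 2.0)             # illustrative values of a, b, lam, beta
UPPER = 3.0                               # density is negligible beyond this point


def raw_moment(p):
    """E(X**p), the pth noncentral moment, by numerical integration."""
    return simpson(lambda x: x ** p * egmc_pdf(x, *PARAMS), 0.0, UPPER)


mean = raw_moment(1)
variance = raw_moment(2) - mean ** 2      # second central moment via raw moments
variance_direct = simpson(lambda x: (x - mean) ** 2 * egmc_pdf(x, *PARAMS), 0.0, UPPER)
print(mean, variance, variance_direct)
```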

2.4.2. Moment Generating Function (Mgf)

The moment generating function of EGMC distribution is given in the following theorem.

Theorem 3. Let X follow the EGMC distribution. Then its moment generating function is as follows:

Proof. The moment generating function of a random variable X is given by the following expression:

$$M_X(t) = E\left(e^{tX}\right) = \sum_{p=0}^{\infty} \frac{t^{p}}{p!}\, E\left(X^{p}\right).$$

If X follows the EGMC distribution, then its pth moment is given in (26). Putting (26) into (34),

2.4.3. Characteristic Function

The characteristic function of the distribution is given in the following theorem.

Theorem 4. Suppose the random variable X follows the EGMC distribution. Then its characteristic function is as follows:

Proof. The characteristic function of a random variable X is given by the following expression:

$$\varphi_X(t) = E\left(e^{itX}\right) = \sum_{p=0}^{\infty} \frac{(it)^{p}}{p!}\, E\left(X^{p}\right),$$

where the noncentral moment is given in (26). By substitution, the characteristic function gives the following expression:

2.4.4. Rényi Entropy

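The Rényi entropy of order δ (δ > 0, δ ≠ 1) is defined by $I_R(\delta) = \frac{1}{1-\delta}\log I(\delta)$, where

$$I(\delta) = \int_{0}^{\infty} f^{\delta}(x)\,dx.$$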

If X has the EGMC distribution, then

Hence, the entropy becomes

2.4.5. Incomplete Moments

The pth incomplete moment is defined as

$$m_p(t) = \int_{0}^{t} x^{p} f(x)\,dx.$$

The integrand in (41) gives the following expression:

The integral in (42) is an incomplete gamma function. Therefore,

The first incomplete moment is obtained by setting p = 1 in (43),

2.4.6. Mean Deviation

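The mean deviation about the mean of the EGMC distribution is given by the following expression:

$$\delta_1(X) = \int_{0}^{\infty} |x - \mu|\, f(x)\,dx = 2\mu F(\mu) - 2 m_1(\mu).$$

The mean deviation about the median of the EGMC distribution is defined as follows:

$$\delta_2(X) = \int_{0}^{\infty} |x - M|\, f(x)\,dx = \mu - 2 m_1(M),$$

where µ = E(X) is the first noncentral moment of the EGMC distribution, F(µ) can be obtained from the cdf of the EGMC distribution, M is the median of the EGMC distribution, and m_1(·) can be derived from the first incomplete moment of the EGMC distribution.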

2.4.7. Bonferroni and Lorenz Curves

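The Bonferroni curve of the EGMC distribution is defined as follows:

$$B(p) = \frac{1}{p\mu} \int_{0}^{q} x f(x)\,dx.$$

By substitution,

$$B(p) = \frac{m_1(q)}{p\mu},$$

where q = F^{-1}(p) can be calculated from the quantile function, m_1(q) can be obtained from the first incomplete moment, and µ is the first noncentral moment.

The Lorenz curve of the EGMC distribution is defined as follows:

$$L(p) = \frac{1}{\mu} \int_{0}^{q} x f(x)\,dx = \frac{m_1(q)}{\mu}.$$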
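The median, the quantile q = F^{-1}(p), and the two curves can be computed numerically. The sketch below assumes, for illustration only, the cdf F(x) = [1 − exp{aλ(1 − e^{x^β})}]^b, which can be inverted in closed form; the names a and b and this form are assumptions rather than the paper's stated expression. The first incomplete moment is obtained by Simpson integration.

```python
import math


def egmc_cdf(x, a, b, lam, beta):
    """ASSUMED cdf: [1 - exp(a*lam*(1 - exp(x**beta)))]**b for x > 0."""
    if x <= 0.0:
        return 0.0
    if beta * math.log(x) > 6.5:
        return 1.0
    return (-math.expm1(-a * lam * math.expm1(x ** beta))) ** b


def egmc_pdf(x, a, b, lam, beta):
    """Density corresponding to egmc_cdf; 0 outside the numerically safe range."""
    if x <= 0.0 or beta * math.log(x) > 6.5:
        return 0.0
    t = x ** beta
    z = a * lam * math.expm1(t)
    if z <= 0.0:
        return 0.0
    return math.exp(math.log(a * b * lam * beta) + (beta - 1.0) * math.log(x)
                    + t - z + (b - 1.0) * math.log(-math.expm1(-z)))


def egmc_quantile(u, a, b, lam, beta):
    """Inverse of egmc_cdf for 0 < u < 1, obtained by solving F(x) = u for x."""
    w = -math.log1p(-(u ** (1.0 / b)))            # -ln(1 - u**(1/b))
    return math.log1p(w / (a * lam)) ** (1.0 / beta)


def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * h)
    return total * h / 3.0


PARAMS = (1.5, 2.0, 0.5, 2.0)                     # illustrative values of a, b, lam, beta
MEAN = simpson(lambda x: x * egmc_pdf(x, *PARAMS), 0.0, 3.0)


def lorenz(p):
    """L(p) = m_1(q) / mean with q = F^{-1}(p)."""
    q = egmc_quantile(p, *PARAMS)
    return simpson(lambda x: x * egmc_pdf(x, *PARAMS), 0.0, q) / MEAN


def bonferroni(p):
    """B(p) = L(p) / p."""
    return lorenz(p) / p


median = egmc_quantile(0.5, *PARAMS)
print(median, lorenz(0.5), bonferroni(0.5))
```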

2.4.8. Conditional Moments

The pth conditional moment of EGMC distribution is given in the following theorem.

Theorem 5. Suppose the random variable X follows the EGMC distribution, then its pth conditional moment is given by the following expression:

Proof. The pth conditional moment is defined by the following expression:

$$E\left(X^{p} \mid X > t\right) = \frac{1}{S(t)} \int_{t}^{\infty} x^{p} f(x)\,dx,$$

where S(t) = 1 − F(t). If f(x) is the pdf of the EGMC distribution, then

Considering the integrand in (52), by taking –y = x^β, x = , and dx = 

By substitution,

The integrand in (53) can be expressed as follows:

The first integral is a complete gamma function and the second integral is an incomplete gamma function. Therefore,

By substitution,

2.4.9. The Order Statistics

The pdf of the rth order statistic of a random sample of size n from the EGMC distribution is defined as follows:

$$f_{r:n}(x) = \frac{n!}{(r-1)!\,(n-r)!}\, f(x)\,[F(x)]^{r-1}\,[1 - F(x)]^{n-r}.$$

Substituting the pdf and cdf of the EGMC distribution into (57) gives the following expression:

where r = 1, 2, …, n.

The probability density function of the maximum order statistic, obtained when r = n, is given by the following expression:

The probability density function of the minimum order statistic, obtained when r = 1, is given by the following expression:
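The general order-statistic density depends on the parent distribution only through the values f(x) and F(x), so it can be sketched independently of the EGMC form. In the example below the function name and the index symbol r are illustrative, and a uniform(0, 1) parent (f = 1, F = x) is used purely to keep the arithmetic easy to verify; any pdf and cdf values, including those of the EGMC distribution, can be passed instead.

```python
import math


def order_stat_pdf(f_x, F_x, r, n):
    """Density of the rth order statistic in a sample of size n.

    f_x and F_x are the parent pdf and cdf evaluated at the same point x.
    """
    coef = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))
    return coef * f_x * F_x ** (r - 1) * (1.0 - F_x) ** (n - r)


# Uniform(0, 1) parent at x = 0.5, so f(x) = 1 and F(x) = 0.5.
minimum = order_stat_pdf(1.0, 0.5, r=1, n=2)      # r = 1: n * f * (1 - F)**(n - 1)
maximum = order_stat_pdf(1.0, 0.5, r=3, n=3)      # r = n: n * f * F**(n - 1)
print(minimum, maximum)
```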

2.4.10. Estimation of the Parameters

The maximum likelihood method was adopted to estimate the parameters of the proposed distribution. Taking the partial derivatives of the log-likelihood function and equating them to zero yields a nonlinear system of equations, whose solution gives the ML estimates of the parameters of the new distribution.

Suppose x_1, x_2, …, x_n is a random sample of size n from the exponentially generated modified Chen distribution. Its likelihood function is given by the following expression:

$$L(\beta, k, \lambda, s) = \prod_{i=1}^{n} f(x_i; \beta, k, \lambda, s).$$

Its log-likelihood function is given by the following expression:

$$\ell(\beta, k, \lambda, s) = \sum_{i=1}^{n} \log f(x_i; \beta, k, \lambda, s).$$

The estimates of the parameters β, k, λ, and s are obtained by taking the derivative of the log-likelihood function in (63) with respect to each parameter and equating it to zero. The following equations were obtained:

These equations are nonlinear in the parameters, so a numerical optimization method was used to obtain the parameter estimates.
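A numerical maximization of the log-likelihood can be sketched as follows. The example assumes, for illustration only, the cdf F(x) = [1 − exp{aλ(1 − e^{x^β})}]^b; the names a and b and this form are assumptions rather than the paper's stated expression. Because a and λ enter that assumed form only through their product, the sketch estimates c = aλ together with b and β. The derivative-free compass search is a simple stand-in chosen to keep the example dependency-free; it is not the optimizer used in the paper.

```python
import math
import random


def egmc_logpdf(x, c, b, beta):
    """Log-density of the ASSUMED form with c = a*lam; -inf outside the safe range."""
    if x <= 0.0 or beta * math.log(x) > 6.5:
        return -math.inf
    t = x ** beta
    z = c * math.expm1(t)
    if z <= 0.0 or math.isinf(z):
        return -math.inf
    return (math.log(c) + math.log(b) + math.log(beta) + (beta - 1.0) * math.log(x)
            + t - z + (b - 1.0) * math.log(-math.expm1(-z)))


def log_likelihood(theta, data):
    """Sum of log-densities; theta holds log(c), log(b), log(beta) to keep them positive."""
    c, b, beta = (math.exp(v) for v in theta)
    return sum(egmc_logpdf(x, c, b, beta) for x in data)


def egmc_sample(n, c, b, beta, rng):
    """Inverse-transform sampling from the assumed cdf."""
    out = []
    for _ in range(n):
        u = rng.uniform(1e-12, 1.0 - 1e-12)
        w = -math.log1p(-(u ** (1.0 / b)))
        out.append(math.log1p(w / c) ** (1.0 / beta))
    return out


def compass_search(data, theta, step=0.5, tol=1e-3, max_iter=200):
    """Accept coordinate moves that raise the log-likelihood; halve the step otherwise."""
    best = log_likelihood(theta, data)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(theta)):
            for delta in (step, -step):
                trial = list(theta)
                trial[i] += delta
                value = log_likelihood(trial, data)
                if value > best:
                    theta, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return [math.exp(v) for v in theta], best


rng = random.Random(1)
data = egmc_sample(200, 0.75, 2.0, 2.0, rng)      # illustrative true values of c, b, beta
start = [0.0, 0.0, 0.0]                           # c = b = beta = 1
start_ll = log_likelihood(start, data)
estimates, fit_ll = compass_search(data, start)
print(estimates, start_ll, fit_ll)
```

The search may stop at a local maximum, so in practice several starting values would be tried.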

2.5. Graphs of Exponentially Generated Modify Chen (EGMC) Distribution

Figures 1 and 2 show the probability density function of the distribution for different parameter values. Changing the parameter values changes the position and shape of the density curve, which indicates the flexibility of the distribution. Irrespective of the parameter values, the curve has the form of a proper probability density. Figure 1 in particular shows that the distribution can be used for skewed data, especially positively skewed data. This implies that the modification of the baseline distribution has made it more flexible and suitable for skewed data.

Figures 3 and 4 show the cumulative distribution function of the distribution. Varying the parameter values produces different curve patterns, all of which have the typical structure of a proper cumulative distribution function: each curve is monotonically nondecreasing. The resulting cumulative distribution function is therefore well behaved and can be applied in related studies.

Figures 5 and 6 show the survival function of the distribution for different parameter values. Irrespective of the parameter values used, the typical shape of a survival function is maintained, with the curve converging to zero. This shows the suitability of the new distribution for studying the survival behavior of a variable.

The graphs of the hazard function of the exponentially generated modified Chen (EGMC) distribution (see Figures 7 and 8) show that it can model different categories of hazard shapes, which is an advantage over some of the existing hazard functions.

3. Results

A Monte Carlo simulation was used to study the stability (homogeneity) of the distribution with the aid of R software (see the appendix for the code). Small and large sample sizes were considered (5, 10, 15, …, 100, 200, 500, and 1000), and the simulation was repeated with varying parameter values. The results are as follows.
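The sampling step of such a simulation can be sketched with inverse-transform sampling. The example assumes, for illustration only, the cdf F(x) = [1 − exp{aλ(1 − e^{x^β})}]^b, whose inverse is available in closed form; the names a and b, this form, the seed, and the sample sizes are assumptions, and the Python below is not the R code referred to in the paper. For each sample size it reports the largest gap between the empirical and the assumed theoretical cdf.

```python
import math
import random


def egmc_cdf(x, a, b, lam, beta):
    """ASSUMED cdf: [1 - exp(a*lam*(1 - exp(x**beta)))]**b for x > 0."""
    if x <= 0.0:
        return 0.0
    if beta * math.log(x) > 6.5:
        return 1.0
    return (-math.expm1(-a * lam * math.expm1(x ** beta))) ** b


def egmc_quantile(u, a, b, lam, beta):
    """Inverse of egmc_cdf for 0 < u < 1."""
    w = -math.log1p(-(u ** (1.0 / b)))            # -ln(1 - u**(1/b))
    return math.log1p(w / (a * lam)) ** (1.0 / beta)


def ks_distance(sample, params):
    """Largest absolute gap between the empirical cdf and egmc_cdf."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs, start=1):
        F = egmc_cdf(x, *params)
        gap = max(gap, abs(F - i / n), abs(F - (i - 1) / n))
    return gap


rng = random.Random(2024)                         # fixed seed for reproducibility
params = (1.5, 2.0, 0.5, 2.0)                     # illustrative values of a, b, lam, beta
distances = {}
for n in (50, 500, 5000):
    sample = [egmc_quantile(rng.uniform(1e-12, 1.0 - 1e-12), *params) for _ in range(n)]
    distances[n] = ks_distance(sample, params)
print(distances)
```

A full replication of the study would add a parameter-estimation step for each simulated sample and repeat it over many replicates.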

From Table 1, it can be observed that the estimates approach the true parameter values as the sample size increases, with a marked reduction in the variance and MSE. This shows the stability of the model. To make the table easier to interpret, Minitab software was used to chart the accuracy of the estimates against sample size.
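The accuracy measures reported in such a table can be computed from the replicate estimates as in the minimal sketch below. The function name and the numbers are illustrative and are not taken from Table 1; the variance uses the divisor n so that MSE = variance + bias² holds exactly.

```python
def bias_variance_mse(estimates, true_value):
    """Bias, variance (divisor n), and mean squared error of replicate estimates."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_value
    variance = sum((e - mean_est) ** 2 for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, variance, mse


# Hypothetical replicate estimates of a parameter whose true value is 0.5.
bias, variance, mse = bias_variance_mse([0.4, 0.6, 0.5, 0.7], 0.5)
print(bias, variance, mse)
```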

Figures 9–16 display line charts of the accuracy of the estimates in the model. The charts give a clearer picture of the values in the simulation table and fall into two categories. The first category (Figures 9, 11, 13, and 15) shows the observations for all sample sizes considered; the second (Figures 10, 12, 14, and 16) gives a closer view by omitting the extremely large sample sizes (200, 500, and 1000). The graphs were constructed for each parameter in the model. The stability of each parameter can be seen clearly in the graphs of variance and MSE, while the bias shows the closeness of the estimates to the true value used in the simulation. Figure 9 shows the accuracy of the parameter β with the extremely large sample sizes included, and Figure 10 shows it without them. Likewise, Figure 11 shows the accuracy of the parameter K with the extremely large sample sizes included, and Figure 12 shows it without them. Similar charts were constructed for the parameters λ and S, as shown in Figures 13–16. From the charts, it can be observed that the parameter estimates stabilize as the sample size increases, which shows their asymptotic property.

In Table 1, 0.5 was used as the true value for all the parameters. To examine the behavior of the distribution when the parameter values change, different parameter values were also used. In Table 2, the parameter values were changed to β = 0.2, K = 0.3, λ = 0.4, and S = 0.5. The simulation was run in R with the aid of the maxLik package (Henningsen and Toomet [15]), and the outcome is shown as follows.

Despite the change in parameter values, the results are similar to those in Table 1: the estimates approach the true values as the sample size increases, with a decrease in variance and MSE. This shows the stability of the model. For a better understanding of Table 2, see the figures below.

As with the first group (shown in Table 1), the charts give a clearer picture of the values in Table 2 and fall into two categories. The first category (Figures 17, 19, 21, and 23) shows the observations for all sample sizes considered; the second (Figures 18, 20, 22, and 24) gives a closer view by omitting the extremely large sample sizes (200, 500, and 1000). The graphs were constructed for each parameter in the model. The stability of each parameter can be seen clearly in the graphs of variance and MSE, while the bias shows the closeness of the estimates to the true value used in the simulation. Figure 17 shows the accuracy of the parameter β with the extremely large sample sizes included, and Figure 18 shows it without them. Likewise, Figure 19 shows the accuracy of the parameter K with the extremely large sample sizes included, and Figure 20 shows it without them. Similar charts were constructed for the parameters λ and S in Figures 21–24. From the charts, it can be observed that the parameter estimates stabilize as the sample size increases, which shows their asymptotic property.

3.1. Model Comparison

As part of the performance test, the distribution is compared with existing distributions in the same category using secondary data. Three cases were considered, each using data from previously published research. The comparison was made using the Akaike information criterion (AIC).
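The AIC of a fitted model is 2p − 2ℓ, where p is the number of estimated parameters and ℓ is the maximized log-likelihood, and the model with the smallest AIC is preferred. The sketch below shows the ranking step; the model names and log-likelihood values are hypothetical and are not results from this paper.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2 * n_params - 2 * log_likelihood."""
    return 2.0 * n_params - 2.0 * log_likelihood


# Hypothetical fits: name -> (maximized log-likelihood, number of parameters).
fits = {"model_A": (-120.0, 4), "model_B": (-123.5, 2)}
scores = {name: aic(ll, p) for name, (ll, p) in fits.items()}
best = min(scores, key=scores.get)        # the smallest AIC is preferred
print(scores, best)
```

Because the penalty grows with the number of parameters, a four-parameter model is preferred only when its log-likelihood gain outweighs the extra parameters.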

3.1.1. Case I

The data set is taken from a study in which the exponential distribution was modified to generate the Weibull-exponential distribution. That study concluded, using AIC and log-likelihood values on the data in Table 3, that the Weibull-exponential distribution is better than the exponential distribution. In this study, the Weibull-exponential distribution is compared with the newly generated distribution, again on the basis of AIC and log-likelihood values. The data are the breaking stress of carbon fibers of 50 mm length (GPa) and have previously been used by Nichols and Padgett [16], Cordeiro and Lemonte [17], Al-Aqtash et al. [18], and Oguntunde et al. [19]. The data are as follows (see Table 4).

From Table 5, the EGMC distribution has a higher log-likelihood value and a lower AIC value than the Weibull-exponential distribution. Therefore, it can be concluded that EGMC models the data better than the Weibull-exponential distribution.

3.1.2. Case II

The second data set, on length of 10 mm, was previously used by Ahmed et al. [20] and is taken from Kundu and Raqab [21]. It consists of 63 observations (see Table 6).

The data were previously used to show the suitability and superiority of the transmuted Weibull–Pareto (TWPa) distribution over the Weibull–Pareto (WPa), transmuted Weibull–Lomax (TWL), transmuted complementary Weibull (TCW), and McDonald–Lomax (McL) distributions. Using the same data to compare these distributions with EGMC, the output is as follows.

Table 7 shows the superiority of the EGMC distribution over the five existing distributions. For the data taken from Ahmed et al. [20] on length of 10 mm, the most appropriate model is the EGMC distribution, as it has the lowest AIC value among the distributions compared.

3.1.3. Case III

These data consist of the lifetimes (in years) of 40 blood cancer (leukemia) patients from one of the Ministry of Health hospitals in Saudi Arabia, reported in [22]. The actual data are as follows (see Tables 8 and 9).

APKumW is the alpha power Kumaraswamy–Weibull, EGW is the exponentiated generalized Weibull, and EKumW is the exponential Kumaraswamy–Weibull distribution. As shown in Table 10, EGMC has the lowest Akaike information criterion value when compared with APKumW, EGW, and EKumW. Therefore, EGMC is the preferred model for the data on blood cancer patients.

4. Summary and Conclusions

In this paper, the exponentially generated system was used to modify the Chen distribution. The two-parameter Chen distribution was used as the baseline function, which results in a four-parameter Chen distribution. The newly generated distribution, EGMC, was checked for completeness using the area-under-curve property of a proper probability density function. The statistical properties of the EGMC distribution were studied, including moments, the moment generating function, the characteristic function, the median, mean, variance, mean deviation, incomplete moments, the Lorenz and Bonferroni curves, conditional moments, and order statistics. The parameters of the model were estimated using the maximum likelihood method.

Graphs of the probability density function, cumulative distribution function, survival function, and hazard function of the distribution were plotted using different parameter values. The Monte Carlo simulation approach was also used to study the stability (homogeneity) of the distribution. In the simulation, three replicates were used at varying parameter values and different sample sizes of 50, 100, 200, and 500. Bias and mean square error (MSE) were used to assess the estimates. The estimates approach the true values of the parameters as the sample size increases, with a marked reduction in bias and MSE. On this basis, it was concluded that the resulting distribution is stable and can be used for modeling.

In addition, the newly modified distribution was shown to model some data better than some of the existing distributions. The EGMC distribution was compared with existing distributions in its category using lifetime datasets such as the data on length of 10 mm rod previously used by Kundu and Raqab and Ahmed et al. In the ranking of the distributions using AIC, EGMC performed better than the distributions it was compared with. Also, in the comparison of EGMC with the alpha power Kumaraswamy–Weibull, exponentiated generalized Weibull, and exponential Kumaraswamy–Weibull distributions, EGMC had the lowest AIC value, which shows its superiority in modeling the data used.

Data Availability

The secondary data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgments

This research is financially supported by the authors.