Abstract

In this study, a new one-parameter Log-XLindley distribution is proposed to analyze proportion data. Some of its statistical and reliability properties, including moments with associated measures, the hazard rate function, the reversed hazard rate, stress-strength reliability, and the mean residual life function, are obtained in closed form, which allows researchers to model data with little CPU time. It is found that the density function of the introduced distribution can serve as a statistical tool for modeling asymmetric data. Moreover, the failure rate function can be utilized to model different types of failures, including increasing, bathtub, and J-shaped. The model parameter is estimated using various estimation approaches in order to identify the estimator that models real data most accurately. A Monte-Carlo simulation study for different sample sizes is performed to assess the performance of the estimators based on several statistical criteria. Finally, two distinctive data sets from the SC16 and P3 algorithms for estimating unit capacity factors are analyzed to illustrate the flexibility of the new model.

1. Introduction

Over the last decade, researchers have shown strong interest in developing new extended distributions by adding shape parameters to baseline distributions. The primary goal of such work is to improve the modeling abilities of distributions and provide new opportunities to capture various features of data sets. Distributions with unbounded support have received most of this attention. However, in many real-life circumstances, such as percentages and proportions, observations can only take values within a limited range [1].

Among the unit distributions, the beta distribution is the most well-known distribution. It is frequently utilized in different fields of research, such as economics, biology, and medical sciences. The major flaw of the beta distribution is that its cumulative distribution function (CDF) cannot be written in a closed explicit form. Thus, the researchers proposed and studied various unit distributions. Among the most useful unit distributions, there are the Johnson distribution [2], Topp–Leone distribution [3], unit-gamma distribution [4], Kumaraswamy distribution [5], log-Lindley distribution [6], unit-logistic distribution [7], unit-Birnbaum–Saunders distribution [8], log-xgamma distribution [9], unit-Lindley distribution [10], unit-Gompertz distribution [11], unit-inverse Gaussian distribution [12], unit-Burr III distribution [13], log-weighted exponential distribution [14], unit-Weibull distribution [15], unit-Modified Burr III distribution [16], unit-Rayleigh distribution [17], Frechet power function distribution [18], and unit-Burr XIII distribution [19].

The purpose of this work is to introduce a new distribution for modeling data sets on the unit interval. To achieve this aim, the XLindley distribution is used to construct a new model. The proposed distribution is entitled the Log-XLindley distribution and has one positive shape parameter. The Log-XLindley distribution offers many benefits over well-known unit interval distributions, including the beta and Kumaraswamy distributions; it is preferable because of its simple structure and the flexibility of its hazard rate. The statistical properties, including moments, skewness, kurtosis, stress-strength reliability, and mean residual life, may be derived in explicit forms.

The paper is organized as follows: In Section 2, we propose the Log-XLindley distribution. The statistical characteristics are derived in Section 3. In Section 4, we derive the reliability properties of the proposed distribution. Several estimation techniques are used to estimate the model parameter in Section 5. In Section 6, a Monte-Carlo simulation analysis is carried out to assess the finite-sample performance of the parameter estimation techniques. Two real data sets, “SC16 and P3 algorithms: estimating unit capacity factors,” supported on the interval (0, 1), are analyzed to show the Log-XLindley distribution’s flexibility in Section 7. Finally, some concluding remarks on the proposed model are reported in Section 8.

2. The Log-XLindley Distribution

A random variable X is said to have the XLindley distribution with shape parameter \theta > 0 if its density function can be expressed as

f(x; \theta) = \frac{\theta^{2}(2+\theta+x)}{(1+\theta)^{2}}\, e^{-\theta x}, \qquad x > 0. (1)

The Log-XLindley distribution is derived from the XLindley distribution via the logarithmic transformation Y = e^{-X}, where X follows the XLindley distribution. The probability density function (PDF) with support (0, 1) is given by

f(y; \theta) = \frac{\theta^{2}(2+\theta-\log y)}{(1+\theta)^{2}}\, y^{\theta-1}, \qquad 0 < y < 1, (2)

where \theta > 0 is the shape parameter. The corresponding CDF to (2) can be formulated as

F(y; \theta) = \left(1 - \frac{\theta \log y}{(1+\theta)^{2}}\right) y^{\theta}, \qquad 0 < y < 1. (3)

For the Log-XLindley distribution, the limiting behavior of the PDF at the lower and upper limits of the support is

\lim_{y \to 0^{+}} f(y; \theta) = \begin{cases} \infty, & \theta \le 1, \\ 0, & \theta > 1, \end{cases} \qquad \lim_{y \to 1^{-}} f(y; \theta) = \frac{\theta^{2}(2+\theta)}{(1+\theta)^{2}}.

Figure 1 illustrates some plots of the Log-XLindley density for selected values of the shape parameter \theta.

It is noted that the density function can be used as a probabilistic tool to discuss and analyze asymmetric data. Moreover, the shape of the density function can be either decreasing or unimodal, so the proposed model can be used effectively in modeling different types of data sets in various fields.
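The following is a minimal R sketch of the density, CDF, and a random-number generator for the Log-XLindley distribution, written under the assumption that Y = e^{-X} with X following the XLindley distribution; the function names and the symbol theta are illustrative rather than the paper's notation.

dlogxl <- function(y, theta) {
  ifelse(y > 0 & y < 1,
         theta^2 * (2 + theta - log(y)) * y^(theta - 1) / (1 + theta)^2,
         0)
}
plogxl <- function(y, theta) {
  ifelse(y <= 0, 0,
         ifelse(y >= 1, 1,
                (1 - theta * log(y) / (1 + theta)^2) * y^theta))
}
rlogxl <- function(n, theta) {
  # the XLindley law is a two-component mixture of Exp(theta) and Gamma(2, theta)
  p1 <- theta * (2 + theta) / (1 + theta)^2
  x <- ifelse(runif(n) < p1,
              rexp(n, rate = theta),
              rgamma(n, shape = 2, rate = theta))
  exp(-x)  # logarithmic transformation to the unit interval
}
integrate(dlogxl, lower = 0, upper = 1, theta = 1.5)  # sanity check: should be 1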

3. Statistical Properties

3.1. Moments and Incomplete Moments with Associated Measures

If the random variable Y has the Log-XLindley distribution with parameter \theta, then the rth ordinary moment (OM) can be expressed in an explicit form as follows:

\mu_{r}' = E(Y^{r}) = \int_{0}^{1} y^{r} f(y; \theta)\, dy.

After simple computations, the OM can be expressed as

\mu_{r}' = \frac{\theta^{2}\left[(2+\theta)(\theta+r)+1\right]}{(1+\theta)^{2}(\theta+r)^{2}}.

The first four moments around the origin can be obtained by substituting r = 1, 2, 3, and 4, respectively, in the above expression.

Thus, the variance and index of dispersion (IoD) of the proposed model can be formulated as \mathrm{Var}(Y) = \mu_{2}' - (\mu_{1}')^{2} and \mathrm{IoD} = \mathrm{Var}(Y)/\mu_{1}', respectively. Based on the IoD measure, the introduced model can be used to model data with different degrees of dispersion. Moreover, the coefficients of skewness and kurtosis can be obtained from well-known relations via the OM property. If the random variable Y has the Log-XLindley distribution with parameter \theta, then the rth incomplete moment (ICM) can be expressed as

m_{r}(t) = \int_{0}^{t} y^{r} f(y; \theta)\, dy.

Using equation (2), we get

m_{r}(t) = \frac{\theta^{2}}{(1+\theta)^{2}} \int_{0}^{t} y^{\theta+r-1}\, (2+\theta-\log y)\, dy.

After simple algebra, the ICM can be formulated as

m_{r}(t) = \frac{\theta^{2}\, t^{\theta+r}\left[(\theta+r)(2+\theta-\log t)+1\right]}{(1+\theta)^{2}(\theta+r)^{2}}.

The ICM can be used to measure inequality, including income quintiles, the Lorenz curve, Pietra, and Gini measures of inequality, among others.
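As an illustration, the moment-based measures above can be checked numerically in R; the sketch below reuses the dlogxl density defined in Section 2, and the value theta = 1.5 is an arbitrary illustrative choice.

raw_moment <- function(r, theta) {
  integrate(function(y) y^r * dlogxl(y, theta), lower = 0, upper = 1)$value
}
theta <- 1.5                      # arbitrary illustrative value
m1 <- raw_moment(1, theta); m2 <- raw_moment(2, theta)
m3 <- raw_moment(3, theta); m4 <- raw_moment(4, theta)
v   <- m2 - m1^2                  # variance
iod <- v / m1                     # index of dispersion
skw <- (m3 - 3 * m1 * m2 + 2 * m1^3) / v^1.5                 # coefficient of skewness
krt <- (m4 - 4 * m1 * m3 + 6 * m1^2 * m2 - 3 * m1^4) / v^2   # kurtosis
inc_moment <- function(r, t, theta) {                        # r-th incomplete moment up to t
  integrate(function(y) y^r * dlogxl(y, theta), lower = 0, upper = t)$value
}
c(mean = m1, variance = v, IoD = iod, skewness = skw, kurtosis = krt,
  inc1_at_half = inc_moment(1, 0.5, theta))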

4. Reliability Measures

4.1. Hazard (Reversed) Rate, Cumulative Hazard Function, and Mills Ratio

If the random variable Y has the Log-XLindley distribution, then the survival function (SF) and its hazard rate can be written, respectively, as follows:

S(y; \theta) = 1 - \left(1 - \frac{\theta \log y}{(1+\theta)^{2}}\right) y^{\theta},

h(y; \theta) = \frac{f(y; \theta)}{S(y; \theta)} = \frac{\theta^{2}(2+\theta-\log y)\, y^{\theta-1}}{(1+\theta)^{2} - \left[(1+\theta)^{2} - \theta \log y\right] y^{\theta}}.

Mathematically, the hazard rate function (HRF) of the proposed model can be bathtub-shaped, J-shaped, or increasing, depending on the value of the shape parameter. Figure 2 shows some plots of the HRF for selected values of the parameter \theta.

Regarding Figure 2, it is noted that the HRF can take different shapes, including bathtub, J-shaped, and increasing. Thus, the HRF of the proposed model can be used to discuss various types of data in different fields. The reversed hazard rate is

r(y; \theta) = \frac{f(y; \theta)}{F(y; \theta)} = \frac{\theta^{2}(2+\theta-\log y)}{y\left[(1+\theta)^{2} - \theta \log y\right]}.

The cumulative hazard function and the Mills ratio can be expressed, respectively, as

H(y; \theta) = -\log S(y; \theta), \qquad m(y; \theta) = \frac{S(y; \theta)}{f(y; \theta)}.
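For completeness, these reliability measures can be written directly from their definitions in R; the sketch reuses the dlogxl and plogxl functions from Section 2, and the parameter values in the plot are arbitrary illustrative choices.

hrf   <- function(y, theta) dlogxl(y, theta) / (1 - plogxl(y, theta))   # hazard rate
rhrf  <- function(y, theta) dlogxl(y, theta) / plogxl(y, theta)         # reversed hazard rate
chf   <- function(y, theta) -log(1 - plogxl(y, theta))                  # cumulative hazard
mills <- function(y, theta) (1 - plogxl(y, theta)) / dlogxl(y, theta)   # Mills ratio
curve(hrf(x, 0.5), from = 0.01, to = 0.99, xlab = "y", ylab = "h(y)")   # shape exploration
curve(hrf(x, 1.5), add = TRUE, lty = 2)
curve(hrf(x, 3.0), add = TRUE, lty = 3)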

4.2. Stress-Strength Reliability (SSR)

Theorem 1. If the random variables X and Y have the Log-XLindley distribution with parameters \theta_{1} and \theta_{2}, respectively, then the SSR, R = P(Y < X), can be formulated in an explicit form, as given in (19).

Proof. The SSR can be derived from the relation R = P(Y < X) = \int_{0}^{1} F_{Y}(x; \theta_{2})\, f_{X}(x; \theta_{1})\, dx. Using (2) and (3), the integrand is written in terms of the Log-XLindley PDF and CDF, and after integration and simplification, the SSR reduces to the expression in (19).
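A quick numerical check of Theorem 1 is possible by integrating the product of the assumed CDF and PDF in R; the parameter values below are illustrative only, and the functions are those sketched in Section 2.

ssr <- function(theta_x, theta_y) {
  # R = P(Y < X) = integral of F_Y(t) * f_X(t) over (0, 1)
  integrate(function(t) plogxl(t, theta_y) * dlogxl(t, theta_x),
            lower = 0, upper = 1)$value
}
ssr(2.0, 0.8)   # illustrative parameter values
ssr(1.0, 1.0)   # equal parameters should return 0.5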

4.3. Mean Residual Life (MRL)

If the random variable Y has the Log-XLindley distribution with parameter \theta, then the MRL can be expressed in an explicit form as follows:

\mathrm{MRL}(t) = E(Y - t \mid Y > t) = \frac{1}{S(t; \theta)} \int_{t}^{1} S(y; \theta)\, dy.

Using (14) and after simple modification, the MRL can be obtained in closed form.

Mathematically, the MRL of the new model can be unimodal, inverse J-shaped, or decreasing, depending on the value of the shape parameter.
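Similarly, the MRL can be evaluated from its definition by numerical integration; this sketch again assumes the plogxl CDF from Section 2, with illustrative arguments.

mrl <- function(t, theta) {
  s <- function(y) 1 - plogxl(y, theta)   # survival function
  integrate(s, lower = t, upper = 1)$value / s(t)
}
sapply(c(0.1, 0.3, 0.5, 0.7, 0.9), mrl, theta = 1.5)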

5. Estimation Techniques

5.1. Maximum Likelihood Estimator

Let y_{1}, y_{2}, \ldots, y_{n} be a random sample from the Log-XLindley distribution; then, the log-likelihood function is given by

\ell(\theta) = 2n \log \theta - 2n \log(1+\theta) + \sum_{i=1}^{n} \log(2+\theta-\log y_{i}) + (\theta-1) \sum_{i=1}^{n} \log y_{i}. (25)

Taking the derivative of (25) with respect to the parameter \theta, the following equation is obtained:

\frac{\partial \ell(\theta)}{\partial \theta} = \frac{2n}{\theta} - \frac{2n}{1+\theta} + \sum_{i=1}^{n} \frac{1}{2+\theta-\log y_{i}} + \sum_{i=1}^{n} \log y_{i}.

The maximum likelihood estimator of \theta is obtained by solving the nonlinear equation \partial \ell(\theta)/\partial \theta = 0. This equation cannot be solved analytically, so a numerical approach such as the Newton–Raphson method is used.
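Since the likelihood involves a single parameter, a one-dimensional optimizer is sufficient in practice. The R sketch below simulates a sample with the rlogxl generator from Section 2 (the sample size, seed, and true value 1.5 are illustrative) and maximizes the log-likelihood with optimize.

negloglik <- function(theta, y) -sum(log(dlogxl(y, theta)))
set.seed(1)
y <- rlogxl(100, theta = 1.5)                        # simulated sample
fit <- optimize(negloglik, interval = c(0.01, 50), y = y)
fit$minimum                                          # maximum likelihood estimate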

5.2. Least Squares (LS) and Weighted Least Squares (WLS) Estimators

Let y_{(1)} \le y_{(2)} \le \cdots \le y_{(n)} be an ordered sample of size n from the Log-XLindley distribution. Then, the LS estimator (LSE) of the Log-XLindley parameter can be derived by minimizing

\mathrm{LS}(\theta) = \sum_{i=1}^{n} \left[ F(y_{(i)}; \theta) - \frac{i}{n+1} \right]^{2}

with respect to the parameter \theta. The WLS estimator (WLSE) of \theta can be determined by minimizing

\mathrm{WLS}(\theta) = \sum_{i=1}^{n} \frac{(n+1)^{2}(n+2)}{i(n-i+1)} \left[ F(y_{(i)}; \theta) - \frac{i}{n+1} \right]^{2}

with respect to \theta.
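The two objective functions can be coded directly from their standard forms; the sketch below reuses the plogxl CDF from Section 2 and the simulated sample y from the maximum likelihood sketch above.

ls_obj <- function(theta, y) {
  ys <- sort(y); n <- length(ys); i <- seq_len(n)
  sum((plogxl(ys, theta) - i / (n + 1))^2)
}
wls_obj <- function(theta, y) {
  ys <- sort(y); n <- length(ys); i <- seq_len(n)
  w <- (n + 1)^2 * (n + 2) / (i * (n - i + 1))
  sum(w * (plogxl(ys, theta) - i / (n + 1))^2)
}
lse  <- optimize(ls_obj,  interval = c(0.01, 50), y = y)$minimum
wlse <- optimize(wls_obj, interval = c(0.01, 50), y = y)$minimum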

5.3. Anderson–Darling (AD) and Right Tail Anderson–Darling (RAD) Estimators

The AD estimator (ADE) is a minimum distance-based estimator. It can be obtained by minimizing

\mathrm{AD}(\theta) = -n - \frac{1}{n} \sum_{i=1}^{n} (2i-1)\left[ \log F(y_{(i)}; \theta) + \log S(y_{(n+1-i)}; \theta) \right]

with respect to the parameter \theta, whereas the RAD estimator (RADE) of the model parameter can be derived by minimizing

\mathrm{RAD}(\theta) = \frac{n}{2} - 2 \sum_{i=1}^{n} F(y_{(i)}; \theta) - \frac{1}{n} \sum_{i=1}^{n} (2i-1) \log S(y_{(n+1-i)}; \theta)

with respect to the parameter \theta.
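The AD and RAD objective functions in their usual forms can be implemented in the same way, again based on the assumed plogxl CDF and the simulated sample y.

ad_obj <- function(theta, y) {
  ys <- sort(y); n <- length(ys); i <- seq_len(n)
  Fi <- plogxl(ys, theta)
  -n - mean((2 * i - 1) * (log(Fi) + log(1 - rev(Fi))))
}
rad_obj <- function(theta, y) {
  ys <- sort(y); n <- length(ys); i <- seq_len(n)
  Fi <- plogxl(ys, theta)
  n / 2 - 2 * sum(Fi) - mean((2 * i - 1) * log(1 - rev(Fi)))
}
ade  <- optimize(ad_obj,  interval = c(0.01, 50), y = y)$minimum
rade <- optimize(rad_obj, interval = c(0.01, 50), y = y)$minimum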

5.4. Cramer–Von Mises Estimator (CVME)

The CVME is a minimum distance-based estimator. The CVME of the Log-XLindley parameter can be obtained by minimizing

\mathrm{CVM}(\theta) = \frac{1}{12n} + \sum_{i=1}^{n} \left[ F(y_{(i)}; \theta) - \frac{2i-1}{2n} \right]^{2}

with respect to the parameter \theta.
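The Cramer–von Mises objective function is minimized in the same manner as the other minimum-distance criteria, using the assumed plogxl CDF and the sample y from the earlier sketch.

cvm_obj <- function(theta, y) {
  ys <- sort(y); n <- length(ys); i <- seq_len(n)
  1 / (12 * n) + sum((plogxl(ys, theta) - (2 * i - 1) / (2 * n))^2)
}
cvme <- optimize(cvm_obj, interval = c(0.01, 50), y = y)$minimum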

5.5. Maximum Product of Spacing Estimator (MPSE)

For i = 1, 2, \ldots, n+1, let D_{i}(\theta) = F(y_{(i)}; \theta) - F(y_{(i-1)}; \theta) denote the uniform spacings of a random sample from the Log-XLindley distribution, where F(y_{(0)}; \theta) = 0, F(y_{(n+1)}; \theta) = 1, and \sum_{i=1}^{n+1} D_{i}(\theta) = 1. The MPSE of the parameter \theta can be derived by maximizing the geometric mean of the spacings,

G(\theta) = \left[ \prod_{i=1}^{n+1} D_{i}(\theta) \right]^{1/(n+1)},

with respect to the parameter \theta.
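In practice, the MPSE is computed by maximizing the mean logarithm of the spacings, which is equivalent to maximizing their geometric mean. The guard against zero spacings in the sketch below is a practical safeguard, not part of the formal definition; the plogxl CDF and the sample y come from the earlier sketches.

mps_obj <- function(theta, y) {
  Fi <- c(0, plogxl(sort(y), theta), 1)
  d  <- pmax(diff(Fi), .Machine$double.eps)   # guard against zero spacings from ties
  -mean(log(d))                               # negative mean log-spacing, to be minimized
}
mpse <- optimize(mps_obj, interval = c(0.01, 50), y = y)$minimum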

6. Monte-Carlo Simulation

To assess the performance of the estimators listed in the previous section, we conducted a comprehensive simulation study. We used the Log-XLindley distribution to generate samples of different sizes under several values of the parameter \theta and then calculated the average values (AVEs) of the estimators together with their mean square errors (MSEs), average absolute biases (ABBs), and mean relative errors (MREs). The ABBs, MREs, and MSEs are given by

\mathrm{ABB} = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{\theta}_{i} - \theta \right|, \qquad \mathrm{MRE} = \frac{1}{N} \sum_{i=1}^{N} \frac{\left| \hat{\theta}_{i} - \theta \right|}{\theta}, \qquad \mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{\theta}_{i} - \theta \right)^{2},

where N is the number of simulation replications and \hat{\theta}_{i} is the estimate from the ith replication.

We ran the simulation 5000 times to derive these metrics for all estimation approaches. The findings in Tables 1–5 were obtained using the optim function in R with the CG method. The results show that, as the sample size increased, the AVEs became closer to the true value of \theta. Furthermore, as the sample size increases, the ABBs, MREs, and MSEs of all estimators decrease. This indicates that the estimation techniques above work quite well in estimating the model parameter.
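A reduced version of this experiment can be reproduced with the R sketch below; it uses maximum likelihood only, 500 replications, and a true value of 1.5, all of which are illustrative settings rather than those behind Tables 1–5, and it relies on the rlogxl and dlogxl functions sketched in Section 2.

simulate_mle <- function(n, theta, nrep = 500) {
  est <- replicate(nrep, {
    y <- rlogxl(n, theta)
    optimize(function(t) -sum(log(dlogxl(y, t))), interval = c(0.01, 50))$minimum
  })
  c(n = n,
    AVE = mean(est),
    ABB = mean(abs(est - theta)),
    MSE = mean((est - theta)^2),
    MRE = mean(abs(est - theta) / theta))
}
t(sapply(c(25, 50, 100, 200), simulate_mle, theta = 1.5))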

7. Data Analysis: Data of SC16 and P3 Algorithms

In this section, we consider two data sets to show the applicability and flexibility of the introduced distribution relative to well-known distributions. Here, we compare the Log-XLindley model with some competitive models, namely the Kumaraswamy (Kw) and beta (B) distributions, whose PDFs can be formulated, respectively, as follows:

f_{\mathrm{Kw}}(x; a, b) = a b\, x^{a-1} (1 - x^{a})^{b-1}, \qquad 0 < x < 1,

f_{\mathrm{B}}(x; a, b) = \frac{x^{a-1} (1-x)^{b-1}}{B(a, b)}, \qquad 0 < x < 1,

where a, b > 0 are shape parameters and B(\cdot, \cdot) is the beta function.

Some criteria, such as the Kolmogorov–Smirnov (KS) test with its p value, have been used to identify the best model among all the tested distributions. The following data are from [20, 21], which compare two different algorithms, called SC16 and P3, for estimating unit capacity factors. The observations resulting from the algorithm SC16 are 0.853, 0.759, 0.866, 0.809, 0.717, 0.544, 0.492, 0.403, 0.344, 0.213, 0.116, 0.116, 0.092, 0.070, 0.059, 0.048, 0.036, 0.029, 0.021, 0.014, 0.011, 0.008, and 0.006. However, the observations resulting from the second algorithm, named P3, are 0.853, 0.759, 0.874, 0.800, 0.716, 0.557, 0.503, 0.399, 0.334, 0.207, 0.118, 0.118, 0.097, 0.078, 0.067, 0.056, 0.044, 0.036, 0.026, 0.019, 0.014, and 0.010. Some descriptive measures of these data sets are presented in Table 6.
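A minimal sketch of the fitting procedure for the SC16 data is shown below; it estimates the parameter by maximum likelihood with the assumed dlogxl density from Section 2 and then applies the KS test through the assumed plogxl CDF. The data vector is copied from the text above.

sc16 <- c(0.853, 0.759, 0.866, 0.809, 0.717, 0.544, 0.492, 0.403, 0.344, 0.213,
          0.116, 0.116, 0.092, 0.070, 0.059, 0.048, 0.036, 0.029, 0.021, 0.014,
          0.011, 0.008, 0.006)
fit <- optimize(function(t) -sum(log(dlogxl(sc16, t))), interval = c(0.01, 50))
theta_hat <- fit$minimum
ks.test(sc16, function(q) plogxl(q, theta_hat))   # KS statistic and p value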

Regarding Table 6, both data sets are asymmetric (positively skewed) and platykurtic. Moreover, the data sets are under-dispersed, since the variance is smaller than the mean. Information about the failure rate can be helpful in selecting an appropriate model, and a graphical tool known as the total time on test (TTT) plot [22] can be used for this purpose. If the TTT plot follows the straight diagonal, the hazard is constant; the TTT plot is convex for a decreasing hazard and concave for an increasing hazard, and a bathtub-shaped hazard is indicated when the curve is first convex and then concave. The TTT plots for data sets I and II are shown in Figure 3; the hazard curves of both data sets are bathtub-shaped. Figure 4 shows the boxplots for data sets I and II, respectively. Therefore, the Log-XLindley distribution can be a good choice to model these data sets.
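The empirical scaled TTT transform used in Figure 3 can be computed as follows; the sketch reuses the sc16 vector defined above.

ttt_plot <- function(x) {
  xs <- sort(x); n <- length(xs)
  Ti <- (cumsum(xs) + (n - seq_len(n)) * xs) / sum(xs)   # scaled TTT transform
  plot(seq_len(n) / n, Ti, type = "b", xlab = "i/n", ylab = "T(i/n)")
  abline(0, 1, lty = 2)   # the diagonal corresponds to a constant hazard
}
ttt_plot(sc16)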

The MLEs of the considered models along with their standard errors are given in Tables 7 and 8 for data sets I and II, respectively, with goodness-of-fit measures.

Regarding Tables 7 and 8, the proposed model is the best among all tested distributions. Figures 5 and 6 support our empirical results, which have been listed in Tables 7 and 8. The profile log-likelihood plots for both data sets are presented in Figure 7.

Since one of the major aims of this paper is to get the best estimators for the data sets I and II, several estimation techniques have been applied for this purpose. Tables 9 and 10 list the various estimators for data sets I and II based on different estimation approaches.

It is noted that all methods work quite well for analyzing the SC16 and P3 algorithm data, but the LSE and CVME are the best techniques for the SC16 data, whereas the LSE is the best for the P3 data.

8. Conclusion

In this paper, a flexible one-parameter Log-XLindley distribution has been proposed to analyze proportion and asymmetric data. Some distributional properties have been derived in explicit forms. It was found that the hazard rate function can be applied to model different types of failures, including increasing, bathtub, and J-shaped. The model parameter has been estimated using various estimation approaches in order to identify the best estimator for the data. A Monte-Carlo simulation study for different sample sizes has been performed to assess the performance of the estimators based on several statistical criteria. Finally, two distinctive data sets from the SC16 and P3 algorithms have been analyzed to illustrate the flexibility of the new model, and the proposed distribution showed remarkable superiority over the competitive models.

Data Availability

Data are included in the manuscript.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Supplementary Materials

The R code used in the study can be accessed from the attached file. (Supplementary Materials)