Abstract

Several standard distributions can be used to model lifetime data. Nevertheless, many datasets from diverse fields such as engineering, finance, the environment, and the biological sciences do not fit the standard distributions well. As a result, there is a need to develop new distributions that accommodate a high degree of skewness and kurtosis while improving the goodness-of-fit to empirical distributions. In this study, by applying the T-X method, we propose a new flexible generated family, the Ramos-Louzada Generator (RL-G), and derive some of its relevant statistical properties, such as the quantile function, raw moments, incomplete moments, measures of inequality, entropy, mean and median deviations, and the reliability parameter. The RL-G family can model right-skewed, left-skewed, and symmetric data as well as different shapes of the hazard function. The maximum likelihood estimation (MLE) method is used to estimate the parameters of the RL-G. The asymptotic performance of the MLE is assessed by simulation analysis. Finally, the flexibility of the RL-G family is demonstrated through applications to three real complete datasets on rainfall, the breaking stress of carbon fibers, and the survival times of hypertension patients; in these applications, the RL-Weibull, which is a special case of the RL-G family, outperformed its submodels and other distributions.

1. Introduction

Choosing an appropriate statistical distribution for modeling and analyzing data is critical in order to draw more accurate conclusions. Many statistical distributions have been proposed to match different data forms over the years. Using conventional distributions for fitting these datasets may produce erroneous findings. As a result, there is a clear need for modifications to the standard distributions. The literature on probability distribution methods contains various extensions and generalizations of continuous, discrete, symmetric, and asymmetric distributions. Regarding the main methods of generating probability distributions and classes of probability distributions, Lee et al. [1] stated that the transformation technique, differential equation technique, and quantile method are three groups of methods developed prior to 1980, and those proposed after 1980 may be categorized as combination methods because these techniques attempt to develop new distributions through the combination of existing ones or by adding additional parameters to an existing distribution. Several studies have proposed using different generated classes to increase the number of parameters in distributions. The resulting distributions have found application in modeling data across various fields of study, such as environmental sciences, economics, and engineering. Some popular generators available in the literature include the exponentiated generated family by [2], the Marshall-Olkin-G by [3], the Kumaraswamy-G by [4], Beta-G by [5], Weibull-X by [6], Weibull-G by [7], the Lomax generator proposed by [8], the Topp-Leone generated family introduced by [9], the Lindley generator by [10], the Chen-G class by [11], the Burr III Topp-Leone-G by [12], the odd Burr-III family by [13], Marshall-Olkin Burr X family by [14], the Topp-Leone odd Lindley-G family by [15], and many others.

In [16], the Ramos-Louzada (RL) distribution, a one-parameter continuous distribution, was introduced for modeling lifetime data. The study demonstrated that the RL distribution performs better than some well-known lifetime distributions such as the Lindley and exponential distributions. However, the RL distribution is limited to right-skewed lifetime data with an increasing failure rate. Therefore, it is essential to propose an extension or generalization of the RL distribution that introduces flexibility in modeling lifetime data with symmetric and asymmetric shapes and with monotonic and non-monotonic failure rate functions. The generalized Ramos-Louzada (GRL) distribution produced by [17] is the first extension of the RL distribution. In [18], the discrete RL distribution was developed and proposed. This study adopts the T-X method introduced by [19] to develop the Ramos-Louzada Generator (RL-G), which is capable of producing new distributions that are extensions or generalizations of the RL distribution. Therefore, for any continuous random variable $X$, by applying the T-X method defined in (1), the cumulative distribution function (CDF) of $X$ can be expressed using the RL-G:

$$F(x;\xi) = \int_{a}^{W[G(x;\xi)]} r(t)\,dt, \tag{1}$$

where $\xi$ is a parameter vector, $r(t)$ is the PDF of the generator random variable $T$, $W[G(x;\xi)]$ is an expression that depends on the CDF $G(x;\xi)$ of the random variable $X$, and $a$ is a real number.

The remaining part of the study is structured in the following manner: In Section 2, the CDF, PDF, and hazard rate function of the RL-G family of distributions are presented. Section 3 presents the mixture representation of the RL-G density functions. In Section 4, we have derived the statistical properties of the RL-G family. Parameter estimation for the proposed family of distributions is discussed in Section 5. Some special distributions of the RL-G family are discussed in Section 6. Section 7 presents Monte Carlo simulation analysis on the asymptotic performance of the MLE. Applications of the proposed distribution to three real datasets to demonstrate its flexibility and usefulness are captured in Section 8, while Section 9 presents the conclusion of the study.

2. The Ramos-Louzada Generated Family of Distributions

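Equations (2) and (3) give, respectively, the CDF and PDF of the RL distribution with parameter $\lambda \ge 2$, for $t > 0$:

$$R(t;\lambda) = 1 - \frac{\lambda^{2}-\lambda+t}{\lambda(\lambda-1)}\,e^{-t/\lambda}, \tag{2}$$

$$r(t;\lambda) = \frac{1}{\lambda^{2}(\lambda-1)}\left(\lambda^{2}-2\lambda+t\right)e^{-t/\lambda}. \tag{3}$$

Let $G(x;\xi)$ represent the CDF of the baseline distribution and $\xi$ be a vector of parameters associated with the CDF. The proposed RL-G densities are obtained by using the T-X approach in (1) and letting $W[G(x;\xi)] = -\log[1-G(x;\xi)]$, thus

$$F(x;\lambda,\xi) = \int_{0}^{-\log[1-G(x;\xi)]} r(t;\lambda)\,dt.$$

Substituting (2) and (3) into the above relation, we obtain the following:

$$F(x;\lambda,\xi) = \int_{0}^{-\log[1-G(x;\xi)]} \frac{1}{\lambda^{2}(\lambda-1)}\left(\lambda^{2}-2\lambda+t\right)e^{-t/\lambda}\,dt = R\left(-\log[1-G(x;\xi)];\lambda\right).$$

Hence, the CDF of the proposed RL-G is expressed as

$$F(x;\lambda,\xi) = 1 - \frac{\lambda^{2}-\lambda-\log[1-G(x;\xi)]}{\lambda(\lambda-1)}\,[1-G(x;\xi)]^{1/\lambda}. \tag{7}$$

The proposed RL-G family PDF is derived by finding the derivative of (7), thus

$$f(x;\lambda,\xi) = \frac{g(x;\xi)}{\lambda^{2}(\lambda-1)}\left\{\lambda^{2}-2\lambda-\log[1-G(x;\xi)]\right\}[1-G(x;\xi)]^{1/\lambda-1}, \tag{8}$$

where $g(x;\xi) = dG(x;\xi)/dx$ is the baseline PDF. From (7) and (8), the survival and hazard functions are, respectively, obtained as

$$S(x;\lambda,\xi) = \frac{\lambda^{2}-\lambda-\log[1-G(x;\xi)]}{\lambda(\lambda-1)}\,[1-G(x;\xi)]^{1/\lambda},$$

$$h(x;\lambda,\xi) = \frac{g(x;\xi)\left\{\lambda^{2}-2\lambda-\log[1-G(x;\xi)]\right\}}{\lambda\left\{\lambda^{2}-\lambda-\log[1-G(x;\xi)]\right\}[1-G(x;\xi)]}.$$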

Some basic motivations for using the RL-G densities are as follows:
(i) The properties of the baseline densities are enhanced.
(ii) An extended form of the baseline model is generated with the introduction of extra parameter(s).
(iii) The kurtosis of the resulting distributions is more flexible compared to the baseline model.
(iv) Special models with various forms of the hazard rate function are defined.

Proposition 1. The RL-G family density is a valid PDF; it suffices to show that (i) $f(x;\lambda,\xi) \ge 0$ for all $x$ and (ii) $\int_{-\infty}^{\infty} f(x;\lambda,\xi)\,dx = 1$.

The proof of this proposition is shown in the appendix.
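The CDF, PDF, and hazard function of this section can be evaluated numerically for any baseline. The following Python sketch is only an illustration under stated assumptions: it takes the T-X link to be $-\log[1-G(x)]$, describes the baseline through its cumulative hazard $-\log[1-G(x)]$ and hazard $g(x)/[1-G(x)]$ (which avoids computing $1-G(x)$ in floating point), and uses a hypothetical Weibull baseline $G(x) = 1 - \exp(-a x^{b})$. The function names and parameter values are illustrative and are not taken from the study.

```python
import math

def rl_cdf(t, lam):
    """CDF of the one-parameter RL distribution (lam >= 2, t >= 0)."""
    return 1.0 - (lam**2 - lam + t) / (lam * (lam - 1.0)) * math.exp(-t / lam)

def rl_pdf(t, lam):
    """PDF of the one-parameter RL distribution."""
    return (lam**2 - 2.0 * lam + t) * math.exp(-t / lam) / (lam**2 * (lam - 1.0))

def rlg_cdf(x, lam, cum_hazard):
    """RL-G CDF: the RL CDF evaluated at z = -log(1 - G(x))."""
    return rl_cdf(cum_hazard(x), lam)

def rlg_pdf(x, lam, cum_hazard, hazard):
    """RL-G PDF: RL density at z multiplied by dz/dx = g(x) / (1 - G(x))."""
    return rl_pdf(cum_hazard(x), lam) * hazard(x)

def rlg_hazard(x, lam, cum_hazard, hazard):
    """RL-G hazard rate f(x) / (1 - F(x))."""
    return rlg_pdf(x, lam, cum_hazard, hazard) / (1.0 - rlg_cdf(x, lam, cum_hazard))

# Hypothetical Weibull baseline G(x) = 1 - exp(-a * x**b) and parameter values.
a, b, lam = 0.5, 1.5, 2.5

def weibull_cum_hazard(x):
    return a * x**b

def weibull_hazard(x):
    return a * b * x**(b - 1.0)

for x in (0.5, 1.0, 2.0):
    print(x,
          rlg_cdf(x, lam, weibull_cum_hazard),
          rlg_pdf(x, lam, weibull_cum_hazard, weibull_hazard),
          rlg_hazard(x, lam, weibull_cum_hazard, weibull_hazard))
```

A quick sanity check in the spirit of Proposition 1 is that the CDF starts at 0, approaches 1 for large `x`, and has a numerical derivative that agrees with the PDF.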

3. RL-G Family in Mixture Representation

This form of representation plays a very important role in deriving some statistical properties of the RL-G densities. Using the following generalized binomial series and the power series expansions on (8) where , is a real noninteger

The pdf of the RL-G family, that is (8), now becomes

Multiplying the first term of the last equation by , the second term by and rearranging, thus where

Thus, (13) represents an infinite linear combination of exp-G densities of the baseline density. The linear representation form of the RL-G facilitates the derivation of other statistical properties of the RL-G density. Integrating (13) with respect to produces the corresponding linear representation form of the CDF of the RL-G family. where , , and are power parameters of the exponentiated-G distributions and , respectively.

4. Some Relevant Statistical Properties of the RL-G Family

In this section, we have derived some relevant statistical properties of the RL-G family. These include the quantile function, the raw (noncentral) moments, measures of inequality, the entropy measure, mean and median deviations, and the reliability parameter.

4.1. The Quantile Function

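By definition, the quantile function of the RL-G family is $Q(u) = F^{-1}(u)$, where $u \in (0,1)$. Now from (7), we have $u = 1 - \frac{\lambda^{2}-\lambda-\log[1-G(x;\xi)]}{\lambda(\lambda-1)}[1-G(x;\xi)]^{1/\lambda}$; if we let $z = -\log[1-G(x;\xi)]$, solving for $z$ gives $z = -\lambda(\lambda-1) - \lambda W_{-1}\left((\lambda-1)(u-1)e^{1-\lambda}\right)$, where the negative branch of the Lambert function is denoted by $W_{-1}(\cdot)$; see [20].

But $G(x;\xi) = 1 - e^{-z}$; and hence, the RL-G family quantile function is represented as

$$Q(u) = G^{-1}\left(1 - \exp\left\{\lambda(\lambda-1) + \lambda W_{-1}\left((\lambda-1)(u-1)e^{1-\lambda}\right)\right\}\right),$$

where the baseline distribution has its inverse denoted as $G^{-1}(\cdot)$.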
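The quantile function can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes the link $z = -\log[1-G(x)]$, so that $z = -\lambda(\lambda-1) - \lambda W_{-1}\left((\lambda-1)(u-1)e^{1-\lambda}\right)$, and it computes the branch $W_{-1}$ by plain bisection because the Python standard library has no Lambert function. The Weibull baseline and all names and parameter values are hypothetical.

```python
import math

def lambert_w_minus1(a, iterations=200):
    """Real branch W_{-1}: the solution y <= -1 of y * exp(y) = a, for -1/e <= a < 0."""
    if a >= 0.0:
        raise ValueError("W_{-1} is real only for -1/e <= a < 0")
    lo, hi = -2.0, -1.0
    while lo * math.exp(lo) < a:      # y * exp(y) rises towards 0 as y -> -infinity
        lo *= 2.0
    for _ in range(iterations):       # y * exp(y) decreases as y increases on [lo, hi]
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) > a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rlg_quantile(u, lam, baseline_inv_cdf):
    """Quantile of the RL-G family for 0 < u < 1, given the baseline inverse CDF."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    arg = (lam - 1.0) * (u - 1.0) * math.exp(1.0 - lam)
    z = max(0.0, -lam * (lam - 1.0) - lam * lambert_w_minus1(arg))
    return baseline_inv_cdf(-math.expm1(-z))   # G^{-1}(1 - exp(-z))

# Hypothetical Weibull baseline G(x) = 1 - exp(-a * x**b) and its inverse.
a, b, lam = 0.5, 1.5, 2.5

def weibull_inv_cdf(p):
    return (-math.log1p(-p) / a) ** (1.0 / b)

def rlw_cdf(x):
    """RL-G CDF with the Weibull baseline, used only to check the round trip."""
    z = a * x**b
    return 1.0 - (lam**2 - lam + z) / (lam * (lam - 1.0)) * math.exp(-z / lam)

for u in (0.1, 0.5, 0.9):
    q = rlg_quantile(u, lam, weibull_inv_cdf)
    print(u, q, rlw_cdf(q))   # the last column should reproduce u
```

Bisection is slower than a dedicated Lambert routine but has no tuning parameters; a library implementation of $W_{-1}$ could be substituted where available.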

4.2. Moments

In statistical analysis, the kurtosis, mean, skewness, and variance are measures that can be computed using the noncentral moments of a distribution.
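As a numerical complement, raw moments and the measures built from them can be approximated by direct integration of $x^{r} f(x)$. The sketch below is an illustration under assumptions: it uses an RL-Weibull density with a hypothetical baseline $G(x) = 1 - \exp(-a x^{b})$, $b \ge 1$, the link $-\log[1-G(x)]$, and composite Simpson integration on a truncated range. For $a = b = 1$ the density reduces to the RL density, whose mean works out to $\lambda^{2}/(\lambda-1)$, which gives a simple check.

```python
import math

def rlw_pdf(x, lam, a, b):
    """Assumed RL-Weibull density with baseline G(x) = 1 - exp(-a * x**b), b >= 1."""
    z = a * x**b
    return (a * b * x**(b - 1.0) * (lam**2 - 2.0 * lam + z)
            * math.exp(-z / lam) / (lam**2 * (lam - 1.0)))

def raw_moment(r, lam, a, b, upper=200.0, n=20000):
    """Composite Simpson approximation of the integral of x**r * f(x) over [0, upper].

    n must be even; upper must be large enough for the tail to be negligible.
    """
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        x = i * h
        weight = 1.0 if i in (0, n) else (4.0 if i % 2 else 2.0)
        total += weight * x**r * rlw_pdf(x, lam, a, b)
    return total * h / 3.0

def moment_summary(lam, a, b):
    """Mean, variance, skewness, and kurtosis from the first four raw moments."""
    m1, m2, m3, m4 = (raw_moment(r, lam, a, b) for r in (1, 2, 3, 4))
    var = m2 - m1**2
    skew = (m3 - 3.0 * m1 * m2 + 2.0 * m1**3) / var**1.5
    kurt = (m4 - 4.0 * m1 * m3 + 6.0 * m1**2 * m2 - 3.0 * m1**4) / var**2
    return m1, var, skew, kurt

# RL special case (a = b = 1): the mean should be close to lam**2 / (lam - 1).
print(raw_moment(1, 2.0, 1.0, 1.0), 2.0**2 / (2.0 - 1.0))
# Hypothetical RL-Weibull parameter values.
print(moment_summary(2.5, 0.5, 1.5))
```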

If $X$ is an RL-G random variable, then its $r$th moment is defined as follows: $\mu'_{r} = E(X^{r}) = \int_{-\infty}^{\infty} x^{r} f(x;\lambda,\xi)\,dx$.

Substituting (13) into the above definition and simplifying, the th noncentral moments can be expressed as which can be simplified as where and :

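Alternatively, (18) can be expressed in terms of the baseline quantile function. Suppose $G(x;\xi) = u$ in (18); then $x = Q_{G}(u)$ and $g(x;\xi)\,dx = du$.

As $x \to -\infty$, $u \to 0$, and as $x \to \infty$, $u \to 1$.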

From (18), we have the th noncentral moments expressed as

4.3. Incomplete Moment

This statistical property plays an essential role in the computation of the mean and median deviations, inequality and entropy measures, and residual life of a random variable.

The $r$th incomplete moment of the RL-G family is defined as $\varphi_{r}(t) = \int_{-\infty}^{t} x^{r} f(x;\lambda,\xi)\,dx$.

Setting (13) into the definition, we obtain the following: where , , and are defined in (18).

Alternatively, the th incomplete moment is expressed in terms of the baseline quantile function. Supposed in (22), then , . As and as . From (22), we have the th incomplete moments expressed as where , , and as defined before.

4.4. Measures of Inequality

The Bonferroni and Lorenz curves are two of the most commonly used measures of inequality that are applied in various fields such as insurance, demography, reliability engineering, and economics.

By definition, the Lorenz curve of the RL-G family is defined by , where is the mean and is the first incomplete moment of the RL-G family obtained by setting into the incomplete moment’s expression, that is;

Substituting into the definition for produces; where and . By definition, the Bonferroni curve of the RL-G family is;

4.5. Mean and Median Deviations

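The mean deviation about the mean, denoted by $\delta_{1}$, for an RL-G random variable $X$ is defined by $\delta_{1} = \int_{-\infty}^{\infty} |x-\mu|\,f(x;\lambda,\xi)\,dx = 2\mu F(\mu) - 2\varphi_{1}(\mu)$, where $\mu$ is the mean and $\varphi_{1}(\mu)$ is the first incomplete moment evaluated at $\mu$.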

Hence, the mean deviation is where , , and as defined before.

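The median deviation about the median, denoted by $\delta_{2}$, for an RL-G random variable $X$ with median $M$ is defined by $\delta_{2} = \int_{-\infty}^{\infty} |x-M|\,f(x;\lambda,\xi)\,dx = \mu - 2\varphi_{1}(M)$.

Hence, the median deviation about the median is expressed as where and , , and as defined before.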

4.6. Entropy Measure

The randomness in the RL-G random variable is measured by using the following measures of entropy: Renyi [21], Shannon [22], Havrda and Charvat [23], and Tsallis [24].

The RL-G family has the Renyi entropy denoted by and is defined by where .

Substituting the density of the RL-G into the above definition and applying the generalized binomial expansion, the following expression is obtained:

Applying the following log power series expansion in the last expression where the constants are obtained recursively by using the following relation:

And after simplifying, we obtain the Renyi entropy as;

The RL-G family Shannon entropy is defined by;

By setting (13) into the above definition, is obtained as; where , , , and are defined in (13).

The Havrda and Charvat entropy for the RL-G family is represented by; where and , and the expression in the integral is similar to the one used in Renyi entropy. Thus, the Havrda and Charvat entropy for the RL-G family can be expressed:

The Tsallis’s generalized entropy for RL-G random variable is obtained by using the following formula: where and , from which we obtain

4.7. Reliability Parameter

Let and ; and are strength and stress random variables. The stress-strength reliability parameter of RL-G family of distribution is defined by

The simplified result from the last expression is

Evaluating each of the integrals and using the generalized binomial series expansion, the result of [25] for a power series raised to a positive integer ; where and for any integer

The reliability parameter after simplification is expressed as

5. Maximum Likelihood Estimation of the RL-G Family

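Suppose $x_{1}, x_{2}, \ldots, x_{n}$ is a random sample of size $n$ from the RL-G family; then the log-likelihood function for the parameter vector $\Theta = (\lambda, \xi)^{T}$ is given by

$$\ell(\Theta) = -2n\log\lambda - n\log(\lambda-1) + \sum_{i=1}^{n}\log g(x_{i};\xi) + \left(\frac{1}{\lambda}-1\right)\sum_{i=1}^{n}\log[1-G(x_{i};\xi)] + \sum_{i=1}^{n}\log\left\{\lambda^{2}-2\lambda-\log[1-G(x_{i};\xi)]\right\}.$$

Taking derivatives with respect to $\lambda$ and the components $\xi_{k}$ of $\xi$ gives

$$\frac{\partial \ell}{\partial \lambda} = -\frac{2n}{\lambda} - \frac{n}{\lambda-1} - \frac{1}{\lambda^{2}}\sum_{i=1}^{n}\log[1-G(x_{i};\xi)] + \sum_{i=1}^{n}\frac{2\lambda-2}{\lambda^{2}-2\lambda-\log[1-G(x_{i};\xi)]},$$

$$\frac{\partial \ell}{\partial \xi_{k}} = \sum_{i=1}^{n}\frac{\partial g(x_{i};\xi)/\partial \xi_{k}}{g(x_{i};\xi)} + \left(1-\frac{1}{\lambda}\right)\sum_{i=1}^{n}\frac{\partial G(x_{i};\xi)/\partial \xi_{k}}{1-G(x_{i};\xi)} + \sum_{i=1}^{n}\frac{\partial G(x_{i};\xi)/\partial \xi_{k}}{[1-G(x_{i};\xi)]\left\{\lambda^{2}-2\lambda-\log[1-G(x_{i};\xi)]\right\}}.$$

Setting the above equations to zero and solving them simultaneously by numerical techniques yields the maximum likelihood estimates.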
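For a concrete baseline, the objective that the numerical optimiser works on can be written down directly. The Python sketch below evaluates the log-likelihood for an RL-Weibull model; it assumes the link $-\log[1-G(x)]$ and a hypothetical Weibull baseline $G(x) = 1 - \exp(-a x^{b})$, and the parameter values at which it is evaluated are arbitrary, not estimates from the study. The five observations are the first five rainfall values listed in Section 8.1.

```python
import math

def rlw_loglik(params, data):
    """Log-likelihood of the assumed RL-Weibull model at params = (lam, a, b)."""
    lam, a, b = params
    if lam < 2.0 or a <= 0.0 or b <= 0.0:
        return float("-inf")            # outside the assumed parameter space
    n = len(data)
    ll = -2.0 * n * math.log(lam) - n * math.log(lam - 1.0)
    for x in data:
        z = a * x**b                                        # z = -log(1 - G(x))
        ll += math.log(a * b) + (b - 1.0) * math.log(x) - z  # log g(x)
        ll += (1.0 - 1.0 / lam) * z                          # (1/lam - 1) * log(1 - G(x))
        ll += math.log(lam**2 - 2.0 * lam + z)
    return ll

data = [12.469, 7.079, 11.929, 11.370, 12.906]   # first five rainfall values
print(rlw_loglik((2.5, 0.01, 2.0), data))        # arbitrary trial parameter values
```

A general-purpose optimiser can then maximise this function over the three parameters (the study reports using the optim function in R for this step), for instance by minimising its negative from several starting values.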

6. Special Distributions of the RL-G Family

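In this section, two special members of the RL-G family, the Ramos-Louzada Weibull (RLW) distribution and the Ramos-Louzada Kumaraswamy (RLKum) distribution, are derived, and the flexibility of these distributions is illustrated by displaying plots of their hazard rate and density functions at some parameter values. Simulation analysis and applications of the RLW distribution to real datasets are presented in later sections.

(1) Assume that the baseline distribution is the Weibull, whose CDF and PDF are, respectively, given by $G(x;\alpha,\beta) = 1 - e^{-\alpha x^{\beta}}$ and $g(x;\alpha,\beta) = \alpha\beta x^{\beta-1}e^{-\alpha x^{\beta}}$, $x > 0$, $\alpha > 0$, $\beta > 0$, where $\alpha$ and $\beta$ are scale and shape parameters, respectively. The Ramos-Louzada Weibull (RLW) distribution is obtained by substituting $G(x;\alpha,\beta)$ and $g(x;\alpha,\beta)$ into equations (7) and (8). Thus, the CDF and PDF of the new RLW distribution are, respectively, obtained below:

$$F(x;\lambda,\alpha,\beta) = 1 - \frac{\lambda^{2}-\lambda+\alpha x^{\beta}}{\lambda(\lambda-1)}\,e^{-\alpha x^{\beta}/\lambda},$$

$$f(x;\lambda,\alpha,\beta) = \frac{\alpha\beta x^{\beta-1}}{\lambda^{2}(\lambda-1)}\left(\lambda^{2}-2\lambda+\alpha x^{\beta}\right)e^{-\alpha x^{\beta}/\lambda}.$$

The hazard rate function is expressed as

$$h(x;\lambda,\alpha,\beta) = \frac{\alpha\beta x^{\beta-1}\left(\lambda^{2}-2\lambda+\alpha x^{\beta}\right)}{\lambda\left(\lambda^{2}-\lambda+\alpha x^{\beta}\right)}.$$

Figures 1 and 2, respectively, display the plots of the PDF and hazard rate function of the RLW distribution with various selections of parameter values. From Figure 1, the RLW distribution can take several forms, such as “left-skewed,” almost “symmetric,” “reversed J-shapes,” and “right-skewed,” and plots of the hazard rate function in Figure 2 illustrate various forms, such as “increasing,” “decreasing,” “J-shape,” and “reversed J-shape.”

Submodels of the RLW distribution are as follows:
(i) When $\alpha = \beta = 1$, we have the RL distribution given in (2).
(ii) When $\alpha = 1$, the Generalized RL distribution proposed by [17] is obtained. The GRL density function is expressed as $f(x;\lambda,\beta) = \frac{\beta x^{\beta-1}}{\lambda^{2}(\lambda-1)}\left(\lambda^{2}-2\lambda+x^{\beta}\right)e^{-x^{\beta}/\lambda}$.
(iii) When $\beta = 1$, we obtain the Ramos-Louzada Exponential (RLE) distribution. Its density is defined by $f(x;\lambda,\alpha) = \frac{\alpha}{\lambda^{2}(\lambda-1)}\left(\lambda^{2}-2\lambda+\alpha x\right)e^{-\alpha x/\lambda}$.
(iv) When $\beta = 2$, we obtain the Ramos-Louzada Rayleigh (RLR) density defined by $f(x;\lambda,\alpha) = \frac{2\alpha x}{\lambda^{2}(\lambda-1)}\left(\lambda^{2}-2\lambda+\alpha x^{2}\right)e^{-\alpha x^{2}/\lambda}$.

(2) Suppose that the baseline is the Kumaraswamy distribution whose density is defined as $g(x;a,b) = a b x^{a-1}(1-x^{a})^{b-1}$, $0 < x < 1$, $a > 0$, $b > 0$, where $a$ and $b$ are, respectively, shape and scale parameters. Equations (53), (54), and (55), respectively, express the CDF, PDF, and failure rate function of the RLKum distribution:

$$F(x;\lambda,a,b) = 1 - \frac{\lambda^{2}-\lambda-b\log(1-x^{a})}{\lambda(\lambda-1)}\,(1-x^{a})^{b/\lambda}, \tag{53}$$

$$f(x;\lambda,a,b) = \frac{a b x^{a-1}}{\lambda^{2}(\lambda-1)}\left\{\lambda^{2}-2\lambda-b\log(1-x^{a})\right\}(1-x^{a})^{b/\lambda-1}, \tag{54}$$

$$h(x;\lambda,a,b) = \frac{a b x^{a-1}\left\{\lambda^{2}-2\lambda-b\log(1-x^{a})\right\}}{\lambda\left\{\lambda^{2}-\lambda-b\log(1-x^{a})\right\}(1-x^{a})}. \tag{55}$$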

Figures 3 and 4, respectively, display plots of the RLKum PDF and hazard rate function at various selections of parameter values. From Figure 3, the RLKum distribution can take various forms, such as a “reversed J-shape,” a “left-skewed” distribution, or a “J-shape.” The hazard rate function plots in Figure 4 illustrate various shapes such as “decreasing,” “increasing,” “J-shape,” “bathtub,” and “inverted bathtub.” Thus, the RLKum distribution is capable of modeling data with “non-monotonic” and “monotonic” hazard rate functions.

7. Monte Carlo Simulation

In this section, simulation analysis with sample sizes, , was performed to evaluate the properties of the ML estimators for the RLW distribution parameters by examining the average estimates (AV), the average bias (AB), and the root mean square error (RMSE) of the estimated parameters. The analysis was repeated for times, with initial parameter values: (I) , , and ; and (II) , , and . Random numbers are generated by solving the CDF of the RLW numerically with the uniroot function in R, and the estimates are obtained with the optim function in the same software. The AB, RMSE, and AV were estimated using the following expressions: , , and , where .
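The sampling step of such a simulation can be sketched in Python as follows. This is an illustration under assumptions rather than the study's code: bisection stands in for R's uniroot, the RLW CDF uses a hypothetical Weibull baseline $G(x) = 1 - \exp(-a x^{b})$ with the link $-\log[1-G(x)]$, the bias is taken as the signed average of estimate minus true value, and the parameter values and the list of "estimates" are made up for the example.

```python
import math
import random

def rlw_cdf(x, lam, a, b):
    """Assumed RL-Weibull CDF with baseline G(x) = 1 - exp(-a * x**b)."""
    z = a * x**b
    return 1.0 - (lam**2 - lam + z) / (lam * (lam - 1.0)) * math.exp(-z / lam)

def rlw_sample(lam, a, b, rng):
    """Draw one variate by solving F(x) = u for x with bisection."""
    u = rng.random()
    lo, hi = 0.0, 1.0
    while rlw_cdf(hi, lam, a, b) < u:   # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if rlw_cdf(mid, lam, a, b) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def summarize(estimates, true_value):
    """Average estimate, average (signed) bias, and root mean square error."""
    n = len(estimates)
    av = sum(estimates) / n
    ab = sum(e - true_value for e in estimates) / n
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return av, ab, rmse

rng = random.Random(2024)                               # fixed seed for reproducibility
sample = [rlw_sample(2.5, 0.5, 1.5, rng) for _ in range(5)]
print(sample)
print(summarize([2.4, 2.7, 2.5, 2.6], true_value=2.5))  # made-up estimates of lam
```

In a full study, each replicate would draw a sample of the chosen size, maximise the log-likelihood, and store the estimates; `summarize` would then be applied to the stored estimates of each parameter.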

Table 1 displays the simulated results of AB, RMSE, and AV for the parameter values of the RLW distribution. It can be observed that, in all cases, the AB and RMSE decrease to zero with increasing sample size. Furthermore, the AV of the estimators is quite close to the actual values. Hence, the maximum likelihood estimation and their asymptotic results perform well in estimating the RLW parameters. Similarly, alternative parameter choices can yield similar results.

8. Application

The application of the RLW distribution to three datasets is demonstrated in this section. The RLW distribution, its nested models, and some other competing distributions are compared using goodness-of-fit statistics, namely the Cramer-von Mises (CVM), Anderson-Darling (AD), and Kolmogorov-Smirnov (KS) statistics, and model selection criteria, namely the Bayesian information criterion (BIC), the consistent Akaike information criterion (CAIC), and the Akaike information criterion (AIC). In the first two applications, the RLW distribution was compared with its submodels and with the Nakagami (NAK) by [26], inverse Weibull (INW) by [27], Nadarajah and Haghighi (NH) by [28], and modified extended Chen (MEC) by [29] distributions. In the third application, the following nonnested models were used: Marshall-Olkin exponential (MOEx) by [3], generalized exponential (GE) by [30], and generalized inverse Weibull (GIW) by [31].
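Some of these comparison measures are simple to compute once a model has been fitted. The Python sketch below shows the AIC, the BIC, and the KS statistic in their textbook forms; the CAIC is left out because its definition varies between papers and the form used in this study is not stated here. The example fits an exponential model (a stand-in chosen only because its MLE has a closed form) to the first ten carbon fiber values from Section 8.3; all function names are illustrative.

```python
import math

def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik

def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d

# Stand-in model: exponential with rate estimated by n / sum(x).
data = [3.70, 2.74, 2.73, 2.5, 3.6, 3.11, 3.27, 2.87, 1.47, 3.11]
n = len(data)
rate = n / sum(data)
loglik = n * math.log(rate) - rate * sum(data)

def exp_cdf(x):
    return 1.0 - math.exp(-rate * x)

print(aic(loglik, 1), bic(loglik, 1, n), ks_statistic(data, exp_cdf))
```

Smaller values of all three measures indicate a better fit, which is how the tables in this section are read.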

The CDF of the nonnested models are given below: (i)Generalized Inverse Weibull: (ii)Inverse Weibull: (iii)Nadarajah and Haghighi distribution: (iv)Nakagami distribution: (v)Modified extended Chen distribution: (vi)Generalized exponential: (vii)Marshall-Olkin exponential:

8.1. Dataset 1: Rainfall Data

The data are the maximum annual average monthly rainfall (in inches) recorded in Ghana’s Ashanti region between 1989 and 2019. The dataset can be found in [32]. The dataset contains the following:

12.469, 7.079, 11.929, 11.370, 12.906, 8, 7.394, 7.063, 12.213, 9.654, 8.327, 7.228, 10.689, 10.413, 10.039, 8.984, 10.508, 7.614, 12.165, 11.201, 8.988, 8.594, 10.961, 8.350, 9.882, 11.720, 10.272, 9.311, 8.854, 9.819, and 11.863. A graphical representation of the hazard behavior of the dataset is displayed in Figure 5. The total time on test (TTT) plot indicates an increasing hazard rate.
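The TTT diagnostic can be reproduced from the listed values. The Python sketch below computes the scaled TTT transform in its usual form, $T(i/n) = \left[\sum_{j \le i} x_{(j)} + (n-i)\,x_{(i)}\right] / \sum_{j} x_{j}$, for the rainfall data; a curve lying above the diagonal and concave is conventionally read as an increasing hazard rate. The function name is illustrative.

```python
def scaled_ttt(data):
    """Return the points (i/n, T(i/n)) of the scaled total time on test transform."""
    xs = sorted(data)
    n = len(xs)
    total = sum(xs)
    points = []
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x                                   # sum of the i smallest values
        points.append((i / n, (running + (n - i) * x) / total))
    return points

rainfall = [12.469, 7.079, 11.929, 11.370, 12.906, 8, 7.394, 7.063, 12.213,
            9.654, 8.327, 7.228, 10.689, 10.413, 10.039, 8.984, 10.508, 7.614,
            12.165, 11.201, 8.988, 8.594, 10.961, 8.350, 9.882, 11.720, 10.272,
            9.311, 8.854, 9.819, 11.863]

for p, t in scaled_ttt(rainfall):
    print(round(p, 3), round(t, 3))
```

Plotting the second column against the first, together with the diagonal, gives the TTT plot.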

Table 2 shows the ML estimates, standard errors, and p-values of the parameters of the fitted distributions for the rainfall data. Two parameters of the RLW are statistically significant at the 5% significance level, except for the INW. The GRL, RL, NAK, and NH have all their estimated parameters statistically significant at the 5% significance level.

From Table 3, the RLW provides a better fit to the rainfall data than its submodels and the nonnested models because it has the largest log-likelihood value and the smallest CVM, AD, AIC, CAIC, and BIC. A close competitor to the RLW is the NAK.

The likelihood ratio test (LRT) results in Table 4 show that significant differences exist between the RLW and its submodels, since each LRT statistic is greater than the corresponding critical value at the 5% level of significance.
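The LRT comparison can be sketched in Python as follows. The statistic is twice the difference between the maximised log-likelihoods of the full and the nested model, referred to a chi-square distribution whose degrees of freedom equal the number of restricted parameters. The sketch covers one and two degrees of freedom, where the chi-square tail probability has a closed form; the log-likelihood values in the example are hypothetical and are not taken from the tables.

```python
import math

def lrt(loglik_full, loglik_reduced, df):
    """Likelihood ratio statistic and chi-square p-value for df = 1 or df = 2."""
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))   # P(chi-square with 1 df > stat)
    elif df == 2:
        p_value = math.exp(-stat / 2.0)              # P(chi-square with 2 df > stat)
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return stat, p_value

# Hypothetical maximised log-likelihoods of a full model and two nested models.
print(lrt(-60.0, -63.5, df=1))
print(lrt(-60.0, -66.0, df=2))
```

At the 5% level, the null (nested) model is rejected when the statistic exceeds 3.841 for one degree of freedom or 5.991 for two, which is equivalent to a p-value below 0.05.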

The graphs of the fitted PDFs versus the histogram of the data are displayed in Figure 6, and the fitted CDFs versus the empirical data are displayed in Figure 7. It is noted that the plots of the densities of the RLW depict the empirical density and CDF of the maximum annual rainfall data more closely than the other models.

8.2. Dataset 2: Hypertension Data

This dataset shows the survival periods in years before the development of hypertension for 119 patients randomly selected from the Bolgatanga Regional Hospital in Ghana’s Upper East region. The dataset is in [33], and it has the following items:

71, 5, 39, 62, 52, 71, 38, 56, 35, 69, 34, 71, 66, 70, 52, 37, 35, 71, 73, 19, 74, 74, 75, 51, 76, 49, 19, 76, 78, 76, 76, 49, 47, 48, 48, 46, 46, 46, 41, 40, 43, 45, 47, 47, 44, 45, 46, 42, 43, 42, 20, 28, 26, 60, 27, 24, 29, 60, 25, 60, 69, 36, 69, 69, 68, 68, 67, 67, 67, 52, 35, 66, 55, 66, 61, 61, 64, 64, 65, 65, 63, 63, 62, 39, 62, 62, 62, 59, 59, 59, 58, 58, 58, 18, 57, 57, 56, 56, 37, 53, 53, 53, 53, 54, 54, 66, 17, 50, 75, 51, 38, 52, 66, 4, 52, 55, 19, 58, and 73. Figure 8 shows a graphical representation of the hazard behavior of the dataset. Because the TTT plot indicates an increasing hazard rate, the RLW distribution is a suitable candidate for these data.

The parameter estimates for the fitted models for the hypertension data are shown in Table 5, along with their standard errors and p-values. The estimated parameters for the GRL, RL, and NH models are all statistically significant at the 5% level of significance. Two of the estimated parameters of the RLW are also significant at this level. Based on Table 6, the RLW distribution offers a better fit to the hypertension data than its nested models and the other distributions, since it has the smallest CVM, AD, KS, AIC, and CAIC as well as the highest log-likelihood value.

The likelihood ratio test (LRT) results in Table 7 show that there are significant differences between the RLW and its submodels, as each LRT statistic exceeds the corresponding critical value at the 5% level of significance.

Figures 9 and 10, respectively, are graphs showing the fitted PDFs against the data’s histogram and the fitted CDFs against the empirical data. It should be observed that the RLW plots of densities more accurately represent the empirical density and CDF of the hypertension data than the other models.

8.3. Dataset 3: Carbon Fiber Data

The third application of the RLW with other competing models and its submodels is demonstrated in this section with the breaking stress of carbon fibers. It consists of the breaking stress of 50 mm-long carbon fibers. The dataset is in [34], and it includes the following:

3.70, 2.74, 2.73, 2.5, 3.6, 3.11, 3.27, 2.87, 1.47, 3.11, 4.42, 2.41, 3.19, 3.22, 1.69, 3.28, 3.09, 1.87, 3.15, 4.90, 3.75, 2.43, 2.95, 2.97, 3.39, 2.96, 2.53, 2.67, 2.93, 3.22, 3.39, 2.81, 4.20, 3.33, 2.55, 3.31, 3.31, 2.85, 2.56, 3.56, 3.15, 2.35, 2.55, 2.59, 2.38, 2.81, 2.77, 2.17, 2.83, 1.92, 1.41, 3.68, 2.97, 1.36, 0.98, 2.76, 4.91, 3.68, 1.84, 1.59, 3.19, 1.57, 0.81, 5.56, 1.73, 1.59, 2.00, 1.22, 1.12, 1.71, 2.17, 1.17, 5.08, 2.48, 1.18, 3.51, 2.17, 1.69, 1.25, 4.38, 1.84, 0.39, 3.68, 2.48, 0.85, 1.61, 2.79, 4.70, 2.03, 1.80, 1.57, 1.08, 2.03, 1.61, 2.12, 1.89, 2.88, 2.82, 2.05, and 3.65

A graphical representation of the hazard behavior of the dataset is displayed in Figure 11. The TTT plot indicates an increasing hazard rate; hence, the data can be modeled using the RLW distribution and the other competing models.

Table 8 exhibits the maximum likelihood estimates, standard errors, and p-values of the parameters of the fitted models for the carbon fiber data. The estimated parameters of the RLW and the other distributions are significant at the 5% level of significance, except for one parameter of the INW. From Table 9, the RLW distribution provides a better fit to the carbon fiber data than its nested models and the other distributions because it has the smallest CVM, AD, KS, AIC, and CAIC and the largest log-likelihood value.

The LRT results in Table 10 indicate that there is no significant difference between the RLW and the GRL distribution, since the LRT statistic is less than the critical value at the 5% level of significance. On the other hand, there is a significant difference between the RLW and the RLR distribution.

The histogram of the data against the PDFs of the fitted models and the fitted CDFs vs. the empirical carbon fiber data are, respectively, exhibited in Figures 12 and 13. It is observed from the plots that the RLW densities depict the empirical density and CDF of the carbon fiber data more closely than the other models.

9. Conclusion

The lack of any family of distributions based on the RL distribution in the scientific literature motivated this study. In this paper, a new family of distributions, the RL-G family, is studied together with its statistical properties, such as the quantile function, the raw moments, the incomplete moments, measures of inequality, entropy, mean and median deviations, and the reliability parameter. The parameters of the proposed generator were estimated using the ML estimation method. The RLW and the RLKum are two special members of the RL-G family. The outcome of the simulation analysis indicates that the ML estimation method and its asymptotic properties performed quite well. Applications of the RLW from the RL-G family were carried out on three complete real datasets, and the RLW outperformed its submodels and the other distributions considered. As a result, the newly suggested family of distributions can be applied in a variety of areas. Despite the RL-G model’s advantages, such as flexibility, generalization of the RL distribution, and the ability to provide better fits to the datasets than the other models compared, it cannot be employed for discrete datasets, and its estimators do not reduce to a simple closed form.

Appendix

A. Proof of Proposition

A.1. Proof

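(i) From the CDF of the RL-G family in (7), as $x \to -\infty$, $G(x;\xi) \to 0$ and $F(x;\lambda,\xi) \to 0$, and as $x \to \infty$, $G(x;\xi) \to 1$ and thus $F(x;\lambda,\xi) \to 1$. Moreover, for $\lambda \ge 2$ every factor of the density $f(x;\lambda,\xi) = \frac{g(x;\xi)}{\lambda^{2}(\lambda-1)}\left\{\lambda^{2}-2\lambda-\log[1-G(x;\xi)]\right\}[1-G(x;\xi)]^{1/\lambda-1}$ is nonnegative.

Hence, $f(x;\lambda,\xi) \ge 0$ for all $x$.

(ii) It remains to demonstrate that the integral over the support is 1, that is, $\int_{-\infty}^{\infty} f(x;\lambda,\xi)\,dx = 1$.

Let $t = -\log[1-G(x;\xi)]$, so that $dt = \frac{g(x;\xi)}{1-G(x;\xi)}\,dx$; as $x \to -\infty$, $t \to 0$, and as $x \to \infty$, $t \to \infty$.

Thus

$$\int_{-\infty}^{\infty} f(x;\lambda,\xi)\,dx = \int_{0}^{\infty} \frac{1}{\lambda^{2}(\lambda-1)}\left(\lambda^{2}-2\lambda+t\right)e^{-t/\lambda}\,dt = 1.$$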

The function in the integral is the PDF of the RL distribution. Thus, from the above discussions, the RL-G family is a legitimate PDF for the continuous random variable .

Data Availability

The (rainfall, hypertension, and carbon fiber) data used for this study is duly cited and is available in the text.

Conflicts of Interest

The authors declare that they have no conflicting interests.

Authors’ Contributions

All authors have significantly contributed to the research and preparation of the manuscript.