Abstract

In this study, the Secant Kumaraswamy family of distributions is proposed and studied. This is motivated by the fact that no single distribution can model all types of data from different fields. Therefore, there is a need to develop distributions that have desirable properties and are flexible enough to model data exhibiting different characteristics. Some properties of the new family of distributions, including the quantile function, moments, moment generating function, and mean residual life function, are derived. Five special cases of the family of distributions are presented, and their flexibility is shown by their varying degrees of skewness and kurtosis and their nonmonotonic hazard rates. The maximum likelihood estimation method is used to obtain estimators of the parameters of the family of distributions. Two location-scale regression models are developed for the Secant Kumaraswamy Weibull distribution, which is a special case of the family of distributions. Six different real datasets are used to demonstrate the usefulness of the family of distributions and the regression models. The results show that the family of distributions can be used to model real datasets.

1. Introduction

Data is constantly being generated in all fields including engineering, medicine, and finance. The understanding of data is very important for making critical decisions in these fields. Thus, it is extremely important for data to be appropriately modelled for this purpose. Finding a model that can appropriately describe the data is crucial to unearthing meaningful information from it. Probability distributions are useful for modelling data. However, data from various fields exhibit varying characteristics making it impossible for any single distribution to be used in modelling data from all fields. Thus, over the years, several distributions have been developed that exhibit various degrees of flexibility for the purposes of modelling data from various fields.

Over the years, several methods have been used to develop distributions. A very popular method is the use of generators or families of distributions. This involves modifying a baseline distribution by substituting it into the families of distributions. Some families of distributions developed include Kumaraswamy-G (Kw-G) [1], Zubair-G family [2], Sin-G [3, 4], Cos-G [3, 4], Tan-G [3], extended odd Fréchet-G family [5], arcsine exponentiated-X family [6], tangent Topp-Leone-G [7], Marshall-Olkin Zubair-G [8], secant-G class (Sec-G) [9], new cotangent-G [10], cosine Topp-Leone-G [11], weighted cosine-G [12], transmuted modified power-generated family [13], modified-half normal family [14], and logistic cotangent exponentiated generalized family [15].

This study develops a new family of distributions by combining the Kumaraswamy generator (Kw-G) developed by Cordeiro and De Castro [1] with the secant generator (Sec-G) developed by Souza et al. [9]. Trigonometric extensions have become very popular for the development of distributions in recent years because they modify a distribution without adding a new parameter and because of the varying nature of trigonometric functions. Therefore, this study is motivated by the need to obtain more flexible distributions with varying degrees of kurtosis and skewness, including nonmonotonic hazard rate shapes. Thus, the new family of distributions derived in this study is more flexible and, hence, more useful in modelling datasets with varying characteristics. The study also introduces regression models with two different structures, which take advantage of the flexibility of the developed family of distributions.

The remainder of the study is organized as follows: Section 2 presents the Secant Kumaraswamy family of distributions, together with the mixture representation of the family. Some statistical properties of the family of distributions are presented in Section 3. Section 4 presents five special cases of the family of distributions. Maximum likelihood estimators of the parameters of the family of distributions are presented in Section 5, where Monte Carlo simulation studies on the estimators are also performed using a special case of the family. Real datasets are used to demonstrate the usefulness of the family of distributions in Section 6. Two location-scale regression models are presented in Section 7 with an application to a real dataset. The conclusion of the study is presented in Section 8. An illustration of the framework of the study is presented in Figure 1.

2. Secant Kumaraswamy Family of Distributions

Cordeiro and De Castro [1] developed the Kumaraswamy-G (Kw-G) family of distributions with cumulative distribution function (CDF) given as . The CDF of the Sec-G family of distributions proposed by Souza et al. [9] is given as .

Substituting the CDF of the Kw-G family of distributions into the CDF of the Sec-G family of distributions gives the CDF of the SKw-G family of distributions as

Differentiating equation (1) gives the probability density function (PDF) of the SKw-G family of distributions as

For simplicity, let and . The survival function and hazard rate function (HRF) of the SKw-G family of distributions are given, respectively, as and
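The functions of this section can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: it assumes the Kw-G CDF of [1] in the form 1 - (1 - G(x)^a)^b and the Sec-G CDF of [9] in the form sec(pi G(x)/3) - 1. The parameter names `a` and `b`, the function names, and the unit-rate exponential baseline are all assumptions introduced only for the example.

```python
import math

def kw_cdf(p, a, b):
    """Kw-G transform of a baseline CDF value p = G(x); assumed form 1 - (1 - p^a)^b."""
    return 1.0 - (1.0 - p ** a) ** b

def skw_cdf(x, a, b, G):
    """Assumed SKw-G CDF: sec((pi/3) * K(x)) - 1, where K is the Kw-G CDF."""
    return 1.0 / math.cos(math.pi / 3.0 * kw_cdf(G(x), a, b)) - 1.0

def skw_pdf(x, a, b, G, g):
    """Derivative of skw_cdf with respect to x, obtained by the chain rule."""
    p = G(x)
    u = math.pi / 3.0 * kw_cdf(p, a, b)
    dK = a * b * g(x) * p ** (a - 1.0) * (1.0 - p ** a) ** (b - 1.0)
    return math.pi / 3.0 * dK * math.tan(u) / math.cos(u)

def skw_survival(x, a, b, G):
    """Survival function, 1 - F(x)."""
    return 1.0 - skw_cdf(x, a, b, G)

def skw_hazard(x, a, b, G, g):
    """Hazard rate function, f(x) / (1 - F(x))."""
    return skw_pdf(x, a, b, G, g) / skw_survival(x, a, b, G)

# Hypothetical baseline used only for illustration: unit-rate exponential.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_pdf(x):
    return math.exp(-x)

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print(x, skw_cdf(x, 2.0, 3.0, exp_cdf), skw_hazard(x, 2.0, 3.0, exp_cdf, exp_pdf))
```

Passing the baseline CDF and PDF as callables mirrors the way the family is defined: any baseline G can be substituted without changing the generator code.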

2.1. Mixture Representation of SKw-G Family

Mixture representations are useful in the derivation of some statistical properties of the SKw-G family of distributions. The mixture representation of the PDF of the SKw-G distribution is obtained in this section.

Lemma 1. The mixture representation of the SKw-G family of distributions is obtained as where

Proof. From Souza et al. [9], the expansion of the Sec-G family of distributions is given as where is the Euler number. Thus, substituting the expansion into the PDF in equation (2) gives the expression Given that , applying binomial series expansion on gives Substituting equation (8) into equation (7) further gives the PDF as Further application of the binomial series expansion and simplification of the PDF in equation (9) gives Letting in equation (10) completes the proof.

3. Statistical Properties

In this section, some statistical properties of the SKw-G family of distributions are derived. These properties include the quantile function, moments, moment generating function, and mean residual life function.

3.1. Quantile Function

The quantile function of the SKw-G distribution is presented in this subsection. The quantile function is useful for generating random numbers from a given distribution. It can also be used to obtain the skewness and kurtosis of a distribution.

Proposition 2. The quantile function of the SKw-G distribution is given by where denotes the quantile function of the baseline distribution.

Proof. Letting the CDF of the family of distributions in equation (1) to be equal to and making the subject yield the quantile function of the SKw-G family of distributions.
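The inversion described in the proof can be sketched in code. The snippet assumes the same forms as before, namely the Kw-G CDF 1 - (1 - G(x)^a)^b and the Sec-G CDF sec(pi G(x)/3) - 1; under these assumptions, solving F(x) = u first undoes the secant step and then the Kumaraswamy step. The names `a`, `b`, `skw_quantile`, and the exponential baseline are illustrative assumptions, not notation taken from the paper.

```python
import math
import random

def skw_quantile(u, a, b, baseline_quantile):
    """Invert the assumed SKw-G CDF for 0 <= u < 1."""
    # Undo the secant step: sec(pi*K/3) = 1 + u  =>  K = (3/pi) * arccos(1/(1+u)).
    k = min(3.0 / math.pi * math.acos(1.0 / (1.0 + u)), 1.0)
    # Undo the Kw-G step: K = 1 - (1 - p^a)^b  =>  p = (1 - (1-K)^(1/b))^(1/a).
    p = (1.0 - (1.0 - k) ** (1.0 / b)) ** (1.0 / a)
    return baseline_quantile(p)

def skw_cdf(x, a, b, baseline_cdf):
    """Assumed SKw-G CDF, used here only to check the inversion."""
    k = 1.0 - (1.0 - baseline_cdf(x) ** a) ** b
    return 1.0 / math.cos(math.pi / 3.0 * k) - 1.0

# Hypothetical baseline for illustration: unit-rate exponential.
def exp_quantile(p):
    return -math.log(1.0 - p)

def exp_cdf(x):
    return 1.0 - math.exp(-x)

def skw_sample(n, a, b, baseline_quantile, seed=0):
    """Inverse-transform sampling: apply the quantile function to uniform draws."""
    rng = random.Random(seed)
    return [skw_quantile(rng.random(), a, b, baseline_quantile) for _ in range(n)]
```

Quantile-based skewness and kurtosis measures can then be computed directly from `skw_quantile` evaluated at fixed probabilities.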

3.2. Moments and Moment Generating Function

This section presents the ordinary moment, incomplete moment, and moment generating function of the SKw-G family of distributions. The functions can be used to characterize the distribution.

3.2.1. Ordinary and Incomplete Moments

The ordinary moments of a distribution are useful in obtaining the central tendencies and dispersions of a distribution, among other uses.

Proposition 3. The SKw-G family’s ordinary moment is given as

Proof. The ordinary moment of a distribution can be defined as .
The proof is complete when the mixture representation in equation (5) is substituted into the definition of the ordinary moment.

Proposition 4. The SKw-G family’s incomplete moment is given by

Proof. The incomplete moment by definition is The proof is complete when the mixture representation in equation (5) is substituted into the definition of the incomplete moments .
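As a numerical cross-check on the ordinary and incomplete moments, one option is to integrate the quantile function directly, since the r-th moment equals the integral of Q(u)^r over (0, 1) and the incomplete moment at t equals the same integral over (0, F(t)). The sketch below uses this quadrature approach in place of the series expressions of Propositions 3 and 4. It again assumes the Kw-G form 1 - (1 - G^a)^b and the Sec-G form sec(pi G/3) - 1; the parameter names and the exponential baseline are hypothetical.

```python
import math

def skw_cdf(x, a, b, baseline_cdf):
    """Assumed SKw-G CDF."""
    k = 1.0 - (1.0 - baseline_cdf(x) ** a) ** b
    return 1.0 / math.cos(math.pi / 3.0 * k) - 1.0

def skw_quantile(u, a, b, baseline_quantile):
    """Inverse of the assumed SKw-G CDF for 0 <= u < 1."""
    k = min(3.0 / math.pi * math.acos(1.0 / (1.0 + u)), 1.0)
    p = (1.0 - (1.0 - k) ** (1.0 / b)) ** (1.0 / a)
    return baseline_quantile(p)

def raw_moment(r, quantile, n=20000):
    """Midpoint-rule approximation of E[X^r] = integral of Q(u)^r over (0, 1)."""
    h = 1.0 / n
    return h * sum(quantile((i + 0.5) * h) ** r for i in range(n))

def incomplete_moment(r, t, cdf, quantile, n=20000):
    """Midpoint-rule approximation of the integral of Q(u)^r over (0, F(t))."""
    h = cdf(t) / n
    return h * sum(quantile((i + 0.5) * h) ** r for i in range(n))

# Hypothetical baseline for illustration: unit-rate exponential.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_quantile(p):
    return -math.log(1.0 - p)

if __name__ == "__main__":
    q = lambda u: skw_quantile(u, 2.0, 3.0, exp_quantile)
    F = lambda x: skw_cdf(x, 2.0, 3.0, exp_cdf)
    mean = raw_moment(1, q)
    print("mean:", mean, "variance:", raw_moment(2, q) - mean ** 2)
    print("first incomplete moment at t = 1:", incomplete_moment(1, 1.0, F, q))
```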

3.2.2. Moment Generating Function

The moment generating function of a distribution, if it exists, is useful for obtaining the moments of a distribution. The moment generating function of the SKw-G distribution is obtained in this subsection.

Proposition 5. The moment generating function of the SKw-G family is obtained as

Proof. By definition . Using the Taylor series expansion, , then Substituting in equation (13) into the definition of the moment generating function completes the proof.

3.3. Mean Residual Life

The mean residual life function of a distribution is useful in describing the remaining life time of a system. It has usefulness in many fields such as engineering, actuarial science, and biomedical sciences, among others.

Proposition 6. The mean residual life of the SKw-G family of distributions is obtained as

Proof. The mean residual life by definition is given as . The substitution of the mixture representation of in equation (5) into completes the proof.
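The mean residual life can also be evaluated numerically, which gives a simple check on the series expression. The sketch below uses the identity E[X - t | X > t] = (1/S(t)) times the integral of (Q(u) - t) over (F(t), 1), approximated with a midpoint rule. The function is generic in the CDF and quantile function supplied; the SKw-G forms shown (Kw-G as 1 - (1 - G^a)^b, Sec-G as sec(pi G/3) - 1), the parameter names, and the exponential baseline are assumptions made for illustration.

```python
import math

def mean_residual_life(t, cdf, quantile, n=10000):
    """Midpoint-rule approximation of E[X - t | X > t]."""
    lower = cdf(t)
    surv = 1.0 - lower
    h = surv / n                      # width of each sub-interval of (F(t), 1)
    total = sum(quantile(lower + (i + 0.5) * h) - t for i in range(n))
    return h * total / surv

def skw_cdf(x, a, b, baseline_cdf):
    """Assumed SKw-G CDF."""
    k = 1.0 - (1.0 - baseline_cdf(x) ** a) ** b
    return 1.0 / math.cos(math.pi / 3.0 * k) - 1.0

def skw_quantile(u, a, b, baseline_quantile):
    """Inverse of the assumed SKw-G CDF for 0 <= u < 1."""
    k = min(3.0 / math.pi * math.acos(1.0 / (1.0 + u)), 1.0)
    p = (1.0 - (1.0 - k) ** (1.0 / b)) ** (1.0 / a)
    return baseline_quantile(p)

# Hypothetical baseline for illustration: unit-rate exponential.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_quantile(p):
    return -math.log(1.0 - p)

if __name__ == "__main__":
    F = lambda x: skw_cdf(x, 2.0, 3.0, exp_cdf)
    q = lambda u: skw_quantile(u, 2.0, 3.0, exp_quantile)
    for t in (0.0, 0.5, 1.0):
        print(t, mean_residual_life(t, F, q))
```

A convenient sanity check is the plain exponential distribution, whose mean residual life is constant and equal to its mean because of the memoryless property.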

4. Some Special Cases of the SKw-G Family

Special cases of the SKw-G family of distributions are presented in this section. These special distributions use the Weibull, Chen, Burr XII, Gompertz, and Fréchet distributions as the baseline distributions.

4.1. Secant Kumaraswamy Weibull Distribution

Given the Weibull distribution [16], with CDF and PDF given as and , , respectively. Substituting the CDF of the Weibull distribution into the CDF of the SKw-G family of distributions in equation (1) gives the CDF of the Secant Kumaraswamy Weibull (SKwW) distribution as

The corresponding PDF of the SKwW distribution can be obtained either by substituting the PDF and CDF of the Weibull distribution into the PDF of the SKw-G family of distributions in equation (2) or by differentiating equation (17). Thus, the PDF of the SKwW distribution is obtained as

The HRF of the SKwW distribution is obtained as

It should be noted that the SKwW distribution is the same as the Sec-Kum-W distribution by Souza et al. [9]. Figure 2 shows the PDF and HRF of the SKwW distribution. It can be observed that the PDF exhibits decreasing, increasing, left skewed, right skewed, and approximately symmetric shapes. Also, it can be observed that the HRF exhibits decreasing, increasing, bathtub, and reverse bathtub shapes.

4.2. Secant Kumaraswamy Chen Distribution

The Secant Kumaraswamy Chen (SKwC) distribution is obtained by using the Chen distribution [17] as the baseline. The CDF and PDF of the Chen distribution are given, respectively, as and , . The CDF of the SKwC distribution is obtained by substituting the CDF of the Chen distribution into the CDF of the SKw-G family of distributions in equation (1). Also, the PDF of the SKwC distribution is obtained by substituting the CDF and PDF of the Chen distribution into the PDF of the SKw-G family of distributions in equation (2). Thus, the CDF and PDF of the SKwC distribution are given, respectively, as and

Also, the HRF of the SKwC distribution is given as

Figure 3 shows the plots of the PDF and HRF of the SKwC distribution. The PDF exhibits decreasing, increasing, right skewed, left skewed, and symmetric shapes. The HRF shows increasing, decreasing, bathtub, and modified bathtub shapes.

4.3. Secant Kumaraswamy Burr XII Distribution

The Secant Kumaraswamy Burr XII (SKwBXII) distribution is obtained in this section by using the three-parameter Burr XII distribution [18] with CDF and PDF, respectively, given as and , . Using the CDF and PDF of the SKw-G family of distributions given in equations (1) and (2), respectively, the CDF and PDF of the SKwBXII distribution are obtained, respectively, as and

The HRF of the SKwBXII distribution is also given as

Figure 4 shows the PDF and HRF plots of the SKwBXII distribution. It can be observed that the PDF exhibits decreasing, increasing, right skewed, left skewed, and symmetric shapes, while the HRF exhibits increasing, decreasing, bathtub, and modified bathtub shapes.

4.4. Secant Kumaraswamy Gompertz Distribution

The Gompertz distribution [19] is used as the baseline distribution for the proposed secant Kumaraswamy Gompertz (SKwG) distribution. The CDF and PDF of the Gompertz distribution, respectively, are and , . The CDF and PDF of the SKwG distribution are obtained, respectively, as and

The HRF of the SKwG distribution is obtained as

The PDF and HRF plots are shown in Figure 5 for the SKwG distribution. It can be observed that the PDF exhibits decreasing, increasing, right skewed, left skewed, and approximately symmetric shapes, while the HRF exhibits increasing, decreasing, and bathtub shapes.

4.5. Secant Kumaraswamy Fréchet Distribution

The Secant Kumaraswamy Fréchet (SKwF) distribution uses the Fréchet distribution [20] with CDF and PDF, respectively, given as and , as the baseline distribution. The CDF of SKwF distribution is obtained as

The PDF of the SKwF distribution is obtained as

The HRF of the SKwF distribution is also obtained as

Figure 6 shows the PDF and HRF plots for the SKwF distribution. It can be observed that the PDF shows decreasing, left skewed, right skewed, and approximately symmetric shapes, while the HRF shows decreasing, increasing, and upside-down bathtub shapes.

5. Parameter Estimation

The estimation of the parameters of the SKw-G family of distributions is presented in this section. The maximum likelihood method is used for estimating the parameters of the distribution. Also, a simulation study is conducted in this section to ascertain the behavior of the parameter estimators.

5.1. Maximum Likelihood Estimation

Let be samples from the SKw-G family of distributions with density function . Also, let be a vector of parameters. Then, the log-likelihood function is given as

The log-likelihood function for the SKw-G family of distributions is obtained by substituting its PDF in equation (2) into equation (32). The substitution, with some algebraic manipulation, gives

Differentiating equation (33) with respect to each parameter gives the score functions. This is obtained as where . Equating the score functions presented in equations (34)–(36) to zero and solving them simultaneously gives the maximum likelihood estimates of the parameters. The score functions are nonlinear in the parameters. Therefore, numerical methods, such as quasi-Newton methods, are employed to obtain the estimates.
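In practice the log-likelihood is coded directly and handed to a numerical optimizer. The sketch below evaluates the log-likelihood for the Weibull special case under stated assumptions: the Kw-G form 1 - (1 - G^a)^b, the Sec-G form sec(pi G/3) - 1, and a Weibull baseline G(x) = 1 - exp(-lam * x^k). The parameter names `a`, `b`, `lam`, `k` and this Weibull parametrization are hypothetical and may differ from the one used in the paper.

```python
import math

def skww_logpdf(x, a, b, lam, k):
    """Log-density of the assumed SKwW model at x > 0."""
    s = math.exp(-lam * x ** k)                  # baseline Weibull survival
    p = 1.0 - s                                  # baseline Weibull CDF
    g = lam * k * x ** (k - 1.0) * s             # baseline Weibull PDF
    u = math.pi / 3.0 * (1.0 - (1.0 - p ** a) ** b)
    dK = a * b * g * p ** (a - 1.0) * (1.0 - p ** a) ** (b - 1.0)
    # log f = log(pi/3) + log K'(x) + log tan(u) - log cos(u)
    return (math.log(math.pi / 3.0) + math.log(dK)
            + math.log(math.tan(u)) - math.log(math.cos(u)))

def log_likelihood(params, data):
    """Sum of log-densities over the sample; params = (a, b, lam, k)."""
    a, b, lam, k = params
    return sum(skww_logpdf(x, a, b, lam, k) for x in data)

def skww_quantile(u, a, b, lam, k):
    """Quantile function of the assumed SKwW model, used to build test data."""
    K = min(3.0 / math.pi * math.acos(1.0 / (1.0 + u)), 1.0)
    p = (1.0 - (1.0 - K) ** (1.0 / b)) ** (1.0 / a)
    return (-math.log(1.0 - p) / lam) ** (1.0 / k)

if __name__ == "__main__":
    # Deterministic pseudo-sample: quantiles on an evenly spaced probability grid.
    data = [skww_quantile((i + 0.5) / 200, 2.0, 3.0, 1.0, 1.5) for i in range(200)]
    print(log_likelihood((2.0, 3.0, 1.0, 1.5), data))
    print(log_likelihood((2.0, 3.0, 3.0, 1.5), data))
```

The negative of `log_likelihood` can be passed to a general-purpose quasi-Newton routine, for example SciPy's `scipy.optimize.minimize`, with positivity of the parameters enforced through bounds or a log transformation.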

For large sample sizes, the distribution of the maximum likelihood estimators of the SKw-G family of distributions converges to the normal distribution. That is , where denotes convergence in distribution and is the inverse of the expected Fisher information matrix. According to Lindsay and Li [21], the observed Fisher information matrix is a consistent estimator of the expected Fisher information matrix. Thus, the observed information matrix for the SKw-G family of distributions is given as

The observed information matrix can be used for interval estimation of the parameters of the family of distributions. Given the standard error of the parameter set as and the upper quantile of the normal distribution as , the confidence intervals of the estimated parameters can be obtained as . The inverse of the observed Fisher information matrix gives the variance-covariance matrix for the parameters, while the square roots of the diagonal elements of this variance-covariance matrix give the standard errors of the parameters.
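The interval construction can be sketched as follows. The code takes a variance-covariance matrix (the inverse of the observed information matrix), extracts standard errors from its diagonal, and forms normal-approximation intervals. Truncating the lower limit at zero is an assumption made here because the parameters are positive and the intervals reported later start at 0; the function names are illustrative.

```python
import math
from statistics import NormalDist

def standard_errors(cov):
    """Square roots of the diagonal of a variance-covariance matrix (list of lists)."""
    return [math.sqrt(cov[i][i]) for i in range(len(cov))]

def wald_interval(estimate, std_error, level=0.95, lower_bound=0.0):
    """Normal-approximation interval, estimate +/- z * SE, truncated below at lower_bound."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)   # upper (1 - level)/2 quantile
    lo = max(lower_bound, estimate - z * std_error)
    hi = estimate + z * std_error
    return lo, hi

if __name__ == "__main__":
    cov = [[4.0, 1.0], [1.0, 9.0]]        # made-up covariance matrix for illustration
    estimates = [5.0, 2.0]                # made-up estimates for illustration
    for est, se in zip(estimates, standard_errors(cov)):
        print(est, se, wald_interval(est, se))
```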

5.2. Simulation Study

To ascertain the behavior of the maximum likelihood estimators of the SKw-G family of distributions, Monte Carlo simulations are performed. The special distribution SKwC is used for illustrative purposes. The following steps are used for the simulation study:
(a) Generate random samples of sizes from the SKwC distribution using its quantile function
(b) Estimate the parameters of the distribution using the maximum likelihood method
(c) Calculate the average estimate (AE), average bias (AB), and root mean square error (RMSE) of the parameters
(d) The process is repeated 2000 times
(e) The process is also repeated for the two sets of parameters: and
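The simulation steps can be sketched as a small harness. Several pieces are assumptions made for illustration: the Chen baseline is taken as G(x) = 1 - exp(lam * (1 - exp(x^beta))), the generator forms are the Kw-G CDF 1 - (1 - G^a)^b and the Sec-G CDF sec(pi G/3) - 1, the average bias is computed as the mean of (estimate - true value), and `estimator` is a hypothetical user-supplied function that returns the fitted parameter tuple for a sample (for example, a maximum likelihood routine).

```python
import math
import random

def skwc_quantile(u, a, b, lam, beta):
    """Quantile function of the assumed SKwC model for 0 <= u < 1."""
    K = min(3.0 / math.pi * math.acos(1.0 / (1.0 + u)), 1.0)
    p = (1.0 - (1.0 - K) ** (1.0 / b)) ** (1.0 / a)
    # Invert the assumed Chen CDF: x = (log(1 - log(1 - p) / lam))^(1 / beta).
    return math.log(1.0 - math.log(1.0 - p) / lam) ** (1.0 / beta)

def draw_sample(n, params, rng):
    """Step (a): inverse-transform sample of size n."""
    return [skwc_quantile(rng.random(), *params) for _ in range(n)]

def summarize(estimates, true_value):
    """Step (c): average estimate, average bias, and root mean square error."""
    reps = len(estimates)
    ae = sum(estimates) / reps
    ab = sum(e - true_value for e in estimates) / reps
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / reps)
    return ae, ab, rmse

def monte_carlo(estimator, params, index, n, reps=2000, seed=1):
    """Steps (a)-(d) for the parameter at position index and sample size n."""
    rng = random.Random(seed)
    estimates = [estimator(draw_sample(n, params, rng))[index] for _ in range(reps)]
    return summarize(estimates, params[index])
```

Calling `monte_carlo` over a grid of sample sizes and over each parameter index would produce the kind of AE, AB, and RMSE summary reported in a simulation table.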

The simulation results are presented in Table 1. The results from the table show that the estimators are consistent and asymptotically unbiased, as the average estimates approach the true values while the average bias and root mean square errors decrease with increasing sample sizes.

6. Applications

The applications of the special distributions of the SKw-G family of distributions are given in this section. This is to illustrate the usefulness of the family of distributions.

The following datasets are used for the illustration: the first dataset represents 101 observations that show the failure times (in hours) of Kevlar 49/epoxy strands subjected to constant sustained pressure at a 90 percent stress level. The data was originally given by Barlow et al. [22] and has been analyzed in several studies. The observations are as follows: 0.01, 0.02, 0.02, 0.02, 0.03, 0.03, 0.04, 0.05, 0.06, 0.07, 0.07, 0.08, 0.09, 0.10, 0.10, 0.11, 0.11, 0.12, 0.13, 0.18, 0.19, 0.20, 0.23, 0.24, 0.29, 0.34, 0.35, 0.36, 0.38, 0.40, 0.42, 0.43, 0.52, 0.54, 0.56, 0.60, 0.63, 0.65, 0.67, 0.68, 0.72, 0.72, 0.72, 0.73, 0.79, 0.79, 0.80, 0.80, 0.85, 0.90, 0.92, 0.95, 0.99, 1.00, 1.01, 1.02, 1.03, 1.05, 1.10, 1.10, 1.15, 1.18, 1.20, 1.29, 1.31, 1.33, 1.34, 1.40, 1.43, 1.45, 1.50, 1.51, 1.53, 1.54, 1.54, 1.55, 1.58, 1.60, 1.63, 1.64, 1.80, 1.80, 1.81, 2.02, 2.14, 2.17, 2.33, 3.03, 3.34, 4.20, 4.69, and 7.89.

The second dataset represents the survival times of breast cancer patients obtained from 1929 to 1938. The data can be found in Ramos et al. [23] and has been analyzed by several studies including Yakubu et al. [24]. The observations are given as 0.3, 0.3, 4.0, 5.0, 5.6, 6.2, 6.3, 6.6, 6.8, 7.4, 7.5, 8.4, 8.4, 10.3, 11.0, 11.8, 12.2, 12.3, 13.5, 14.4, 14.4, 14.8, 15.5, 15.7, 16.2, 16.3, 16.5, 16.8, 17.2, 17.3, 17.5, 17.9, 19.8, 20.4, 20.9, 21.0, 21.0, 21.1, 23.0, 23.4, 23.6, 24.0, 24.0, 27.9, 28.2, 29.1, 30.0, 31.0, 31.0, 32.0, 35.0, 35.0, 37.0, 37.0, 37.0, 38.0, 38.0, 38.0, 39.0, 39.0, 40.0, 40.0, 40.0, 41.0, 41.0, 41.0, 42.0, 43.0, 43.0, 43.0, 44.0, 45.0, 45.0, 46.0, 46.0, 47.0, 48.0, 49.0, 51.0, 51.0, 51.0, 52.0, 54.0, 55.0, 56.0, 57.0, 58.0, 59.0, 60.0, 60.0, 60.0, 61.0, 62.0, 65.0, 65.0, 67.0, 67.0, 68.0, 69.0, 78.0, 80.0, 83.0, 88.0, 89.0, 90.0, 93.0, 96.0, 103.0, 105.0, 109.0, 109.0, 111.0, 115.0, 117.0, 125.0, 126.0, 127.0, 129.0, 129.0, 139.0, and 154.0.

The third dataset is obtained from Smith and Naylor [25] and represents the strengths of 1.5 cm glass fibres, measured at the National Physical Laboratory, England. The data are 0.55, 0.74, 0.77, 0.81, 0.84, 1.24, 0.93, 1.04, 1.11, 1.13, 1.30, 1.25, 1.27, 1.28, 1.29, 1.48, 1.36, 1.39, 1.42, 1.48, 1.51, 1.49, 1.49, 1.50, 1.50, 1.55, 1.52, 1.53, 1.54, 1.55, 1.61, 1.58, 1.59, 1.60, 1.61, 1.63, 1.61, 1.61, 1.62, 1.62, 1.67, 1.64, 1.66, 1.66, 1.66, 1.70, 1.68, 1.68, 1.69, 1.70, 1.78, 1.73, 1.76, 1.76, 1.77, 1.89, 1.81, 1.82, 1.84, 1.84, 2.00, 2.01, and 2.24.

The fourth dataset consists of 72 survival times, in days, of guinea pigs infected with tubercle bacilli. The dataset can be found in Bjerkedal [26]. The observations are as follows: 12, 15, 22, 24, 24, 32, 32, 33, 34, 38, 38, 43, 33, 48, 52, 53, 54, 54, 55, 56, 57, 58, 58, 59, 60, 60, 60, 60, 61, 62, 63, 65, 65, 67, 68, 70, 70, 72, 73, 75, 76, 81, 83, 84, 85, 87, 91, 95, 96, 98, 99, 109, 110, 121, 127, 129, 131, 143, 146, 146, 175, 175, 211, 233, 258, 258, 263, 297, 341, 341, and 376.

The fifth dataset represents the maximum flood levels obtained. The data can be obtained from Dumonceaux and Antle [27]. The data consists of 20 observations and are given as follows: 0.654, 0.613, 0.315, 0.449, 0.297, 0.402, 0.379, 0.423, 0.379, 0.324, 0.269, 0.740, 0.418, 0.412, 0.494, 0.416, 0.338, 0.392, 0.484, and 0.265.

The descriptive statistics of the five datasets are given in Table 2. It can be observed that all the datasets are right skewed, except the glass fibre dataset. Also, the Kevlar 49/epoxy dataset is highly peaked as compared to the normal distribution, while all the other datasets are less peaked, with the survival times of guinea pigs being moderately less peaked as compared to the normal distribution.

The total time on test (TTT) transform plots of the datasets are shown in Figure 7. The TTT transform plot is used to identify the shape of the failure (hazard) rate for a given dataset. The TTT transform plot of the Kevlar 49/epoxy data is given in Figure 7(a). The curve is first convex in shape, followed by a concave shape and then a convex shape again. This indicates that the dataset exhibits a modified bathtub-shaped failure rate. Figures 7(b), 7(c), and 7(e) give the TTT plots for the breast cancer, glass fibre, and maximum flood datasets. The plots show that the three datasets have an increasing failure rate, since the curves are concave and lie above the 45° line. Figure 7(d) shows the TTT plot for the survival times of guinea pigs. The plot shows a modified bathtub failure rate shape, since the curve initially goes above the diagonal, then goes below the diagonal, and finally goes above the diagonal again.
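The scaled TTT transform behind such plots is straightforward to compute. The sketch below uses the standard empirical definition, in which the i-th plotted point is (i/n, T_i) with T_i equal to the sum of the i smallest observations plus (n - i) times the i-th smallest, divided by the sample total. The function name is illustrative.

```python
def scaled_ttt(data):
    """Return the points (i/n, T_i) of the scaled empirical TTT transform."""
    xs = sorted(data)
    n = len(xs)
    total = sum(xs)
    points = []
    running = 0.0                      # sum of the i smallest observations so far
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, (running + (n - i) * x) / total))
    return points

if __name__ == "__main__":
    # Maximum flood levels listed above (fifth dataset).
    flood = [0.654, 0.613, 0.315, 0.449, 0.297, 0.402, 0.379, 0.423, 0.379, 0.324,
             0.269, 0.740, 0.418, 0.412, 0.494, 0.416, 0.338, 0.392, 0.484, 0.265]
    for r, t in scaled_ttt(flood):
        print(round(r, 2), round(t, 3))
```

Plotting these points against the diagonal gives the usual reading: a concave curve above the diagonal suggests an increasing failure rate, a convex curve below it suggests a decreasing failure rate, and a convex-then-concave curve suggests a bathtub shape.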

The characteristics of the datasets suggest that the various special cases of the SKw-G family of distributions can be used to model them. The performances of the special distributions of the SKw-G family of distributions are compared with the performance of existing distributions.

The special cases of the SKw-G family of distributions, SKwC, SKwW, SKwBXII, and SKwF distributions, are compared with Weibull Nadarajah-Haghighi (WNH) [28], generalized power Weibull (GPW) [29], exponentiated generalized Poisson inverse exponential (EGPIE) [30], Weibull inverted exponential (WIE) [31], modified Weibull (MW) [32], Kumaraswamy-Burr III (KB III) [33], complementary exponential power (CEP) [34], and odd generalized exponential Weibull (OGEW) [34] distributions. Others include generalized odd inverse exponential Lomax (GOIEL) [24], Kumaraswamy inverse exponential (KIE) [35], Weibull (W) [16], odd inverse exponential Weibull (OIEW) [24], exponentiated odd inverse exponential Weibull (EOIEW) [24], new sine inverse Weibull (NSIW) [36], sine inverse Weibull (SIW) [36], inverse Weibull (IW) [37], inverted Nadarajah-Haghighi (INH) [38], the exponentiated generalized inverse Weibull (EGIW) [39], Kumaraswamy Burr III [40], and new Weibull Pareto [41] distributions.

The performance of the fitted distributions is compared using the log-likelihood, Akaike information criterion (AIC), corrected Akaike information criterion (AICc), Bayesian information criterion (BIC), and the Kolmogorov-Smirnov (K-S) goodness-of-fit statistic. In general, the higher the log-likelihood and the smaller the AIC, AICc, BIC, and K-S values of a model, the better the model fits the dataset under consideration.
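These comparison measures can be computed from the maximized log-likelihood, the number of parameters, and the sample size. The sketch below uses the standard definitions AIC = 2k - 2l, AICc = AIC + 2k(k + 1)/(n - k - 1), and BIC = k log(n) - 2l, together with the one-sample K-S distance between the empirical CDF and a fitted CDF. The function names and the numbers in the example are illustrative, not results from the paper.

```python
import math

def information_criteria(loglik, k, n):
    """Return (AIC, AICc, BIC) for log-likelihood loglik, k parameters, n observations."""
    aic = 2.0 * k - 2.0 * loglik
    aicc = aic + 2.0 * k * (k + 1.0) / (n - k - 1.0)
    bic = k * math.log(n) - 2.0 * loglik
    return aic, aicc, bic

def ks_statistic(data, cdf):
    """Largest distance between the empirical CDF of data and the fitted cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare the fitted CDF with the empirical CDF just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

if __name__ == "__main__":
    print(information_criteria(-100.0, 4, 101))          # made-up inputs
    print(ks_statistic([0.75, 0.25], lambda x: x))       # uniform(0, 1) CDF
```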

6.1. Dataset 1: Kevlar/Epoxy Data

Table 3 shows the maximum likelihood parameter estimates as well as their corresponding standard errors in brackets for the Kevlar data.

The log-likelihood, goodness-of-fit statistics, and the information criterion values of the SKwC and the other competing models for the Kevlar data are presented in Table 4. It can be observed that the SKwC gives a better fit to the data than the other models since it has the highest log-likelihood value and smallest values of the AIC, AICc, BIC, and K-S.

The estimated PDFs and CDFs of the fitted models, as well as the empirical densities for the Kevlar data, are shown in Figure 8. It can be observed that the PDF and CDF of the SKwC follow the empirical density and the empirical CDF closely as compared to the other models.

The probability-probability (P-P) plots of the SKwC and the other competing models are shown in Figure 9. The plots indicate that the SKwC fits the Kevlar dataset better than the competing models since it has almost all of its points along the diagonal line.

The variance-covariance matrix of the parameters of the SKwC for the Kevlar data was estimated and presented below as

The variance of the MLE of the parameters is as follows: , , , and . The 95% confidence intervals of the estimated parameters are (0, 6.806), (0, 0.343), (0.399, 1.395), (0.152, 2.276), and (0.120, 1.512).

6.2. Dataset 2: Breast Cancer Data

The MLE parameter estimates and their corresponding standard errors in brackets are shown in Table 5 for the fitted distributions.

Table 6 shows the log-likelihood, AIC, AICc, BIC, and K-S values for the breast cancer data. From the table, the SKwW model fits the breast cancer data better than the other competing models according to the criteria given.

The PDF and CDF plots of the SKwW and the other competing models are shown in Figure 10. It shows the flexibility of the SKwW model since it is able to mimic the empirical PDF and CDF of the cancer dataset better than the other existing models.

The P-P plots are shown in Figure 11 for the models under consideration. The plot of the SKwW distribution shows how well the SKwW distribution fits the cancer dataset. It has most of its plotted points on the diagonal as compared to the existing models.

The variance-covariance matrix for SKwW distribution using the cancer dataset is presented as follows.

The variance of the MLE of the parameters is as follows: , , , and . The 95% confidence intervals of the estimated parameters of SKwW distribution are (0, 2.9665), (0, 0.5251), (0, 1.0831), and (0, 9.2683).

6.3. Dataset 3: Glass Fibre Strength Data

Table 7 shows the MLE parameter estimates for the models under consideration and their corresponding standard errors in brackets for the glass fibre data.

Table 8 shows the values of the goodness-of-fit statistics for the glass fibre data. From Table 8, the SKwBXII model fits the glass fibre data better than the other competing models according to the criteria given above.

Figure 12 shows the PDF and CDF plots of the SKwBXII and the other competing distributions. It can be observed that the SKwBXII distribution closely mimics the empirical PDF and CDF of the dataset.

The P-P plots for the fitted distributions are shown in Figure 13. The plot for the SKwBXII distribution indicates that it fits the glass fibre dataset well.

The variance-covariance matrix for SKwBXII distribution is presented as follows.

The variance of the MLE of the parameters is as follows: , , , and . The 95% confidence intervals of the estimated parameters are (0, 2.5880), (1.4864, 1.7652), (10.4411, 41.5314), and (0.0706, 1.0298).

6.4. Dataset 4: Survival Times of Guinea Pig Data

Table 9 shows the parameter estimates and their corresponding standard errors in brackets for the distributions fitted to the guinea pig dataset.

The fitted models are compared using the log-likelihood, AIC, BIC, and K-S statistic. The results are shown in Table 10. The SKwF distribution can be observed to have performed better than the other competing models, as it has the smallest values of the information criteria and of the K-S statistic.

Figure 14 shows the PDF and CDF plots of the SKwF distribution and the other competing distributions. The PDF and CDF of the SKwF distribution closely mimic the empirical PDF and CDF as compared to the other existing distributions.

Figure 15 shows the P-P plots for the fitted distributions. It can be observed that the data points on the plot for SKwF lie more along the diagonal as compared to other distributions. This suggests that the SKwF distribution best fits the data.

The variance-covariance matrix for the dataset under consideration for SKwF distribution is presented as follows.

The variance of the MLE of the SKwF distribution parameters is as follows: , , , and . The 95% confidence intervals of the estimated parameters are (0.0450, 1.0591), (3.6384, 6.8508), (0.8798, 5.8672), and (0, 23.0863).

6.5. Dataset 5: Maximum Flood Data

Table 11 shows the estimated parameters and their corresponding standard errors for the fitted models for maximum flood dataset.

The goodness-of-fit statistics are presented in Table 12. It can be observed that all the proposed models outperform the existing models in fitting the maximum flood data, as they have the smallest values of the goodness-of-fit measures. The SKwBXII distribution in particular provides the best fit to the data.

The variance-covariance matrix for estimated parameters of SKwBXII distribution is presented as

The variance of the MLE of the SKwBXII distribution parameters is as follows: , , , and . The 95% confidence intervals of the estimated parameters are (0.0450, 1.0591), (3.6384, 6.8508), (0.8798, 5.8672), and (0, 23.0863).

The empirical PDF and CDF as well as those of the fitted models are shown in Figure 16. The proposed models follow the empirical PDF and CDF more closely than the existing models.

The P-P plots of the fitted models are presented in Figure 17. It can be observed that all the proposed models have almost all of their plotted points on the diagonal as compared to the existing models.

7. The Log-SKwW Regression Model

Two new location-scale regression models are developed from the SKwW distribution in this section. The basic structure of the two regression models is the same. The first location-scale regression model, denoted by log-SKwW1 (LSKwW1), is developed by applying the transformation and considering the reparametrization and . Given that the CDF of the SKwW distribution is expressed as , but . Therefore, . Then

The survival function of the LSKwW1 regression model can also be expressed as

Consequently, the PDF of LSKwW1, , can be obtained by differentiating equation (43) as

Let in equation (46); then, the PDF of the standardized random variable is obtained as

Using the following structure, the density function can be used to develop the LSKwW1 regression model: where is the location parameter, is the set of covariates, is the vector of unknown regression coefficients, and is the error term. The error term follows the LSKwW1 distribution. The unknown parameters of the regression model can be estimated from the density function via the maximum likelihood estimation method. The parameter estimates can be obtained by maximizing the log-likelihood function given as

The second regression model, denoted LSKwW2, is obtained by linking covariates to the distribution parameters as where and are additional unknown regression parameters. Thus, the estimates of the regression parameters can be obtained by maximizing the log-likelihood function given as

7.1. Regression Application

The applications of the two developed regression models are demonstrated in this section. The dataset used for the application is obtained from Zamanah et al. [42] and consists of the survival times (in years) until the onset of hypertension of 119 randomly sampled patients, together with their gender, from the Bolgatanga Regional Hospital in the Upper East Region of Ghana. The effect of gender on the survival times is investigated using the regression models. The survival times with gender (, ) are presented in Table 13.

The two regression models fitted for are given as follows: (a)LSKwW regression model 1 (LSKwW1): (b)LSKwW regression model 2 (LSKwW2): , , and

The performance of the models is compared with the log-harmonic mixture Weibull Weibull (LHMWW) regression model developed by Zamanah et al. [42] using Akaike information criteria (AIC) and Bayesian information criteria (BIC). Also, the Cox-Snell residual analysis is used to assess the adequacy of the fitted model. Given estimated parameters as and survival function as , the Cox-Snell residual is defined as . If the model is adequate, the residual is expected to follow the standard exponential distribution. Again, the Wald test is performed on the model parameters. The Wald test is used to check if model parameters are significantly different from a specific value. In this study, the Wald test is used to test if gender contributes significantly to the survival times of the hypertension patients. That is, the null hypothesis is used to test if the parameter estimate is significantly different from zero.
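The two diagnostics described above can be sketched as follows. The Cox-Snell residual is computed as minus the logarithm of the fitted survival function at each observed time, and the Wald test for a single coefficient compares (estimate / standard error)^2 with a chi-square distribution on one degree of freedom, which is equivalent to a two-sided normal test. The function names and the numbers in the example are illustrative, and the fitted survival function is passed in as a callable because its form depends on the chosen model.

```python
import math
from statistics import NormalDist

def cox_snell_residuals(times, survival):
    """r_i = -log S(t_i); approximately standard exponential if the model is adequate."""
    return [-math.log(survival(t)) for t in times]

def wald_test(estimate, std_error, null_value=0.0):
    """Return the Wald statistic and its two-sided p value for H0: parameter = null_value."""
    z = (estimate - null_value) / std_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z * z, p_value

if __name__ == "__main__":
    # Made-up example: unit-rate exponential survival function and a made-up coefficient.
    print(cox_snell_residuals([0.5, 1.0, 2.0], lambda t: math.exp(-t)))
    print(wald_test(0.8, 0.5))
```

A plot of the sorted residuals against standard exponential quantiles then serves as the visual adequacy check.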

Table 14 shows the parameter estimates, standard errors, p values, AIC, and BIC statistics of the fitted regression models. It can be observed that LSKwW1 has the smallest AIC and BIC values as compared to the other regression models. This suggests that the LSKwW1 fits the data better than the other models. Also, it can be observed that gender (using female as the reference) is statistically insignificant for the LSKwW1 regression model, while it is statistically significant for the LSKwW2 and LHMWW regression models. This is confirmed by performing a Wald test on the predictor. The test statistic and p values are presented for each regression model in Table 14. It can be observed that the coefficient of gender is not significantly different from zero for the LSKwW1 regression model, as the p value is greater than the 5% significance level, while gender is significant for the LSKwW2 and LHMWW regression models, as the p values associated with the test for these models are less than the 5% significance level.

Figure 18 shows the Cox-Snell residuals for the fitted models. It can be observed that LSKwW regression model 1 and LHMWW regression model compete in modelling the dataset. Therefore, the LSKwW regression models can serve as alternative regression models for modelling lifetime data.

8. Conclusions

The Secant Kumaraswamy (SKw-G) family of distributions was developed in this study. Statistical properties such as the quantile function, moments, moment generating function, and mean residual life function were derived for the family of distributions. Five special cases of the family of distributions were presented using the Weibull, Chen, Burr XII, Gompertz, and Fréchet distributions as the baseline distributions. The maximum likelihood estimation method was used to obtain estimators of the parameters of the family of distributions, and Monte Carlo simulation was used to show the desirable properties of the estimators. Two location-scale regression models were developed for the Secant Kumaraswamy Weibull distribution. Using several real datasets, the usefulness and flexibility of the family of distributions and the regression models were demonstrated. The results showed that the special cases of the family of distributions and its regression models outperformed several competing distributions. This shows that the SKw-G family of distributions can serve as an alternative family of distributions for modelling real datasets.

Data Availability

The data supporting the results are all contained in the article.

Conflicts of Interest

The authors declare that there is no conflict of interest with regard to this study.